From 5654cd0b5a4b4879a508cf58d463e675d1cb1c17 Mon Sep 17 00:00:00 2001
From: JayDDee
Date: Sun, 26 May 2024 12:21:05 -0400
Subject: [PATCH] Updated Support for AArch64 (markdown)

---
 Support-for-AArch64.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Support-for-AArch64.md b/Support-for-AArch64.md
index f6c5e64..ee0ce15 100644
--- a/Support-for-AArch64.md
+++ b/Support-for-AArch64.md
@@ -70,12 +70,12 @@ X86_64 operates on lanes 0 & 2 while ARM operates on lanes 0 & 1 of the source d
 
 `uint64x2_t = uint32x2_t * uint32x2_t`
 
-Most widening multiplications use the x86_64 format requiring a workaround for ARM. The curent workaround seems to be functioning correctly where needed but with significant extra overhead.
+Most widening multiplications use the x86_64 format requiring a workaround for ARM. The curent workaround seems to be functioning correctly where needed but with significant extra overhead. (update: overhead reduced in v24.3)
 
-SHA512 support in cpuminer-opt is not assured. It is little used and may not be worth the effort. X64 looks like amn enlarged clone of sha256 with 128 bit operations replaced with equivalent 256 bit ops.
-AArch64 implements sha512 using 128 bit registers and splitting the 256 bit operations over 2 128 bit instructions, reducing performance gain.
+SHA512 support in cpuminer-opt is not assured. It is little used and may not be worth the effort. X64 looks like an enlarged clone of sha256 with 128 bit operations replaced with equivalent 256 bit ops.
+AArch64 implements sha512 using 128 bit registers and splitting the 256 bit operations over 2 128 bit instructions, complicating implementation and reducing performance gain.
 
-SVE deosn't seem to be useable for hashing. It uses Vector Agnostic Programming which abstracts the logical vector size from the vector register size. This creates run time overhead to determine HW register size that doesn't exist for highly optimised NEON code.
+SVE deosn't seem to be useable for hashing. It uses Vector Agnostic Programming which abstracts the logical vector size from the vector register size. This creates run time overhead for SVE to determine HW register size.
 
 Biggest MAC problems seem to be with JSON, possibly a configure issue choosing whether to use system or local version of JSON. Other than missing GMP most problems occur at link or load time.
\ No newline at end of file
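
The lane mismatch described in the hunk context (x86_64 multiplies lanes 0 & 2, ARM multiplies lanes 0 & 1) can be illustrated with a portable scalar model. This is only a sketch, not the workaround cpuminer-opt actually uses: the type names and the helper `gather_even_lanes` are hypothetical, and the comments naming `_mm_mul_epu32`, `vmull_u32` and `vmovn_u64` only indicate which intrinsic each function is meant to model.

```c
#include <stdint.h>

/* Hypothetical scalar stand-ins for the vector types. Lane 0 is the lowest. */
typedef struct { uint32_t lane[4]; } u32x4;  /* 128 bits as 4 x 32 */
typedef struct { uint32_t lane[2]; } u32x2;  /* 64 bits as 2 x 32, like uint32x2_t */
typedef struct { uint64_t lane[2]; } u64x2;  /* 128 bits as 2 x 64, like uint64x2_t */

/* x86_64 format (models _mm_mul_epu32): uses lanes 0 and 2 of each source. */
u64x2 mul_widen_x86(u32x4 a, u32x4 b)
{
    u64x2 r;
    r.lane[0] = (uint64_t)a.lane[0] * b.lane[0];
    r.lane[1] = (uint64_t)a.lane[2] * b.lane[2];
    return r;
}

/* ARM format (models vmull_u32): uses lanes 0 and 1 of 64-bit sources. */
u64x2 mul_widen_arm(u32x2 a, u32x2 b)
{
    u64x2 r;
    r.lane[0] = (uint64_t)a.lane[0] * b.lane[0];
    r.lane[1] = (uint64_t)a.lane[1] * b.lane[1];
    return r;
}

/* Assumed adapter: pack lanes 0 and 2 into a 2-lane vector. This models
   vmovn_u64 applied to the source viewed as 2 x 64 bits. */
u32x2 gather_even_lanes(u32x4 v)
{
    u32x2 r;
    r.lane[0] = v.lane[0];
    r.lane[1] = v.lane[2];
    return r;
}

/* x86_64-format multiply built from the ARM-format multiply. The two
   extra gathers are the kind of overhead the text refers to. */
u64x2 mul_widen_x86_on_arm(u32x4 a, u32x4 b)
{
    return mul_widen_arm(gather_even_lanes(a), gather_even_lanes(b));
}
```

In this model `mul_widen_x86_on_arm` returns the same two products as `mul_widen_x86` for any inputs, at the cost of one extra lane shuffle per operand.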