Updated Support for AArch64 (markdown)

2026-07-15 19:36:49 +00:00 · 2023-11-28 02:53:55 -05:00
parent 8443d015b6
commit 9487e773b2
1 changed files with 0 additions and 2 deletions
--- a/Support-for-AArch64.md
+++ b/Support-for-AArch64.md
@@ -68,8 +68,6 @@ Some notable observations about the problems observed:

 Verthash is a mystery: it only produces rejects on ARM even with no targeted code, only compiled C. The same C source works on x86_64 but not on AArch64. Tried with -O3 and -O2. In all other cases falling back to C was always successful. Verthash data file creation and verification work. Verthash has one unique feature in the data file. No other algo has that and no other algo fails with unoptimized code.

-There are a few cases where translating from SSE2 to NEON is difficult or the workaround kills performance. NEON, being RISC, has no microcode so no programmable shuffle instruction. The only shuffling I can find is sub-vector word & sub-word bit, shift, rotate & reverse. Notably SSE2 can't do bit reversal but can shuffle bytes any which way. Notably Groestl AES, despite not working, is currently slower on ARM than the SPH version.
-
 Multiplications are implemented differently, particularly widening multiplication, where the product is twice the bit width of the sources.
 x86_64 operates on lanes 0 & 2 while ARM operates on lanes 0 & 1 of the source data. In effect x86_64 assumes the data is pre-widened and discards lanes 1 & 3, leaving two zero-extended 64-bit source integers. With ARM the source arguments are packed into a smaller vector and the product is widened to 64 bits upon multiplication:
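
As a rough illustration of the lane difference described above, the following portable C sketch models both behaviours with plain arrays instead of real vector intrinsics. The function names are hypothetical and not taken from the project. The x86_64 model is assumed to correspond to the SSE2 `_mm_mul_epu32` intrinsic and the ARM model to the NEON `vmull_u32` intrinsic; the page itself does not name the instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the x86_64 behaviour: each source holds 4 x 32-bit
   lanes, only lanes 0 and 2 are multiplied, lanes 1 and 3 are ignored. */
static void mul_widen_x86_model(const uint32_t a[4], const uint32_t b[4],
                                uint64_t out[2])
{
    out[0] = (uint64_t)a[0] * b[0];
    out[1] = (uint64_t)a[2] * b[2];
}

/* Hypothetical model of the ARM behaviour: each source is a packed vector of
   2 x 32-bit lanes, and both lanes are multiplied into 64-bit products. */
static void mul_widen_arm_model(const uint32_t a[2], const uint32_t b[2],
                                uint64_t out[2])
{
    out[0] = (uint64_t)a[0] * b[0];
    out[1] = (uint64_t)a[1] * b[1];
}

/* Pack lanes 0 and 2 of an x86-style 4-lane vector into the 2-lane layout
   that the ARM model expects. */
static void pack_even_lanes(const uint32_t v[4], uint32_t out[2])
{
    out[0] = v[0];
    out[1] = v[2];
}

/* Returns 1 when both models agree once the ARM sources have been packed. */
static int models_agree(const uint32_t a[4], const uint32_t b[4])
{
    uint32_t pa[2], pb[2];
    uint64_t x[2], y[2];

    mul_widen_x86_model(a, b, x);
    pack_even_lanes(a, pa);
    pack_even_lanes(b, pb);
    mul_widen_arm_model(pa, pb, y);
    return x[0] == y[0] && x[1] == y[1];
}
```

Under these assumptions, porting x86_64 code that relies on the lane 0 & 2 convention means adding an explicit packing step (here `pack_even_lanes`) before the ARM multiply; whatever sits in lanes 1 & 3 has no effect on either result.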