raid6: Add LoongArch SIMD recovery implementation


Similar to the syndrome calculation, the recovery algorithms also work
on 64 bytes at a time to align with the L1 cache line size of current
and future LoongArch cores (that we care about). This means
unrolled-by-4 LSX and unrolled-by-2 LASX code.
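
The unroll factors follow from the vector widths. A minimal sketch of that arithmetic, assuming 128-bit LSX and 256-bit LASX registers; the constant and function names are hypothetical and do not come from the patch:

```c
#include <stddef.h>

enum {
    CHUNK_BYTES    = 64, /* bytes handled per loop iteration, per the commit message */
    LSX_VEC_BYTES  = 16, /* assumption: LSX registers are 128 bits wide */
    LASX_VEC_BYTES = 32  /* LASX registers are 256 bits wide */
};

/* Number of vector registers needed to cover one 64-byte chunk. */
static inline size_t unroll_factor(size_t vec_bytes)
{
    return CHUNK_BYTES / vec_bytes;
}

/* Loop iterations for a buffer; assumes bytes is a multiple of CHUNK_BYTES. */
static inline size_t chunk_iterations(size_t bytes)
{
    return bytes / CHUNK_BYTES;
}
```

With these values, unroll_factor(LSX_VEC_BYTES) is 4 and unroll_factor(LASX_VEC_BYTES) is 2, matching the unrolling described above.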

The assembly is originally based on the x86 SSSE3/AVX2 ports, but
register allocation has been redone to take advantage of LSX/LASX's 32
vector registers, and the instruction sequences have been optimized to
suit (e.g. LoongArch can perform per-byte srl and andi on vectors, but
x86 cannot).
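
To show where per-byte srl and andi help, here is a scalar sketch of the nibble-table multiplication in GF(2^8) that RAID-6 recovery commonly relies on. It assumes the SIMD code follows this usual approach, with a byte shuffle performing the table lookups. The helper names are hypothetical, and the scalar loop stands in for the vector instructions:

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;

    while (b) {
        if (b & 1)
            r ^= a;
        /* a *= x; on overflow, reduce by the low byte of the polynomial */
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return r;
}

/* Hypothetical helper: build the two 16-entry tables for a constant c. */
static void build_nibble_tables(uint8_t c, uint8_t lo[16], uint8_t hi[16])
{
    for (unsigned i = 0; i < 16; i++) {
        lo[i] = gf_mul(c, (uint8_t)i);        /* c * low nibble */
        hi[i] = gf_mul(c, (uint8_t)(i << 4)); /* c * (high nibble << 4) */
    }
}

/* dst[i] = c * src[i], computed from the two nibble tables. */
static void gf_mul_buf(uint8_t *dst, const uint8_t *src, size_t n,
                       const uint8_t lo[16], const uint8_t hi[16])
{
    for (size_t i = 0; i < n; i++) {
        uint8_t low  = src[i] & 0x0f; /* corresponds to a per-byte andi */
        uint8_t high = src[i] >> 4;   /* corresponds to a per-byte srl */

        /* Multiplication by a constant is linear over GF(2), so the
         * two partial products can simply be XORed together. */
        dst[i] = lo[low] ^ hi[high];
    }
}
```

Without a per-byte shift, the high nibble has to be isolated with a wider shift followed by a mask, which costs an extra instruction per vector.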

Performance numbers measured by instrumenting the raid6test code, on a
3A5000 system clocked at 2.5GHz:

> lasx  2data: 354.987 MiB/s
> lasx  datap: 350.430 MiB/s
> lsx   2data: 340.026 MiB/s
> lsx   datap: 337.318 MiB/s
> intx1 2data: 164.280 MiB/s
> intx1 datap: 187.966 MiB/s

Because recovery algorithms are chosen solely based on priority and
availability, lasx is marked as priority 2 and lsx priority 1. At least
for the current generation of LoongArch micro-architectures, LASX should
always be faster than LSX whenever supported, and have similar power
consumption characteristics (because the only known LASX-capable uarch,
the LA464, always computes the full 256-bit result for vector ops).
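
The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the kernel's actual code: the struct layout, function names, and the stand-in feature checks are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical descriptor; field names are illustrative only. */
struct recov_algo {
    const char *name;
    int priority;       /* higher value wins */
    int (*valid)(void); /* NULL means the algorithm is always usable */
};

/* Stand-ins for CPU feature detection, used only for demonstration. */
static int cpu_has_feature(void)   { return 1; }
static int cpu_lacks_feature(void) { return 0; }

/* Return the available algorithm with the highest priority, or NULL.
 * The list is terminated by a NULL pointer. No benchmarking is done:
 * only availability and priority decide. */
static const struct recov_algo *
choose_recov(const struct recov_algo *const *algos)
{
    const struct recov_algo *best = NULL;

    for (; *algos; algos++) {
        const struct recov_algo *a = *algos;

        if (a->valid && !a->valid())
            continue; /* not supported on this CPU */
        if (!best || a->priority > best->priority)
            best = a;
    }
    return best;
}
```

Given entries for lasx (priority 2), lsx (priority 1), and a generic fallback (priority 0 here is an assumption), this picks lasx when its check passes and lsx otherwise. That is why the priorities must encode the expected speed ordering.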

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

Author: WANG Xuerui
Date: 2023-09-06 22:53:55 +08:00
Committed by: Huacai Chen
Parent: 8f3f06dfd6
Commit: f209132104

5 changed files with 525 additions and 2 deletions

									
lib/raid6/test/Makefile

@@ -65,7 +65,7 @@ else ifeq ($(HAS_ALTIVEC),yes)
         OBJS += altivec1.o altivec2.o altivec4.o altivec8.o \
                 vpermxor1.o vpermxor2.o vpermxor4.o vpermxor8.o
 else ifeq ($(ARCH),loongarch64)
-        OBJS += loongarch_simd.o
+        OBJS += loongarch_simd.o recov_loongarch_simd.o
 endif
 
 .c.o:
