eigen/Eigen at b55b5c7280a0481f01fe5ec764d55c443a8b6496 - eigen - Git: MartinFarm

GitHub-Proxy/eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-06 13:15:14 +08:00

Sameer Agarwal b55b5c7280 Speed up row-major matrix-vector product on ARM

The row-major matrix-vector multiplication code uses a threshold to
check if processing 8 rows at a time would thrash the cache.

This change introduces two modifications to this logic.

1. A smaller threshold for ARM and ARM64 devices.

The value of this threshold was determined empirically using a Pixel2
phone, by benchmarking a large number of matrix-vector products in the
range [1..4096]x[1..4096] and measuring performance separately on
big and little cores with frequency pinning.

On big (out-of-order) cores, this change has little to no impact. But
on the little (in-order) cores, the matrix-vector products are up to
700% faster, especially on large matrices.

The motivation for this change was some internal code at Google which
was using hand-written NEON for implementing similar functionality,
processing the matrix one row at a time, which exhibited substantially
better performance than Eigen.

With the current change, Eigen handily beats that code.

2. Make the logic for choosing the number of simultaneous rows apply
uniformly to 8, 4 and 2 rows instead of just 8 rows.

Since the default threshold for non-ARM devices is essentially
unchanged (32000 -> 32 * 1024), this change has no impact on non-ARM
performance. This was verified by running the same set of benchmarks
on a Xeon desktop.
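
The row-count selection described above can be sketched roughly as
follows. This is an illustration only, not the code from the commit:
the helper name rows_per_pass and its parameters are hypothetical, and
it assumes the check compares the bytes spanned by k rows against a
single budget. Under that assumption, a budget of 8 * 32 * 1024 bytes
reproduces the 32 * 1024 per-row cutoff quoted above for the 8-row
case. The ARM threshold is not stated in the message, so the budget is
left as a parameter.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Eigen's API): choose how many rows of a
// row-major matrix to process per pass. Assumption: a block of k rows is
// allowed only if k * row_stride_bytes fits within budget_bytes.
inline int rows_per_pass(std::size_t row_stride_bytes,
                         std::size_t budget_bytes) {
  const std::size_t candidates[] = {8, 4, 2};
  for (std::size_t k : candidates) {
    // The same test is applied uniformly to 8, 4 and 2 rows.
    if (k * row_stride_bytes <= budget_bytes) {
      return static_cast<int>(k);
    }
  }
  return 1;  // Rows are too wide for any block: fall back to one row.
}
```

With a budget of 8 * 32 * 1024, a stride of exactly 32 * 1024 bytes
still selects 8 rows, and anything wider drops to 4, then 2, then 1. A
smaller budget, as the message describes for ARM, moves all three
cutoffs down together.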

2019-02-01 15:23:53 -08:00

..

Speed up row-major matrix-vector product on ARM

2019-02-01 15:23:53 -08:00

Cholesky

bug #1455 : Cholesky module depends on Jacobi for rank-updates.

2017-08-22 11:37:32 +02:00

CholmodSupport

Update link to suitesparse.

2016-01-27 22:48:40 +01:00

CMakeLists.txt

bug #1167 : simplify installation of header files using cmake's install(DIRECTORY ...) command.

2016-08-29 10:59:37 +02:00

Core

Implement AVX512 vectorization of std::complex<float/double>

2018-12-06 15:58:06 +01:00

Dense

…

Eigen

…

Eigenvalues

Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop

2018-08-28 11:44:15 +02:00

Geometry

Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop

2018-08-28 11:44:15 +02:00

Householder

Add missing licence header to some top header files

2015-10-26 11:46:05 +01:00

IterativeLinearSolvers

Add missing licence header to some top header files

2015-10-26 11:46:05 +01:00

Jacobi

Add missing licence header to some top header files

2015-10-26 11:46:05 +01:00

KLUSupport

Move KLU support to official

2017-11-10 14:11:22 +01:00

LU

use MKL's lapacke.h header when using MKL

2017-08-17 21:58:39 +02:00

MetisSupport

Add missing licence header to some top header files

2015-10-26 11:46:05 +01:00

OrderingMethods

Add missing licence header to some top header files

2015-10-26 11:46:05 +01:00

PardisoSupport

Extend CUDA support to matrix inversion and selfadjointeigensolver

2018-06-11 18:33:24 +02:00

PaStiXSupport

clarify Pastix requirements

2017-11-27 22:11:57 +01:00

QR

Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop

2018-08-28 11:44:15 +02:00

QtAlignedMalloc

bug #1468 (1/2) : add missing std:: to memcpy

2017-09-22 09:23:24 +02:00

Sparse

bug #1392 : fix #include <Eigen/Sparse> with mpl2-only

2017-02-11 10:35:01 +01:00

SparseCholesky

…

SparseCore

bug #1101 : typo

2015-10-30 12:02:52 +01:00

SparseLU

Fix numerous shadow-warnings for GCC<=4.8

2018-08-28 18:32:39 +02:00

SparseQR

Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop

2018-08-28 11:44:15 +02:00

SPQRSupport

Update link to suitesparse.

2016-01-27 22:48:40 +01:00

StdDeque

bug #1389 : MSVC's std containers do not properly align in 64 bits mode if the requested alignment is larger than 16 bytes (e.g., with AVX)

2017-02-03 15:22:35 +01:00

StdList

bug #1389 : MSVC's std containers do not properly align in 64 bits mode if the requested alignment is larger than 16 bytes (e.g., with AVX)

2017-02-03 15:22:35 +01:00

StdVector

bug #1389 : MSVC's std containers do not properly align in 64 bits mode if the requested alignment is larger than 16 bytes (e.g., with AVX)

2017-02-03 15:22:35 +01:00

SuperLUSupport

bug #1119 : Adjust call to ?gssvx for SuperLU 5

2016-07-10 02:29:57 +02:00

SVD

use MKL's lapacke.h header when using MKL

2017-08-17 21:58:39 +02:00

UmfPackSupport

Update link to suitesparse.

2016-01-27 22:48:40 +01:00