eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-10-22 13:01:06 +08:00

Author	SHA1	Message	Date
b-shi	518fc321cb	AVX512 Optimizations for Triangular Solve	2022-03-16 18:04:50 +00:00
Erik Schultheis	421cbf0866	Replace Eigen type metaprogramming with corresponding std types and make use of alias templates	2022-03-16 16:43:40 +00:00
Rasmus Munk Larsen	9ad5661482	Revert "Fix up PowerPC MMA flags so it builds by default."	2022-03-15 20:51:03 +00:00
Antonio Sánchez	65eeedf964	Fix up PowerPC MMA flags so it builds by default.	2022-03-15 20:22:23 +00:00
Tobias Schlüter	cb1e8228e9	Convert bit calculation to constexpr, avoid casts.	2022-03-13 22:38:36 +09:00
Duncan McBain	a3b64625e3	Remove ComputeCpp-specific code from SYCL Vptr	2022-03-08 22:44:18 +00:00
Rasmus Munk Larsen	0e6f4e43f1	Fix a few confusing comments in psincos_float.	2022-03-04 20:41:49 +00:00
Sean McBride	f1b9692d63	Removed EIGEN_UNUSED decorations from many functions that are in fact used	2022-03-03 20:19:33 +00:00
Antonio Sánchez	9c07e201ff	Modified sqrt/rsqrt for denormal handling.	2022-03-02 17:20:47 +00:00
Antonio Sánchez	19c39bea29	Fix mixingtypes for g++-11.	2022-02-25 19:28:10 +00:00
Rasmus Munk Larsen	8b875dbef1	Changes to fast SQRT/RSQRT	2022-02-23 17:32:21 +00:00
Ramil Sattarov	f9b7564faa	E2K: initial support of LCC MCST compiler for the Elbrus 2000 CPU architecture	2022-02-23 17:07:34 +00:00
Antonio Sánchez	28e008b99a	Fix sqrt/rsqrt for NEON.	2022-02-15 21:31:51 +00:00
Erik Schultheis	7197b577fb	Remove unused macros in AVX packetmath. The following macros are removed: * EIGEN_DECLARE_CONST_Packet8f * EIGEN_DECLARE_CONST_Packet4d * EIGEN_DECLARE_CONST_Packet8f_FROM_INT * EIGEN_DECLARE_CONST_Packet8i	2022-02-14 10:34:23 +00:00
Chip Kerchner	cb5ca1c901	Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC	2022-02-09 18:47:08 +00:00
Rasmus Munk Larsen	92d0026b7b	Provide a definition for numeric_limits static data members	2022-02-08 20:34:53 +00:00
Rasmus Munk Larsen	979fdd58a4	Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments.	2022-02-05 00:20:13 +00:00
Antonio Sánchez	4bffbe84f9	Restrict GCC<6.3 maxpd workaround to only gcc.	2022-02-04 22:47:34 +00:00
Antonio Sánchez	e7f4a901ee	Define EIGEN_HAS_AVX512_MATH in PacketMath.	2022-02-04 22:25:52 +00:00
Antonio Sánchez	6b60bd6754	Fix 32-bit arm int issue.	2022-02-04 21:59:33 +00:00
Antonio Sánchez	96da541cba	Fix AVX512 math function consistency, enable for ICC.	2022-02-04 19:35:18 +00:00
Antonio Sánchez	cafeadffef	Fix ODR violations.	2022-02-04 19:01:07 +00:00
Chip Kerchner	66464bd2a8	Fix number of block columns to NOT overflow the cache (PowerPC) abnormally in GEMV	2022-01-27 20:35:53 +00:00
Rasmus Munk Larsen	8f2c6f0aa6	Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf.	2022-01-26 20:38:05 +00:00
Rasmus Munk Larsen	51311ec651	Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub.	2022-01-26 04:25:41 +00:00
Rasmus Munk Larsen	ea2c02060c	Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512.	2022-01-21 23:49:18 +00:00
Ilya Tokar	a0fc640c18	Add support for packets of int64 on x86	2022-01-21 19:55:23 +00:00
Erik Schultheis	970640519b	Cleanup	2022-01-21 01:48:59 +00:00
Chip Kerchner	708fd6d136	Add MMA and performance improvements for VSX in GEMV for PowerPC.	2022-01-13 13:23:18 +00:00
Kolja Brix	8d81a2339c	Reduce usage of reserved names	2022-01-10 20:53:29 +00:00
Matthias Möller	c4b1dd2f6b	Add support for Cray, Fujitsu, and Intel ICX compilers The following preprocessor macros are added: - EIGEN_COMP_CPE and EIGEN_COMP_CLANGCPE version number of the CRAY compiler if Eigen is compiled with the Cray C++ compiler, 0 otherwise. - EIGEN_COMP_FCC and EIGEN_COMP_CLANGFCC version number of the FCC compiler if Eigen is compiled with the Fujitsu C++ compiler, 0 otherwise - EIGEN_COMP_CLANGICC version number of the ICX compiler if Eigen is compiled with the Intel oneAPI C++ compiler, 0 otherwise All three compilers (Cray, Fujitsu, Intel) offer a traditional and a Clang-based frontend. This is distinguished by the CLANG prefix.	2022-01-07 18:46:16 +00:00
Rasmus Munk Larsen	96dc37a03b	Some fixes/cleanups for numeric_limits & fix for related bug in psqrt	2022-01-07 01:10:17 +00:00
Rasmus Munk Larsen	7b5a8b6bc5	Improve plog: 20% speedup for float + handle denormals	2022-01-05 23:40:31 +00:00
Shiva Ghose	a4098ac676	Fix duplicate include guard ALTIVEC_H -> ZVECTOR_H Some some header guards were repeated between the `AltiVec` package and the `ZVector` packages. This could cause a problem if (for whatever reason) someone attempts to include headers for both architectures.	2021-12-31 08:43:24 +00:00
Rasmus Munk Larsen	8eab7b6886	Improve exp<float>(): Don't flush denormal results +4% speedup. 1. Speed up exp(x) by reducing the polynomial approximant from degree 7 to degree 6. With exactly representable coefficients computed by the Sollya tool, this still gives a maximum relative error of 1 ulp, i.e. faithfully rounded, for arguments where exp(x) is a normalized float. This change results in a speedup of about 4% for AVX2. 2. Extend the range where exp(x) returns a non-zero result to from ~[-88;88] to ~[-104;88] i.e. return denormalized values for large negative arguments instead of zero. Compared to exp<double>(x) the denormalized results gradually decrease in accuracy down to 0.033 relative error for arguments around x = -104 where exp(x) is ~std::numeric<float>::denorm_min(). This is expected and acceptable.	2021-12-28 15:00:19 +00:00
Erik Schultheis	dee6428a71	fixed clang warnings about alignment change and floating point precision	2021-12-18 17:18:16 +00:00
Rasmus Munk Larsen	f04fd8b168	Make sure exp(-Inf) is zero for vectorized expressions. This fixes #2385 .	2021-12-08 17:57:23 +00:00
Erik Schultheis	cc11e240ac	Some further cleanup	2021-12-06 18:01:15 +00:00
Erik Schultheis	ec2fd0f7ed	Require recent GCC and MSCV and removed `EIGEN_HAS_CXX14` and some other feature test macros	2021-12-01 00:48:34 +00:00
Rasmus Munk Larsen	5137a5157a	Make numeric_limits members constexpr as per the newer C++ standards. Author: majnemer@google.com	2021-11-19 15:58:36 +00:00
Chip Kerchner	9cf34ee0ae	Invert rows and depth in non-vectorized portion of packing (PowerPC).	2021-10-28 21:59:41 +00:00
Ilya Tokar	e1cb6369b0	Add AVX vector path to float2half/half2float Makes e. g. matrix multiplication 2x faster: name old cpu/op new cpu/op delta BM_convers 181ms ± 1% 62ms ± 9% -65.82% (p=0.016 n=4+5) Tested on all possible input values (not adding tests, since they take a long time).	2021-10-28 13:59:01 -04:00
Antonio Sanchez	e559701981	Fix compile issue for gcc 4.8	2021-10-28 08:23:19 -07:00
Rohit Santhanam	48e40b22bf	Preliminary HIP bfloat16 GPU support.	2021-10-27 18:36:45 +00:00
Antonio Sanchez	40bbe8a4d0	Fix ZVector build. Cross-compiled via `s390x-linux-gnu-g++`, run via qemu. This allows the packetmath tests to pass.	2021-10-27 16:30:15 +00:00
Alex Druinsky	6bb6a6bf53	Vectorize fp16 tanh and logistic functions on Neon Activates vectorization of the Eigen::half versions of the tanh and logistic functions when they run on Neon. Both functions convert their inputs to float before computing the output, and as a result of this commit, the conversions and the computation in float are vectorized.	2021-10-27 16:09:16 +00:00
Andreas Krebbel	8faafc3aaa	ZVector: Move alignas qualifier to come first We currently have plenty of type definitions with the alignment qualifier coming after the type. The compiler warns about ignoring them: int EIGEN_ALIGN16 ai[4]; Turn this into: EIGEN_ALIGN16 int ai[4];	2021-10-26 15:33:47 +02:00
Antonio Sanchez	fd5f48e465	Fix tuple compilation for VS2017. VS2017 doesn't like deducing alias types, leading to a bunch of compile errors for functions involving the `tuple` alias. Replacing with `TupleImpl` seems to solve this, allowing the test to compile/pass.	2021-10-20 19:18:34 +00:00
Antonio Sanchez	d0d34524a1	Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h The `Complex.h` file applies equally to HIP/CUDA, so placing under the generic `GPU` folder. The `TensorReductionCuda.h` has already been deprecated, now removing for the next Eigen version.	2021-10-20 12:00:19 -07:00
Antonio Sanchez	f0f1d7938b	Disable testing of complex compound assignment operators for MSVC. MSVC does not support specializing compound assignments for `std::complex`, since it already specializes them (contrary to the standard). Trying to use one of these on device will currently lead to a duplicate definition error. This is still probably preferable to no error though. If we remove the definitions for MSVC, then it will compile, but the kernel will fail silently. The only proper solution would be to define our own custom `Complex` type.	2021-09-27 15:15:11 -07:00

... 2 3 4 5 6 ...

1270 Commits