12436 Commits

Author SHA1 Message Date
Charles Schlosser
1a2bfca8f0 Fix annoying warnings 2023-07-07 20:19:58 +00:00
Antonio Sánchez
63dcb429cd Fix use of arg function in CUDA. 2023-07-07 18:37:14 +00:00
Marcus Comstedt
8f927fb52e Altivec: fix compilation with C++20 and higher 2023-07-05 13:14:02 +00:00
Kevin Leonardic
d4b05454a7 Fix argument for _mm256_cvtps_ph imm parameter 2023-07-03 13:44:20 +02:00
Charles Schlosser
15ac3765c4 Fix ivcSize return type in IndexedViewMethods.h 2023-07-03 03:49:37 +00:00
Chip Kerchner
3791ac8a1a Fix supportsMMA to obey EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH compilation flag and compiler support. 2023-06-28 17:57:21 +00:00
H S Helson Go
bc57b926a0 Add Quaternion constructor from real scalar and imaginary vector 2023-06-27 05:38:17 +00:00
Antonio Sánchez
31cd2ad371 Ensure EIGEN_HAS_ARM64_FP16_VECTOR_ARITHMETIC is always defined on arm. 2023-06-26 19:21:54 +00:00
Antonio Sánchez
7465b7651e Disable FP16 arithmetic for arm32. 2023-06-26 18:39:42 +00:00
Rasmus Munk Larsen
b3267f6936 Remove unused variable in test/svd_common.h. 2023-06-23 23:12:19 +00:00
Chip Kerchner
211c5dfc67 Add optional offset parameter to ploadu_partial and pstoreu_partial 2023-06-23 19:53:05 +00:00
Charles Schlosser
44c20bbbe3 rint round floor ceil 2023-06-23 16:29:16 +00:00
Charles Schlosser
6ee86fd473 delete deprecated function call in svd test 2023-06-23 14:17:27 +00:00
Charles Schlosser
387175c258 Fix safe_abs in int_pow 2023-06-23 04:12:41 +00:00
Charles Schlosser
c6db610bc7 Fix svd test 2023-06-22 17:37:24 +00:00
Charles Schlosser
969c31eefc Fix AVX pstore 2023-06-15 01:47:38 +00:00
wilfried.karel
6c1411e521 define a move constructor for Ref<const...> 2023-06-14 20:10:51 +00:00
wilfried.karel
d8f3eb87bf Compile- and run-time assertions for the construction of Ref<const>. 2023-06-14 15:49:58 +00:00
Charles Schlosser
59b3ef5409 Partially Vectorize Cast 2023-06-09 16:54:31 +00:00
Rasmus Munk Larsen
7d7576f326 Avoid underflow in prsqrt. 2023-06-06 14:06:19 -07:00
Charles Schlosser
b7151ffaab Fix unary pow error handling and test 2023-06-06 18:46:55 +00:00
Rasmus Munk Larsen
7ac8897431 Reduce max relative error of prsqrt from 3 to 2 ulps. 2023-06-04 22:25:33 +00:00
Charles Schlosser
1d80e23186 Optimize scalar_unary_pow_op error handling 2023-06-02 18:53:06 +00:00
Alexander Shaposhnikov
316eab8deb Do not set EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC for cuda compilation 2023-05-31 15:15:06 +00:00
Alejandro Acosta
07e4604b19 Replace usage of CudaStreamDevice with GpuStreamDevice in tensor benchmarks GPU 2023-05-30 15:44:07 +00:00
Rasmus Munk Larsen
8c43bf2b5b Clean up Redux.h and fix vectorization_logic test after changes to traversal order in Redux. 2023-05-24 20:26:52 +00:00
Charles Schlosser
da6a71faf0 Add linear redux evaluators 2023-05-24 17:07:25 +00:00
Charles Schlosser
67a1e881d9 Sparse matrix column/row removal 2023-05-24 17:04:45 +00:00
Rasmus Munk Larsen
de1c884687 Add reference to writeup of approach used in canonicalEulerAngles. 2023-05-24 15:52:26 +00:00
Charles Schlosser
307a417e1c Fix unrolled assignment evaluator 2023-05-22 16:39:24 +00:00
Juraj Oršulić
c18f94e3b0 Geometry/EulerAngles: introduce canonicalEulerAngles 2023-05-19 15:42:22 +00:00
Charles Schlosser
7d9bb90f15 SVD: fix numerous compiler warnings / failures 2023-05-15 16:56:47 +00:00
Rasmus Munk Larsen
2709f4c8fb Use relative path to include EmulateArray.h in CXX11Meta.h, and get rid of redundant meta-programming code, which was moved to Core. 2023-05-09 23:21:35 +00:00
Rasmus Munk Larsen
9a02c977ec Use relative paths to include Meta.h and MaxSizeVector.h in Tensor 2023-05-09 22:07:55 +00:00
Rasmus Munk Larsen
96c42771d6 Make it possible to override the synchronization primitives used by the threadpool using macros. 2023-05-09 19:36:17 +00:00
Rasmus Munk Larsen
1321821e86 Add missing braces in Umeyama.h 2023-05-09 19:10:50 +00:00
Rasmus Munk Larsen
524c329ab2 Work around compiler bug in Umeyama.h. 2023-05-09 18:53:56 +00:00
Charles Schlosser
fbf7189bd5 Fix cuda compilation 2023-05-08 16:15:47 +00:00
Mehdi Goli
0623791930 [SYCL-2020] Enabling USM support for SYCL. SYCL-1.2.1 did not have support for USM. 2023-05-05 17:30:36 +00:00
Andrzej Ciarkowski
1698c367a0 Use std::shared_ptr for FFTW/IMKL FFT plan implementation; Fixes #2651 2023-05-05 16:58:23 +00:00
Antonio Sánchez
1f79a6078f Return NaN in ndtri for values outside valid input range. 2023-05-05 16:27:26 +00:00
Tobias Wood
94f57867fe Thread pool 2023-05-05 16:23:34 +00:00
Charles Schlosser
9eb8e2afba Change array_cwise test name 2023-05-05 03:08:43 +00:00
Charles Schlosser
725c11719b Visitor: fix modulo by zero compiler warning 2023-05-04 18:21:09 +00:00
Chip Kerchner
b8208b363c Specialized loadColData correctly - fix previous BF16 GEMV MR 2023-05-04 16:38:17 +00:00
Charles Schlosser
2af03fb685 clean up array_cwise test 2023-05-04 16:02:08 +00:00
Chip Kerchner
fda1373a15 Fix ColMajor BF16 GEMV for when vector is RowMajor 2023-05-03 20:12:50 +00:00
Charles Schlosser
fdc749de2a JacobiSVD: set m_nonzeroSingularValues to zero if not finite 2023-05-02 17:48:21 +00:00
Chip Kerchner
6418ac0285 Unroll F32 to BF16 loop - 1.8X faster conversions for LLVM. Use vector pairs for GCC. 2023-05-01 16:54:16 +00:00
Pedro Gonnet
874f5947f4 Add half-Packet operations to StridedLinearBufferCopy. 2023-05-01 16:09:31 +00:00