4922 Commits

Author SHA1 Message Date
Charles Schlosser
f3929ac7ed Fix EIGEN_HAS_CXX17_OVERALIGN for icc 2023-01-03 17:30:10 +00:00
Arthur
311cc0f9cc Enable NEON pcmp, plset, and complex psqrt 2022-12-22 05:38:34 +00:00
Antonio Sánchez
dbf7ae6f9b Fix up C++ version detection macros and cmake tests. 2022-12-20 18:06:03 +00:00
Antonio Sánchez
bb6675caf7 Fix incorrect NEON native fp16 multiplication. 2022-12-19 20:46:44 +00:00
Arthur Feeney
c4fb6af24b Enable NEON pabs for unsigned int types 2022-12-19 17:07:36 +00:00
Alexander Richardson
37de432907 Avoid using std::raise() for divide by zero 2022-12-14 20:06:16 +00:00
Alexander Richardson
62de593c40 Allow std::initializer_list constructors in constexpr expressions 2022-12-14 17:05:37 +00:00
Charles Schlosser
6d3e3678b4 optimize equalspace packetop 2022-12-13 01:22:25 +00:00
Charles Schlosser
2004831941 add EqualSpaced / setEqualSpaced 2022-12-13 00:54:57 +00:00
Antonio Sánchez
03c9b4738c Enable direct access for NestByValue. 2022-12-07 18:21:45 +00:00
Chip Kerchner
b59f18b4f7 Increase L2 and L3 cache size for Power10. 2022-12-07 18:20:33 +00:00
Lianhuang Li
d194167149 Fix the bug using neon instruction fmla for data type half 2022-12-01 17:28:57 +00:00
Pedro Caldeira
31ab62d347 Add support for Power10 (AltiVec) MMA instructions for bfloat16. 2022-11-30 23:33:37 +00:00
Antonio Sánchez
2260e11eb0 Fix reshape strides when input has non-zero inner stride. 2022-11-29 19:39:29 +00:00
Charles Schlosser
044f3f6234 Fix bug in handmade_aligned_realloc 2022-11-18 22:35:31 +00:00
Charles Schlosser
02805bd56c Fix AVX2 psignbit 2022-11-16 13:43:11 +00:00
Chip Kerchner
399ce1ed63 Fix duplicate execution code for Power 8 Altivec in pstore_partial. 2022-11-16 13:41:42 +00:00
Gabriele Buondonno
6431dfdb50 Cross product for vectors of size 2. Fixes #1037 2022-11-15 22:39:42 +00:00
Antonio Sánchez
8588d8c74b Correct pnegate for floating-point zero. 2022-11-15 18:07:23 +00:00
Antonio Sanchez
5eacb9e117 Put brackets around unsigned type names. 2022-11-15 09:09:45 -08:00
Antonio Sánchez
37e40dca85 Fix ambiguity in PPC for vec_splats call. 2022-11-14 18:58:16 +00:00
Charles Schlosser
9b6d624eab fix neon 2022-11-08 20:03:01 +00:00
Rasmus Munk Larsen
7e398e9436 Add missing return keyword in psignbit for NEON. 2022-11-04 16:13:09 +00:00
Charles Schlosser
82b152dbe7 Add signbit function 2022-11-04 00:31:20 +00:00
Antonio Sanchez
01a31b81b2 Remove unused parameter name. 2022-11-01 15:51:25 -07:00
Antonio Sánchez
c5b896c5a3 Allow empty matrices to be resized. 2022-10-27 20:33:35 +00:00
Antonio Sánchez
886aad1361 Disable patan for double on PPC. 2022-10-27 17:56:08 +00:00
Antonio Sánchez
ab407b2b6e Fix handmade_aligned_malloc offset computation. 2022-10-27 17:33:47 +00:00
Charles Schlosser
a226371371 Change handmade_aligned_malloc/realloc/free to store a 1 byte offset instead of absolute address 2022-10-22 22:51:31 +00:00
Rasmus Munk Larsen
3bb6a48d8c Fix bug atan2 2022-10-12 23:49:32 +00:00
Rasmus Munk Larsen
14c847dc0e Refactor special values test for pow, and add a similar test for atan2 2022-10-12 20:12:08 +00:00
Rasmus Munk Larsen
462758e8a3 Don't use generic sign function for sign(complex) unless it is vectorizable 2022-10-12 16:03:29 +00:00
Rasmus Munk Larsen
c0d6a72611 Use pnegate(pzero(x)) as a generic way to generate -0.0. Some compiler do not handle the literal -0.0 properly in fastmath mode. 2022-10-12 01:57:05 +00:00
Rasmus Munk Larsen
3167544873 Handle NaN inputs to atan2. 2022-10-10 19:36:36 -07:00
Rasmus Munk Larsen
72db3f0fa5 Remove references to M_PI_2 and M_PI_4. 2022-10-11 00:27:16 +00:00
Rasmus Munk Larsen
e95c4a837f Simpler range reduction strategy for atan<float>(). 2022-10-04 18:11:00 +00:00
Antonio Sánchez
80efbfdeda Unconditionally enable CXX11 math. 2022-10-04 17:37:47 +00:00
Antonio Sánchez
e5794873cb Replace assert with eigen_assert. 2022-10-04 17:11:23 +00:00
Rasmus Munk Larsen
1414a76fa9 Only vectorize atan<double> for Altivec if VSX is available. 2022-10-03 22:06:58 +00:00
Rasmus Munk Larsen
c475228b28 Vectorize atan() for double. 2022-10-01 01:49:30 +00:00
Rasmus Munk Larsen
1e1848fdb1 Add a vectorized implementation of atan2 to Eigen. 2022-09-28 20:46:49 +00:00
Rasmus Munk Larsen
b3bf8d6a13 Try to reduce size of GEBP kernel for non-ARM targets. 2022-09-28 02:37:18 +00:00
Rasmus Munk Larsen
13b69fc1b0 Try to reduce compilation time/memory for GEBP kernel using EIGEN_IF_CONSTEXPR 2022-09-23 20:09:42 +00:00
Rasmus Munk Larsen
ed8cda3ce4 Move EIGEN_NEON_GEBP_NR macro to the right place in GeneralBlockPanelKernel.h 2022-09-23 02:24:27 +00:00
Rasmus Munk Larsen
e2ea866515 Add a macro to set the nr trait in the BEBP kernel for NEON. 2022-09-22 23:56:34 +00:00
Lianhuang Li
23299632c2 Use 3px8/2px8/1px8/1x8 gebp_kernel on arm64-neon 2022-09-21 16:36:40 +00:00
Rasmus Munk Larsen
7b2901e2aa Add vectorized integer division for int32 with AVX512, AVX or SSE. 2022-09-21 00:27:23 +00:00
Rasmus Munk Larsen
f913a40678 Revert "Add AVX int32_t pdiv"
This reverts commit ea84e7ad638c259397fc36fe6e3d82b9cb3b89d0
2022-09-16 22:48:08 +00:00
Rasmus Munk Larsen
273e0c884e Revert "Add constexpr, test for C++14 constexpr." 2022-09-16 21:14:29 +00:00
Charles Schlosser
ea84e7ad63 Add AVX int32_t pdiv 2022-09-16 17:06:29 +00:00