Rasmus Munk Larsen
efe5b6979d
Unconditionally include <memory>. Some C++20 builds are currently broken because it is needed for std::assume_aligned.
2025-07-18 18:06:28 +00:00
Charles Schlosser
d0b490ee09
Optimize maxCoeff and friends
2025-06-06 14:55:49 +00:00
Charles Schlosser
4fdf87bbf5
clean up intel packet reductions
2025-05-30 19:18:07 +00:00
Antonio Sánchez
70f2aead9a
Use native _Float16 for AVX512FP16 and update vectorization.
2025-03-19 19:55:26 +00:00
Antonio Sánchez
b1e74b1ccd
Fix all the doxygen warnings.
2025-02-01 00:00:31 +00:00
Pengzhou0810
e986838464
Add LoongArch64 architecture LSX support (build/test).
2025-01-20 18:37:44 +00:00
Charles Schlosser
8ad4344ca7
optimize setConstant, setZero
2024-11-22 03:39:19 +00:00
Charles Schlosser
fb477b8be1
Better dot products
2024-09-10 21:02:31 +00:00
Rasmus Munk Larsen
1dbc7581ec
Include <thread> for std::this_thread::yield().
2024-08-14 17:44:14 +00:00
Tobias Wood
5a9f66fb35
Fix Thread tests
2024-05-24 16:50:14 +00:00
Charles Schlosser
99adca8b34
Incorporate Threadpool in Eigen Core
2024-05-20 23:42:51 +00:00
Charles Schlosser
e63d9f6ccb
Fix random again
2024-03-29 21:49:27 +00:00
Cheng Wang
2c6b61c006
Add half and quarter vector support to HVX architecture
2024-01-22 21:23:21 +00:00
Tobias Wood
f38e16c193
Apply clang-format
2023-11-29 11:12:48 +00:00
Rasmus Munk Larsen
76e8c04553
Generalize parallel GEMM implementation in Core to work with ThreadPool in addition to OpenMP.
2023-11-10 17:42:30 +00:00
cheng wang
66e8f38891
Add architecture definition files for Qualcomm Hexagon Vector Extension (HVX)
2023-08-01 17:47:57 +00:00
Charles Schlosser
59b3ef5409
Partially Vectorize Cast
2023-06-09 16:54:31 +00:00
Charles Schlosser
fbf7189bd5
Fix cuda compilation
2023-05-08 16:15:47 +00:00
Mehdi Goli
0623791930
[SYCL-2020] Enabling USM support for SYCL. SYCL-1.2.1 did not have support for USM.
2023-05-05 17:30:36 +00:00
Antonio Sánchez
2d0c6ad873
Revert "Vectorize cast"
This reverts commit eb5ff1861a4783876564a1a79573c3b9ff566863
2023-04-26 18:03:36 +00:00
Charles Schlosser
eb5ff1861a
Vectorize cast
2023-04-26 02:50:13 +00:00
Chip Kerchner
3f3ce214e6
New BF16 pcast functions and move type casting to TypeCasting.h
2023-04-18 02:38:38 +00:00
Charles Schlosser
1ce8b25825
Vectorize any() / all()
2023-03-06 23:54:02 +00:00
Antonio Sánchez
3f7e775715
Add IWYU export pragmas to top-level headers.
2023-02-08 17:40:31 +00:00
Charles Schlosser
2a90653395
fix lapacke config
2023-02-03 16:40:08 +00:00
Antonio Sánchez
08c961e837
Add custom ODR-safe assert.
2023-01-20 17:38:13 +00:00
Sean McBride
d70b4864d9
Issue #2581: review and cleanup of compiler version checks
2023-01-17 18:58:34 +00:00
Mehdi Goli
b523120687
[SYCL-2020 Support] Enabling Intel DPCPP Compiler support to Eigen
2023-01-16 07:04:08 +00:00
Alexander Richardson
37de432907
Avoid using std::raise() for divide by zero
2022-12-14 20:06:16 +00:00
Rasmus Munk Larsen
7b2901e2aa
Add vectorized integer division for int32 with AVX512, AVX or SSE.
2022-09-21 00:27:23 +00:00
Thomas Gloor
ec9c7163a3
Feature/skew symmetric matrix3
2022-09-08 20:44:40 +00:00
Matthew Sterrett
7a3b667c43
Add support for AVX512-FP16 for vectorizing half precision math
2022-08-17 18:15:21 +00:00
Chip Kerchner
9e0afe0f02
Fix non-VSX PowerPC build
2022-08-08 18:18:17 +00:00
Alexander Richardson
b7668c0371
Avoid including <sstream> with EIGEN_NO_IO
2022-07-29 18:02:51 +00:00
aaraujom
d49ede4dc4
Add AVX512 s/dgemm optimizations for compute kernel (2nd try)
2022-05-28 02:00:21 +00:00
Antonio Sánchez
9b9496ad98
Revert "Add AVX512 optimizations for matrix multiply"
This reverts commit 25db0b4a824ba9a092bbb514fbada51bf9d37a18
2022-05-13 18:50:33 +00:00
aaraujom
25db0b4a82
Add AVX512 optimizations for matrix multiply
2022-05-12 23:41:19 +00:00
Shi, Brian
fc1d888415
Remove AVX512VL dependency in trsm
2022-04-14 12:44:24 -07:00
Antonio Sánchez
07db964bde
Restrict new AVX512 trsm to AVX512VL, rename files for consistency.
2022-04-14 16:58:32 +00:00
b-shi
518fc321cb
AVX512 Optimizations for Triangular Solve
2022-03-16 18:04:50 +00:00
Erik Schultheis
cc11e240ac
Some further cleanup
2021-12-06 18:01:15 +00:00
Rasmus Munk Larsen
3ffefcb95c
Only include <atomic> if needed.
2021-12-02 23:55:25 +00:00
Erik Schultheis
ec2fd0f7ed
Require recent GCC and MSVC, and remove EIGEN_HAS_CXX14
and some other feature test macros
2021-12-01 00:48:34 +00:00
Erik Schultheis
ec4efbd696
remove EIGEN_HAS_CXX11
2021-11-24 20:08:49 +00:00
Alex Druinsky
6bb6a6bf53
Vectorize fp16 tanh and logistic functions on Neon
Activates vectorization of the Eigen::half versions of the tanh and
logistic functions when they run on Neon. Both functions convert their
inputs to float before computing the output, and as a result of this
commit, the conversions and the computation in float are vectorized.
2021-10-27 16:09:16 +00:00
Antonio Sanchez
d0d34524a1
Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h
The `Complex.h` file applies equally to HIP and CUDA, so it is now placed
under the generic `GPU` folder.
The `TensorReductionCuda.h` header had already been deprecated and is now
removed for the next Eigen version.
2021-10-20 12:00:19 -07:00
Rasmus Munk Larsen
d7d0bf832d
Issue an error in case of direct inclusion of internal headers.
2021-09-10 19:12:26 +00:00
Antonio Sanchez
fcd73b4884
Add a simple serialization mechanism.
The `Serializer<T>` class implements a binary serialization that
can write to (`serialize`) and read from (`deserialize`) a byte
buffer. Also added convenience routines for serializing
a list of arguments.
This will mainly be for testing, specifically to transfer data to
and from the GPU.
2021-09-08 09:38:59 -07:00
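The byte-buffer round trip described in the serialization commit above can be sketched roughly as follows. This is a standalone illustration that assumes trivially copyable types only. The names `PodSerializer`, `serialize_all`, `deserialize_all`, and `round_trip_ok` are hypothetical and are not Eigen's actual `Serializer<T>` API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical sketch, NOT Eigen's Serializer<T>: binary serialization of
// trivially copyable values into a caller-provided byte buffer.
template <typename T>
struct PodSerializer {
  static_assert(std::is_trivially_copyable<T>::value,
                "this sketch only handles trivially copyable types");

  // Number of bytes written for one value.
  static std::size_t size() { return sizeof(T); }

  // Copies `value` into `dest`; returns a pointer just past the written bytes.
  static std::uint8_t* serialize(std::uint8_t* dest, const T& value) {
    std::memcpy(dest, &value, sizeof(T));
    return dest + sizeof(T);
  }

  // Reads one value from `src`; returns a pointer just past the read bytes.
  static const std::uint8_t* deserialize(const std::uint8_t* src, T& value) {
    std::memcpy(&value, src, sizeof(T));
    return src + sizeof(T);
  }
};

// Convenience routines for a list of arguments (base case: nothing left).
inline std::uint8_t* serialize_all(std::uint8_t* dest) { return dest; }

template <typename T, typename... Rest>
std::uint8_t* serialize_all(std::uint8_t* dest, const T& first,
                            const Rest&... rest) {
  return serialize_all(PodSerializer<T>::serialize(dest, first), rest...);
}

inline const std::uint8_t* deserialize_all(const std::uint8_t* src) {
  return src;
}

template <typename T, typename... Rest>
const std::uint8_t* deserialize_all(const std::uint8_t* src, T& first,
                                    Rest&... rest) {
  return deserialize_all(PodSerializer<T>::deserialize(src, first), rest...);
}

// Writes two values into a buffer, reads them back, and compares.
inline bool round_trip_ok() {
  const int a = 42;
  const double b = 2.5;
  std::vector<std::uint8_t> buffer(sizeof(int) + sizeof(double));
  serialize_all(buffer.data(), a, b);

  int a2 = 0;
  double b2 = 0.0;
  deserialize_all(buffer.data(), a2, b2);
  return a2 == a && b2 == b;
}
```

Returning the advanced pointer from each call lets the calls chain, which is one simple way to pack several arguments back to back into a single buffer, for example before copying that buffer to or from a device.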
Adam Kallai
1415817d8d
win: include intrin header in Windows on ARM
intrin header is needed for _BitScanReverse and
_BitScanReverse64
2021-08-31 10:57:34 +02:00
Antonio Sanchez
d24f9f9b55
Fix NVCC+ICC issues.
NVCC does not understand `__forceinline`, so we need to use `inline`
when compiling for GPU.
ICC specializes `std::complex` operators for `float` and `double`
by default, which cannot be used on device and conflict with Eigen's
workaround in CUDA/Complex.h. This can be prevented by defining
`_OVERRIDE_COMPLEX_SPECIALIZATION_` before including `<complex>`.
Added this define to the tests and to `Eigen/Core`, but this will
not work if the user includes `<complex>` before `<Eigen/Core>`.
ICC also seems to generate a duplicate `Map` symbol in
`PlainObjectBase`:
```
error: "Map" has already been declared in the current scope
static ConstMapType Map(const Scalar *data)
```
I tracked this down to `friend class Eigen::Map`. Putting the `friend`
statements at the bottom of the class seems to resolve this issue.
Fixes #2180
2021-03-15 18:42:04 +00:00