Antonio Sanchez
3580a38298
Use native _Float16 for AVX512FP16 and update vectorization.
This allows us to do faster native scalar operations. Also
updated half/quarter packets to use the native type if available.
Benchmark improvement:
```
Comparing ./2910_without_float16 to ./2910_with_float16
Benchmark                                     Time       CPU    Time Old    Time New     CPU Old     CPU New
------------------------------------------------------------------------------------------------------------
BM_CalcMat<float>/10000/768/500            -0.0041   -0.0040    58276392    58039442    58273420    58039582
BM_CalcMat<_Float16>/10000/768/500         +0.0073   +0.0073   642506339   647214446   642481384   647188303
BM_CalcMat<Eigen::half>/10000/768/500      -0.3170   -0.3170    92511115    63182101    92506771    63179258
BM_CalcVec<float>/10000/768/500            +0.0022   +0.0022     5198157     5209469     5197913     5209334
BM_CalcVec<_Float16>/10000/768/500         +0.0025   +0.0026    10133324    10159111    10132641    10158507
BM_CalcVec<Eigen::half>/10000/768/500      -0.7760   -0.7760    45337937    10156952    45336532    10156389
OVERALL_GEOMEAN                            -0.2677   -0.2677           0           0           0           0
```
Fixes #2910.
2025-03-18 10:46:32 -07:00
Charles Schlosser
10e62ccd22
Fix x86 complex vectorized fma
2025-03-12 17:06:32 +00:00
Antonio Sanchez
179a49684a
Fix CMake BOOST warning
2025-02-28 07:33:26 -08:00
Antonio Sánchez
d79bac0d3c
Fix boolean scatter and random generation for tensors.
2025-02-25 21:37:09 +00:00
Tyler Veness
9935396b15
Specify constructor template arguments for ConstexprTest struct
2025-02-25 19:38:47 +00:00
Rasmus Munk Larsen
72adf891d5
Slightly simplify ForkJoin code, and make sure the test is actually run.
2025-02-25 17:22:43 +00:00
Antonio Sanchez
5fc6fc9881
Initialize matrix in bicgstab test
2025-02-21 10:27:29 -08:00
Tyler Veness
0ae7b59018
Make assignment constexpr
2025-02-21 18:16:46 +00:00
Charles Schlosser
151f6127df
Fix Warray-bounds warning for fixed-size assignments
2025-02-18 19:23:14 +00:00
Antonio Sanchez
22cd7307dd
Remove assumption of std::complex for complex scalar types.
2025-02-12 15:44:32 -08:00
Antonio Sánchez
becefd59e2
Returns condition number of zero if matrix is not invertible.
2025-02-12 07:09:20 +00:00
Antonio Sánchez
809d266b49
Fix numerical issues with BiCGSTAB.
2025-02-11 19:41:59 +00:00
Johannes Zipfel
2926b2e0a9
added functions to fetch L and U Factors from IncompleteLUT
2025-01-31 18:32:38 +00:00
William Kong
4a6ac97d13
Add a ForkJoin-based ParallelFor algorithm to the ThreadPool module
2025-01-24 22:12:05 +00:00
Rasmus Munk Larsen
5064cb7d5e
Add test for using pcast on scalars.
2024-11-25 22:27:26 -08:00
Charles Schlosser
8ad4344ca7
optimize setConstant, setZero
2024-11-22 03:39:19 +00:00
breathe1
040180078d
Ensure that destructors needed by lldb make it into the binary in non-inlined fashion
2024-11-15 17:15:09 +00:00
Tyler Veness
0fb2ed140d
Make element accessors constexpr
2024-11-14 01:05:29 +00:00
Charles Schlosser
489dbbc651
make fixed_size matrices conform to std::is_standard_layout
2024-11-12 23:34:26 +00:00
Rasmus Munk Larsen
122be167cd
Revert "make fixed-size objects trivially move assignable"
2024-11-06 01:09:38 +00:00
Charles Schlosser
bb73be8a2e
make fixed-size objects trivially move assignable
2024-11-04 17:55:27 +00:00
Antonio Sánchez
dd4c2805d9
Fix clang6 failures.
2024-10-29 22:18:30 +00:00
Antonio Sánchez
dae09773fc
Don't pass matrices by value.
2024-10-29 18:19:02 +00:00
Rasmus Munk Larsen
c23ec3420e
Add tests for sizeof() with one dynamic dimension.
2024-10-28 13:48:53 -07:00
Peter Gavin
b15ebb1c2d
add nextafter for bfloat16
2024-10-26 00:08:25 +00:00
Rasmus Munk Larsen
53b83cddf9
Include <type_traits> in main.h for std::is_trivial*
2024-10-25 20:55:51 +00:00
Charles Schlosser
37563856c9
Fix stack allocation assert
2024-10-25 17:02:43 +00:00
Rasmus Munk Larsen
3f067c4850
Add exp2() as a packet op and array method.
2024-10-22 22:09:34 +00:00
Charles Schlosser
4e5136d239
make fixed size matrices and arrays trivially_default_constructible
2024-10-21 17:10:15 +00:00
Antonio Sánchez
b396a6fbb2
Add free-function swap.
2024-10-14 15:51:40 +00:00
Antonio Sánchez
6d7af238fa
Adjust array_cwise for 32-bit arm.
2024-10-07 23:15:24 +00:00
Charles Schlosser
d052b7f864
add extra debugging info to float_pow_test_impl, clean up array_cwise tests
2024-09-24 21:08:22 +00:00
Rasmus Munk Larsen
f33af052e0
Fix bug for atanh(-1).
2024-09-03 20:54:01 +00:00
Charles Schlosser
9d3d37c5b7
Complex Numtraits::HasSign and nmsub test
2024-08-28 03:02:47 +00:00
Rasmus Munk Larsen
f91f8e9ab9
Consolidate float and double implementations of patan().
2024-08-21 20:44:18 +00:00
Rasmus Munk Larsen
99ffad1971
A few cleanups to threaded product code and test.
2024-08-09 09:35:23 -07:00
Mike Taves
c593e9e948
Fix typos
2024-08-02 00:06:24 +00:00
Rasmus Munk Larsen
38b9cc263b
Fix warnings about repeated definitions of macros.
2024-05-29 13:38:00 -07:00
Rasmus Munk Larsen
f02f89bf2c
Don't redefine EIGEN_DEFAULT_IO_FORMAT in main.h.
2024-05-29 18:14:32 +00:00
Tyler Veness
c4d84dfddc
Fix compilation failures on constexpr matrices with GCC 14
2024-05-22 12:29:01 +00:00
Charles Schlosser
99adca8b34
Incorporate Threadpool in Eigen Core
2024-05-20 23:42:51 +00:00
Tyler Veness
d165c7377f
Format EIGEN_STATIC_ASSERT() as a statement macro
2024-05-20 23:02:42 +00:00
Antonio Sánchez
de8013fa67
Fix ubsan failure in array_for_matrix
2024-05-16 18:47:36 +00:00
Antonio Sánchez
afb17288cb
Fix gcc6 compile error.
2024-05-10 19:13:21 +00:00
Charles Schlosser
8e47971789
Bit shifting functions
2024-05-03 18:55:02 +00:00
Antonio Sánchez
c1d637433e
Judge unitary-ness relative to scaling.
2024-04-30 22:28:46 +00:00
Charles Schlosser
0ee5c90aa9
Eigen transpose product
2024-04-30 13:32:52 +00:00
Charles Schlosser
fb95e90f7f
Add truncation op
2024-04-29 23:45:49 +00:00
Antonio Sánchez
a5e147305b
Fix undefined behavior for generating inputs to the predux_mul test.
2024-04-29 20:32:09 +00:00
Antonio Sánchez
dcceb9afec
Unbork avx512 predux_mul on MSVC.
2024-04-26 15:28:03 +00:00