Rasmus Munk Larsen | 33f5f59614 | Vectorize cbrt for float and double. | 2025-04-17 23:31:20 +00:00
Charles Schlosser | 5330960900 | Enable packet segment in partial redux | 2025-04-14 17:44:53 +00:00
Charles Schlosser | 6266d430cc | packet segment: also check DiagonalWrapper | 2025-04-12 19:34:11 +00:00
Charles Schlosser | e39ad8badc | fix constexpr in CoreEvaluators.h | 2025-04-12 18:54:09 +00:00
Charles Schlosser | 7aefb9f4d9 | fix memset optimization for std::complex types | 2025-04-12 16:20:09 +00:00
Charles Schlosser | 73ca849a68 | fix packetSegment for ArrayWrapper / MatrixWrapper | 2025-04-12 12:12:48 +00:00
Charles Schlosser | 28c3b26d53 | masked load/store framework | 2025-04-12 00:31:10 +00:00
Eugene Zhulenev | cebe09110c | Fix a potential deadlock because of Eigen thread pool | 2025-04-11 23:43:14 +00:00
William Kong | 11fd34cc1c | Fix the typing of the Tasks in ForkJoin.h | 2025-04-09 17:21:36 +00:00
Hunter Belanger | 2cd47d743e | Fix Conversion Warning in Parallelizer | 2025-04-08 07:39:01 +00:00
Antonio Sánchez | b860042263 | Add postream for ostream-ing packets more reliably. | 2025-04-01 22:12:00 +00:00
Antonio Sánchez | 02d9e1138a | Add missing pmadd for Packet16bf. | 2025-03-31 04:17:17 +00:00
Rasmus Munk Larsen | 63a40ffb95 | Use fma<float> for fma<half> and fma<bfloat16> if native fma is not available on the platform. | 2025-03-28 04:26:04 +00:00
Antonio Sanchez | 8e32cbf7da | Reduce flakiness of test for Eigen::half. | 2025-03-23 22:31:25 -07:00
Antonio Sánchez | d935916ac6 | Add numext::fma and missing pmadd implementations. | 2025-03-23 01:05:53 +00:00
Charles Schlosser | 754bd24f5e | fix 2828 | 2025-03-22 17:19:44 +00:00
Charles Schlosser | ac2165c11f | fix allFinite | 2025-03-20 16:04:46 +00:00
William Kong | 3143968195 | Generalize the Eigen ForkJoin scheduler to use any ThreadPool interface. | 2025-03-19 19:56:21 +00:00
Antonio Sánchez | 70f2aead9a | Use native _Float16 for AVX512FP16 and update vectorization. | 2025-03-19 19:55:26 +00:00
Markus Vieth | 0259a52b0e | Use more .noalias() | 2025-03-17 19:41:00 +01:00
Charles Schlosser | 10e62ccd22 | Fix x86 complex vectorized fma | 2025-03-12 17:06:32 +00:00
Rasmus Munk Larsen | 21223f6bb6 | Fix addition of different enum types. | 2025-03-07 22:18:00 +00:00
Kevin | 43810fc1be | Fix extra semicolon in DeviceWrapper | 2025-03-07 01:07:23 +00:00
Charles Schlosser | d28041ed5a | refactor AssignmentFunctors.h, unify with existing scalar_op | 2025-03-06 01:28:39 +00:00
Antonio Sánchez | be5147b090 | Fix STL feature detection for c++20. | 2025-02-28 19:52:37 +00:00
Antonio Sánchez | d79bac0d3c | Fix boolean scatter and random generation for tensors. | 2025-02-25 21:37:09 +00:00
Rasmus Munk Larsen | 72adf891d5 | Slightly simplify ForkJoin code, and make sure the test is actually run. | 2025-02-25 17:22:43 +00:00
Markus Vieth | bddaa99e15 | Fix bitwise operation error when compiling as C++26 | 2025-02-23 02:30:55 +00:00
Tyler Veness | 0ae7b59018 | Make assignment constexpr | 2025-02-21 18:16:46 +00:00
Charles Schlosser | 4dda5b927a | fix Warray-bounds in inner product | 2025-02-20 22:40:55 +00:00
Charles Schlosser | 151f6127df | Fix Warray-bounds warning for fixed-size assignments | 2025-02-18 19:23:14 +00:00
C. Antonio Sanchez | 1d8b82b074 | Fix power builds for no VSX and no POWER8. | 2025-02-15 13:56:47 -08:00
Charles Schlosser | eb3f9f443d | refactor AssignmentEvaluator | 2025-02-15 00:39:41 +00:00
Antonio Sanchez | 22cd7307dd | Remove assumption of std::complex for complex scalar types. | 2025-02-12 15:44:32 -08:00
Antonio Sánchez | 6b4881ad48 | Eliminate type-punning UB in Eigen::half. | 2025-02-12 21:12:33 +00:00
Antonio Sánchez | becefd59e2 | Returns condition number of zero if matrix is not invertible. | 2025-02-12 07:09:20 +00:00
Antonio Sánchez | 809d266b49 | Fix numerical issues with BiCGSTAB. | 2025-02-11 19:41:59 +00:00
Antonio Sánchez | 4c38131a16 | Fix android hardware_destructive_interference_size issue. | 2025-02-05 23:53:55 +00:00
Antonio Sánchez | 4c2611d27c | Update check for std::hardware_destructive_interference_size | 2025-02-05 19:41:07 +00:00
Antonio Sanchez | 74264c391a | Add missing return statements for ppc. | 2025-02-05 08:12:27 -08:00
Antonio Sánchez | b73bb766a5 | Increase max alignment to 256. | 2025-02-04 20:06:28 +00:00
Antonio Sánchez | b1e74b1ccd | Fix all the doxygen warnings. | 2025-02-01 00:00:31 +00:00
Johannes Zipfel | 2926b2e0a9 | added functions to fetch L and U Factors from IncompleteLUT | 2025-01-31 18:32:38 +00:00
William Kong | b6849f675d | Change the midpoint chosen in Eigen::ForkJoinScheduler. | 2025-01-30 20:21:30 +00:00
William Kong | 1b2e84e55a | Fix minor typos in ForkJoin.h | 2025-01-29 20:12:04 +00:00
Tyler Veness | 872c179f58 | Fix UTF-8 encoding errors introduced by #1801 | 2025-01-28 16:52:46 -08:00
Rasmus Munk Larsen | 2a35a917be | Fix syntax error in NonBlockingThreadPool.h | 2025-01-28 18:34:31 +00:00
Charles Schlosser | a056b93114 | improve Simplicial Cholesky analyzePattern | 2025-01-28 17:53:43 +00:00
William Kong | 5d866a7a78 | Fix potential data race on spin_count_ NonBlockingThreadPool member variable | 2025-01-28 17:22:15 +00:00
William Kong | bc67025ba7 | Clean up and fix the documentation of ForkJoin.h | 2025-01-27 23:12:17 +00:00