7434 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
33f5f59614 Vectorize cbrt for float and double. 2025-04-17 23:31:20 +00:00
Charles Schlosser
5330960900 Enable packet segment in partial redux 2025-04-14 17:44:53 +00:00
Charles Schlosser
6266d430cc packet segment: also check DiagonalWrapper 2025-04-12 19:34:11 +00:00
Charles Schlosser
e39ad8badc fix constexpr in CoreEvaluators.h 2025-04-12 18:54:09 +00:00
Charles Schlosser
7aefb9f4d9 fix memset optimization for std::complex types 2025-04-12 16:20:09 +00:00
Charles Schlosser
73ca849a68 fix packetSegment for ArrayWrapper / MatrixWrapper 2025-04-12 12:12:48 +00:00
Charles Schlosser
28c3b26d53 masked load/store framework 2025-04-12 00:31:10 +00:00
Eugene Zhulenev
cebe09110c Fix a potential deadlock because of Eigen thread pool 2025-04-11 23:43:14 +00:00
William Kong
11fd34cc1c Fix the typing of the Tasks in ForkJoin.h 2025-04-09 17:21:36 +00:00
Hunter Belanger
2cd47d743e Fixe Conversion Warning in Parallelizer 2025-04-08 07:39:01 +00:00
Antonio Sánchez
b860042263 Add postream for ostream-ing packets more reliably. 2025-04-01 22:12:00 +00:00
Antonio Sánchez
02d9e1138a Add missing pmadd for Packet16bf. 2025-03-31 04:17:17 +00:00
Rasmus Munk Larsen
63a40ffb95 Use fma<float> for fma<half> and fma<bfloat16> if native fma is not available on the platform. 2025-03-28 04:26:04 +00:00
Antonio Sanchez
8e32cbf7da Reduce flakiness of test for Eigen::half. 2025-03-23 22:31:25 -07:00
Antonio Sánchez
d935916ac6 Add numext::fma and missing pmadd implementations. 2025-03-23 01:05:53 +00:00
Charles Schlosser
754bd24f5e fix 2828 2025-03-22 17:19:44 +00:00
Charles Schlosser
ac2165c11f fix allFinite 2025-03-20 16:04:46 +00:00
William Kong
3143968195 Generalize the Eigen ForkJoin scheduler to use any ThreadPool interface. 2025-03-19 19:56:21 +00:00
Antonio Sánchez
70f2aead9a Use native _Float16 for AVX512FP16 and update vectorization. 2025-03-19 19:55:26 +00:00
Markus Vieth
0259a52b0e Use more .noalias() 2025-03-17 19:41:00 +01:00
Charles Schlosser
10e62ccd22 Fix x86 complex vectorized fma 2025-03-12 17:06:32 +00:00
Rasmus Munk Larsen
21223f6bb6 Fix addition of different enum types. 2025-03-07 22:18:00 +00:00
Kevin
43810fc1be Fix extra semicolon in DeviceWrapper 2025-03-07 01:07:23 +00:00
Charles Schlosser
d28041ed5a refactor AssignmentFunctors.h, unify with existing scalar_op 2025-03-06 01:28:39 +00:00
Antonio Sánchez
be5147b090 Fix STL feature detection for c++20. 2025-02-28 19:52:37 +00:00
Antonio Sánchez
d79bac0d3c Fix boolean scatter and random generation for tensors. 2025-02-25 21:37:09 +00:00
Rasmus Munk Larsen
72adf891d5 Slightly simplify ForkJoin code, and make sure the test is actually run. 2025-02-25 17:22:43 +00:00
Markus Vieth
bddaa99e15 Fix bitwise operation error when compiling as C++26 2025-02-23 02:30:55 +00:00
Tyler Veness
0ae7b59018 Make assignment constexpr 2025-02-21 18:16:46 +00:00
Charles Schlosser
4dda5b927a fix Warray-bounds in inner product 2025-02-20 22:40:55 +00:00
Charles Schlosser
151f6127df Fix Warray-bounds warning for fixed-size assignments 2025-02-18 19:23:14 +00:00
C. Antonio Sanchez
1d8b82b074 Fix power builds for no VSX and no POWER8. 2025-02-15 13:56:47 -08:00
Charles Schlosser
eb3f9f443d refactor AssignmentEvaluator 2025-02-15 00:39:41 +00:00
Antonio Sanchez
22cd7307dd Remove assumption of std::complex for complex scalar types. 2025-02-12 15:44:32 -08:00
Antonio Sánchez
6b4881ad48 Eliminate type-punning UB in Eigen::half. 2025-02-12 21:12:33 +00:00
Antonio Sánchez
becefd59e2 Returns condition number of zero if matrix is not invertible. 2025-02-12 07:09:20 +00:00
Antonio Sánchez
809d266b49 Fix numerical issues with BiCGSTAB. 2025-02-11 19:41:59 +00:00
Antonio Sánchez
4c38131a16 Fix android hardware_destructive_inference_size issue. 2025-02-05 23:53:55 +00:00
Antonio Sánchez
4c2611d27c Update check for std::hardware_destructive_interference_size 2025-02-05 19:41:07 +00:00
Antonio Sanchez
74264c391a Add missing return statements for ppc. 2025-02-05 08:12:27 -08:00
Antonio Sánchez
b73bb766a5 Increase max alignment to 256. 2025-02-04 20:06:28 +00:00
Antonio Sánchez
b1e74b1ccd Fix all the doxygen warnings. 2025-02-01 00:00:31 +00:00
Johannes Zipfel
2926b2e0a9 added functions to fetch L and U Factors from IncompleteLUT 2025-01-31 18:32:38 +00:00
William Kong
b6849f675d Change the midpoint chosen in Eigen::ForkJoinScheduler. 2025-01-30 20:21:30 +00:00
William Kong
1b2e84e55a Fix minor typos in ForkJoin.h 2025-01-29 20:12:04 +00:00
Tyler Veness
872c179f58 Fix UTF-8 encoding errors introduced by #1801 2025-01-28 16:52:46 -08:00
Rasmus Munk Larsen
2a35a917be Fix syntax error in NonBlockingThreadPool.h 2025-01-28 18:34:31 +00:00
Charles Schlosser
a056b93114 improve Simplicial Cholesky analyzePattern 2025-01-28 17:53:43 +00:00
William Kong
5d866a7a78 Fix potential data race on spin_count_ NonBlockingThreadPool member variable 2025-01-28 17:22:15 +00:00
William Kong
bc67025ba7 Clean up and fix the documentation of ForkJoin.h 2025-01-27 23:12:17 +00:00