Charles Schlosser
|
a08649994f
|
Optimize generic_rsqrt_newton_step
|
2023-03-24 22:42:57 +00:00 |
|
Rasmus Munk Larsen
|
b8b8a26145
|
Add more missing vectorized casts for int on x86, and remove redundant unit tests
|
2023-03-24 16:02:00 +00:00 |
|
unageek
|
33e206f714
|
Remove unused declarations of BLAS/LAPACK routines
|
2023-03-23 21:54:05 +00:00 |
|
Rasmus Munk Larsen
|
d57a79e512
|
Optimize float->bool cast for AVX2, based on Charles Schlosser's comments.
|
2023-03-21 20:59:25 -07:00 |
|
Rasmus Munk Larsen
|
a5ae832773
|
Fix reversal of arguments to _mm256_set_m128() in pcast<Packet4d, Packet8f>.
|
2023-03-22 03:21:44 +00:00 |
|
Rasmus Munk Larsen
|
09945f2cc1
|
Optimize casting for x86_64.
|
2023-03-21 18:24:16 +00:00 |
|
Colin Broderick
|
8f9b8e3630
|
Replaced all instances of internal::(U)IntPtr with std::(u)intptr_t. Remove ICC workaround.
|
2023-03-21 16:50:23 +00:00 |
|
Antonio Sánchez
|
2c8011c2dd
|
Fix arm builds.
|
2023-03-20 16:59:38 +00:00 |
|
Charles Schlosser
|
fd8f410bbe
|
Fix 2624 2625
|
2023-03-20 16:30:04 +00:00 |
|
Chip Kerchner
|
e887196d9d
|
Undo cmake pools changes
|
2023-03-17 16:06:26 +00:00 |
|
Jonas Schulze
|
81cb6a51d0
|
Fix some typos
|
2023-03-16 23:11:43 +00:00 |
|
Antonio Sánchez
|
555cec17ed
|
Fix parsing of command-line arguments when already specified as a cmake list.
|
2023-03-16 22:47:38 +00:00 |
|
Chip Kerchner
|
7db19baabe
|
Remove pools if cmake is less than 3.11
|
2023-03-16 16:54:45 +00:00 |
|
Rasmus Munk Larsen
|
0488b708b4
|
Vectorize tensor.isnan() by using typed predicates.
|
2023-03-16 04:04:22 +00:00 |
|
Rasmus Munk Larsen
|
f02856c640
|
Use EIGEN_NOT_A_MACRO macro (oh the irony!) to avoid build issue in TensorFlow.
|
2023-03-15 11:42:57 -07:00 |
|
Rasmus Munk Larsen
|
690ae9502f
|
Use C++11 standard features for detecting presence of Inf and NaN
|
2023-03-15 16:52:44 +00:00 |
|
Chip Kerchner
|
d71ac6a755
|
Fix recent PowerPC warnings and clang warning
|
2023-03-15 16:50:46 +00:00 |
|
Chip Kerchner
|
d54d228b49
|
Limit the number of build jobs to 8 and link jobs to 4 for PowerPC. This should help reduce the OOM build problems.
|
2023-03-15 16:29:41 +00:00 |
|
Chip Kerchner
|
23e1541863
|
Put deadcode checks back in from previous change.
|
2023-03-14 00:57:16 +00:00 |
|
Chip Kerchner
|
6c58f0fe1f
|
Revert changes that made BF16 GEMM to cause bad register spillage for LLVM (Power)
|
2023-03-13 23:36:06 +00:00 |
|
Rasmus Munk Larsen
|
8fe6190001
|
Add numext::isnan for AnnoyingOrange^H^H^H^H^H^HScalar.
|
2023-03-13 21:19:35 +00:00 |
|
Rasmus Munk Larsen
|
79de101d23
|
Handle PropagateFast the same way as PropagateNaN in minmax visitor to
|
2023-03-13 20:47:11 +00:00 |
|
Chip Kerchner
|
9d72412385
|
Add MMA to BF16 GEMV - 5.0-6.3X faster (for Power)
|
2023-03-13 19:37:13 +00:00 |
|
Rasmus Munk Larsen
|
2067b54b13
|
Fix bug in minmax_coeff_visitor for matrix of all NaNs.
|
2023-03-13 18:25:22 +00:00 |
|
Rasmus Munk Larsen
|
ee0ff0ab3a
|
Fix typo in MathFunctions.h
|
2023-03-13 15:50:40 +00:00 |
|
Rasmus Munk Larsen
|
21c49e8f8e
|
Delete mystery character from Eigen/src/Core/arch/NEON/MathFunctions.h
|
2023-03-10 23:27:24 +00:00 |
|
Rasmus Munk Larsen
|
6bb9609bcb
|
Make new Select implementation backwards compatible.
|
2023-03-10 23:07:47 +00:00 |
|
Antonio Sánchez
|
394aabb0a3
|
Fix failing MSVC tests due to compiler bugs.
|
2023-03-10 22:36:57 +00:00 |
|
Rasmus Munk Larsen
|
d6235d76db
|
Clean up generic packetmath specializations for various backends with the help of a macro.
|
2023-03-10 22:02:23 +00:00 |
|
Rasmus Munk Larsen
|
e8fdf127c6
|
Work around compiler bug in Tridiagonalization.h
|
2023-03-10 21:21:07 +00:00 |
|
Rasmus Munk Larsen
|
adf26b6840
|
Add newline to end of file.
|
2023-03-10 16:53:22 +00:00 |
|
Rasmus Munk Larsen
|
3492d9e2e5
|
s/Lesser/Less/
|
2023-03-10 00:28:31 +00:00 |
|
Rasmus Munk Larsen
|
2419632cf5
|
Revert change to allFinite(), since the new version does not work for complex numbers.
|
2023-03-09 21:50:43 +00:00 |
|
Zach Davis
|
b1beba8a3e
|
Fix LinAlgSVD example code
|
2023-03-08 17:04:59 +00:00 |
|
Charles Schlosser
|
7bf2968fed
|
Specify Permutation Index for PartialPivLU and FullPivLU
|
2023-03-07 20:28:05 +00:00 |
|
Antonio Sánchez
|
eb4dbf6135
|
Modify failing cwise test to get it to pass.
|
2023-03-07 19:47:42 +00:00 |
|
Timofey Pushkin
|
e577f43ab2
|
Set CMAKE_* cache variables only when Eigen is a top-level project
|
2023-03-07 14:39:45 +00:00 |
|
Charles Schlosser
|
1ce8b25825
|
Vectorize any() / all()
|
2023-03-06 23:54:02 +00:00 |
|
Charles Schlosser
|
cb8e6d4975
|
Fix 2240, 2620
|
2023-03-06 23:11:06 +00:00 |
|
Charles Schlosser
|
d670039309
|
fix tensor comparison test
|
2023-03-06 13:11:14 +00:00 |
|
Chip Kerchner
|
2b513ca2a0
|
Added partial linear access for LHS & Output - 30% faster for bfloat16 GEMM MMA (Power)
|
2023-03-02 19:22:43 +00:00 |
|
Charles Schlosser
|
0b396c3167
|
Scalarize comps
|
2023-03-02 17:06:23 +00:00 |
|
Charles Schlosser
|
3abe12472e
|
fix signed shift test
|
2023-03-01 14:31:13 +00:00 |
|
Antonio Sánchez
|
ba7417f146
|
Fix gpu conv3d out-of-resources failure.
|
2023-02-28 21:25:00 +00:00 |
|
Antonio Sánchez
|
62d5cfe835
|
Fix ODR issues with Intel's AVX512 TRSM kernels.
|
2023-02-27 07:54:52 +00:00 |
|
Charles Schlosser
|
826627f653
|
vectorize comparisons and select by enabling typed comparisons
|
2023-02-25 20:52:11 +00:00 |
|
Rasmus Munk Larsen
|
2e9b945baf
|
Fix bug that disabled vectorization for coeffMin/coeffMax.
|
2023-02-25 20:03:54 +00:00 |
|
Antonio Sánchez
|
bc5cdc7a67
|
Guard use of long double on GPU device.
|
2023-02-24 21:49:59 +00:00 |
|
Chip Kerchner
|
e4598fedbe
|
Fix compiler versions for certain instructions on Power.
|
2023-02-23 23:24:41 +00:00 |
|
Rasmus Munk Larsen
|
1c0a6cf228
|
Get rid of EIGEN_HAS_AVX512_MATH workaround.
|
2023-02-23 23:16:41 +00:00 |
|