7300 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
cc240eea2f Speed up and improve accuracy of tanh. 2024-08-16 23:46:28 +00:00
Rasmus Munk Larsen
92e373e6f5 Speed up StableNorm for non-trivial sizes and improve consistency between aligned and unaligned inputs. 2024-08-14 21:42:04 +00:00
Rasmus Munk Larsen
1dbc7581ec Include <thread> for std::this_thread::yield(). 2024-08-14 17:44:14 +00:00
Rasmus Munk Larsen
ab310943d6 Add a yield instruction in the two spinloops of the threaded matmul implementation. 2024-08-09 10:48:24 -07:00
Rasmus Munk Larsen
99ffad1971 A few cleanups to threaded product code and test. 2024-08-09 09:35:23 -07:00
Charles Schlosser
59498c96fe SSE/AVX use fmaddsub for complex products 2024-08-05 21:26:05 +00:00
Rasmus Munk Larsen
1dcae7cefc Revert "BDCSVD fix -Wmaybe-uninitialized" (this reverts merge request !1649) 2024-08-05 18:17:01 +00:00
Tyler Veness
d14b0a4e53 Remove C++23 check around has_denorm deprecation suppression 2024-08-03 21:34:27 +00:00
Jatin Chaudhary
24db460503 hlog symbol lookup should not be restricted to global namespace 2024-08-03 03:59:13 +00:00
Alexander Grund
767e60e290 Fix Woverflow warnings in PacketMathFP16 2024-08-03 03:57:18 +00:00
Alexander Grund
8025683226 Fix conversion of Eigen::half to _Float16 in AVX512 code 2024-08-03 03:49:51 +00:00
Rasmus Munk Larsen
2b7b7aac57 Speed up complex * complex matrix multiplication. 2024-08-02 20:40:53 +00:00
Eugene Zhulenev
e44db21092 Optimize ThreadPool spinning 2024-08-02 19:18:34 +00:00
Mike Taves
c593e9e948 Fix typos 2024-08-02 00:06:24 +00:00
Eugene Zhulenev
fd98cc49f1 Avoid atomic false sharing in RunQueue 2024-08-01 17:41:16 +00:00
Charles Schlosser
3f06651fd6 BDCSVD fix -Wmaybe-uninitialized 2024-07-30 22:53:06 +00:00
Frédéric Chapoton
6331da95eb fixing a lot of typos 2024-07-30 22:15:49 +00:00
Rasmus Munk Larsen
d791d48859 Fix AVX512FP16 build failure 2024-06-18 22:34:32 +00:00
Charles Schlosser
2fae4d7a77 Revert "fix scalar pselect" 2024-06-15 20:02:28 +00:00
Charles Schlosser
b430eb31e2 AVX512F double->int64_t cast 2024-06-15 17:45:02 +00:00
Charles Schlosser
02bcf9b591 fix scalar pselect 2024-06-10 17:30:22 +00:00
Louis David
392b95bdf1 allow pointer_based_stl_iterator to conform to the contiguous_iterator concept when compiling as C++20 2024-06-06 21:38:09 +00:00
Victor Ceballos
27f8176254 fixing warning C5054: operator '==': deprecated between enumerations of different types 2024-06-04 16:44:13 +03:00
Charles Schlosser
eac6355df2 Fix warnings created by other warnings fix 2024-06-01 03:37:04 +00:00
Rasmus Munk Larsen
7029a2e971 Vectorize allFinite() 2024-06-01 03:24:26 +00:00
Charles Schlosser
e605227030 Fix warnings 2024-05-31 14:33:37 +00:00
Rasmus Munk Larsen
38b9cc263b Fix warnings about repeated definitions of macros. 2024-05-29 13:38:00 -07:00
Rasmus Munk Larsen
9148c47d67 Vectorize isfinite and isinf. 2024-05-29 00:20:12 +00:00
Tobias Wood
5a9f66fb35 Fix Thread tests 2024-05-24 16:50:14 +00:00
Tyler Veness
c4d84dfddc Fix compilation failures on constexpr matrices with GCC 14 2024-05-22 12:29:01 +00:00
Charles Schlosser
99adca8b34 Incorporate Threadpool in Eigen Core 2024-05-20 23:42:51 +00:00
Tyler Veness
d165c7377f Format EIGEN_STATIC_ASSERT() as a statement macro 2024-05-20 23:02:42 +00:00
Charles Schlosser
f78dfe36b0 use built in alloca with align if available 2024-05-19 19:32:49 +00:00
Tyler Veness
b9b1c8661e Suppress C++23 deprecation warnings for std::has_denorm and std::has_denorm_loss 2024-05-17 15:55:22 +00:00
Charlie Schlosser
3d2e738f29 fix performance-no-int-to-ptr 2024-05-16 23:25:42 -04:00
Rasmus Munk Larsen
5e4f3475b5 Remove call to deprecated method initParallel() in SparseDenseProduct.h 2024-05-15 23:12:32 +00:00
Charles Schlosser
59cf0df1d6 SparseMatrix::insert add checks for valid indices 2024-05-15 16:14:32 +00:00
Anabasis
c0fe6ce223 Fixed a clerical error in the documentation of class Matrix. 2024-05-13 02:51:40 +00:00
Chip Kerchner
4d1d14e069 Change predux on PowerPC for Packet4i to NOT saturate the sum of the elements (like other architectures). 2024-05-08 22:39:27 +00:00
Charles Schlosser
8e47971789 Bit shifting functions 2024-05-03 18:55:02 +00:00
Rasmus Munk Larsen
9000b37677 Fix new generic nearest integer ops on GPU. 2024-04-30 22:18:25 +00:00
Charles Schlosser
0ee5c90aa9 Eigen transpose product 2024-04-30 13:32:52 +00:00
Charles Schlosser
fb95e90f7f Add truncation op 2024-04-29 23:45:49 +00:00
Jonathan Freed
d5524fc57b Remove unnecessary semicolons. 2024-04-29 21:31:26 +00:00
Antonio Sánchez
dcceb9afec Unbork avx512 preduce_mul on MSVC. 2024-04-26 15:28:03 +00:00
Antonio Sanchez
1c8c734c8b Fix sin/cos on PPC. 2024-04-24 15:58:03 -07:00
Charles Schlosser
34967b0b5b Revert "fix transposed matrix product bug" (this reverts merge request !1598) 2024-04-23 14:07:11 +00:00
Charles Schlosser
574bc8820d fix transposed matrix product bug 2024-04-23 03:25:57 +00:00
Rasmus Munk Larsen
112ad8b846 Revert part of !1583, which may cause underflow on ARM. 2024-04-22 21:14:38 +00:00
ahmed
ee9d57347b Fix tridiagonalization_inplace_selector::run() when called from CUDA 2024-04-19 21:06:59 +00:00