7311 Commits

Author SHA1 Message Date
Morris Hafner
26e2c4f617 Add nvc++ support 2024-08-30 12:34:48 +00:00
Charles Schlosser
648bce6cae SSE/AVX Complex FMA 2024-08-29 17:37:57 +00:00
Charles Schlosser
c21a80be3d BDCSVD: Suppress Wmaybe-uninitialized 2024-08-29 02:45:38 +00:00
Charles Schlosser
9d3d37c5b7 Complex Numtraits::HasSign and nmsub test 2024-08-28 03:02:47 +00:00
Valentin Sarthou
c5189ac656 Fix GeneralizedEigenSolver::eigenvectors() not appearing in documentation 2024-08-24 00:30:06 +00:00
qile lin
3b5a1b4157 sve intrinsics with "_x" suffix will be faster than "_z" suffix 2024-08-23 12:52:22 +00:00
Rasmus Munk Larsen
98f1ac5e65 Fix breakage in GPU build. 2024-08-23 06:08:37 +00:00
Tobias Wood
2bf8fe1489 NEON Complex Intrinsics 2024-08-22 22:46:16 +00:00
Rasmus Munk Larsen
f91f8e9ab9 Consolidate float and double implementations of patan(). 2024-08-21 20:44:18 +00:00
Charles Schlosser
87239e058a vectorize squaredNorm() for complex types 2024-08-21 10:54:17 +00:00
Rasmus Munk Larsen
32d95bb097 Add vectorized implementation of tanh<double> 2024-08-21 02:29:45 +00:00
Rasmus Munk Larsen
cc240eea2f Speed up and improve accuracy of tanh. 2024-08-16 23:46:28 +00:00
Rasmus Munk Larsen
92e373e6f5 Speed up StableNorm for non-trivial sizes and improve consistency between aligned and unaligned inputs. 2024-08-14 21:42:04 +00:00
Rasmus Munk Larsen
1dbc7581ec Include <thread> for std::this_thread::yield(). 2024-08-14 17:44:14 +00:00
Rasmus Munk Larsen
ab310943d6 Add a yield instruction in the two spinloops of the threaded matmul implementation. 2024-08-09 10:48:24 -07:00
Rasmus Munk Larsen
99ffad1971 A few cleanups to threaded product code and test. 2024-08-09 09:35:23 -07:00
Charles Schlosser
59498c96fe SSE/AVX use fmaddsub for complex products 2024-08-05 21:26:05 +00:00
Rasmus Munk Larsen
1dcae7cefc Revert "BDCSVD fix -Wmaybe-uninitialized" (This reverts merge request !1649) 2024-08-05 18:17:01 +00:00
Tyler Veness
d14b0a4e53 Remove C++23 check around has_denorm deprecation suppression 2024-08-03 21:34:27 +00:00
Jatin Chaudhary
24db460503 hlog symbol lookup should not be restricted to global namespace 2024-08-03 03:59:13 +00:00
Alexander Grund
767e60e290 Fix Woverflow warnings in PacketMathFP16 2024-08-03 03:57:18 +00:00
Alexander Grund
8025683226 Fix conversion of Eigen::half to _Float16 in AVX512 code 2024-08-03 03:49:51 +00:00
Rasmus Munk Larsen
2b7b7aac57 Speed up complex * complex matrix multiplication. 2024-08-02 20:40:53 +00:00
Eugene Zhulenev
e44db21092 Optimize ThreadPool spinning 2024-08-02 19:18:34 +00:00
Mike Taves
c593e9e948 Fix typos 2024-08-02 00:06:24 +00:00
Eugene Zhulenev
fd98cc49f1 Avoid atomic false sharing in RunQueue 2024-08-01 17:41:16 +00:00
Charles Schlosser
3f06651fd6 BDCSVD fix -Wmaybe-uninitialized 2024-07-30 22:53:06 +00:00
Frédéric Chapoton
6331da95eb fixing a lot of typos 2024-07-30 22:15:49 +00:00
Rasmus Munk Larsen
d791d48859 Fix AVX512FP16 build failure 2024-06-18 22:34:32 +00:00
Charles Schlosser
2fae4d7a77 Revert "fix scalar pselect" 2024-06-15 20:02:28 +00:00
Charles Schlosser
b430eb31e2 AVX512F double->int64_t cast 2024-06-15 17:45:02 +00:00
Charles Schlosser
02bcf9b591 fix scalar pselect 2024-06-10 17:30:22 +00:00
Louis David
392b95bdf1 allow pointer_based_stl_iterator to conform to the contiguous_iterator concept if we are in c++20 2024-06-06 21:38:09 +00:00
Victor Ceballos
27f8176254 fixing warning C5054: operator '==': deprecated between enumerations of different types 2024-06-04 16:44:13 +03:00
Charles Schlosser
eac6355df2 Fix warnings created by other warnings fix 2024-06-01 03:37:04 +00:00
Rasmus Munk Larsen
7029a2e971 Vectorize allFinite() 2024-06-01 03:24:26 +00:00
Charles Schlosser
e605227030 Fix warnings 2024-05-31 14:33:37 +00:00
Rasmus Munk Larsen
38b9cc263b Fix warnings about repeated definitions of macros. 2024-05-29 13:38:00 -07:00
Rasmus Munk Larsen
9148c47d67 Vectorize isfinite and isinf. 2024-05-29 00:20:12 +00:00
Tobias Wood
5a9f66fb35 Fix Thread tests 2024-05-24 16:50:14 +00:00
Tyler Veness
c4d84dfddc Fix compilation failures on constexpr matrices with GCC 14 2024-05-22 12:29:01 +00:00
Charles Schlosser
99adca8b34 Incorporate Threadpool in Eigen Core 2024-05-20 23:42:51 +00:00
Tyler Veness
d165c7377f Format EIGEN_STATIC_ASSERT() as a statement macro 2024-05-20 23:02:42 +00:00
Charles Schlosser
f78dfe36b0 use built in alloca with align if available 2024-05-19 19:32:49 +00:00
Tyler Veness
b9b1c8661e Suppress C++23 deprecation warnings for std::has_denorm and std::has_denorm_loss 2024-05-17 15:55:22 +00:00
Charlie Schlosser
3d2e738f29 fix performance-no-int-to-ptr 2024-05-16 23:25:42 -04:00
Rasmus Munk Larsen
5e4f3475b5 Remove call to deprecated method initParallel() in SparseDenseProduct.h 2024-05-15 23:12:32 +00:00
Charles Schlosser
59cf0df1d6 SparseMatrix::insert add checks for valid indices 2024-05-15 16:14:32 +00:00
Anabasis
c0fe6ce223 Fixed a clerical error at documentation of class Matrix. 2024-05-13 02:51:40 +00:00
Chip Kerchner
4d1d14e069 Change predux on PowerPC for Packet4i to NOT saturate the sum of the elements (like other architectures). 2024-05-08 22:39:27 +00:00