12559 Commits

Author SHA1 Message Date
Antonio Sánchez
b396a6fbb2 Add free-function swap. 2024-10-14 15:51:40 +00:00
Charles Schlosser
820e8a45fb add compile time info to reverse in place 2024-10-13 17:55:56 +00:00
Charles Schlosser
b55dab7f21 Fix DenseBase::tail for Dynamic template argument 2024-10-12 21:03:30 +00:00
Charles Schlosser
e0cbc55d92 Update README.md 2024-10-10 01:54:30 +00:00
Rasmus Munk Larsen
7eea0a9213 Vectorize erfc() for float 2024-10-09 18:38:05 +00:00
Rasmus Munk Larsen
78f3c654ee Don't use constexpr with half. 2024-10-08 16:44:40 +00:00
Antonio Sánchez
6d7af238fa Adjust array_cwise for 32-bit arm. 2024-10-07 23:15:24 +00:00
Rasmus Munk Larsen
74dcfbbd0f Use ppolevl for polynomial evaluation in more places. 2024-10-07 13:27:28 -07:00
Rasmus Munk Larsen
a097f728fe Avoid producing erf(x) = NaN for large |x|. 2024-10-04 12:15:23 -07:00
Rasmus Munk Larsen
44b16f48cb Improve speed and accuracy of erf() 2024-10-03 01:52:16 +00:00
Antonio Sánchez
12068cbcdb Fix inverse evaluator for running on CUDA device. 2024-10-01 20:59:54 +00:00
Rasmus Munk Larsen
4e8e5e7409 Add max_digits10 in NumTraits for mpreal types. 2024-10-01 18:57:17 +00:00
Rasmus Munk Larsen
8e8c319087 Add missing EIGEN_DEVICE_FUNC annotations. 2024-10-01 11:40:58 -07:00
Charles Schlosser
7ad7c1d5c5 fix implicit conversion warning (again) 2024-09-24 22:07:00 +00:00
Charles Schlosser
d052b7f864 add extra debugging info to float_pow_test_impl, clean up array_cwise tests 2024-09-24 21:08:22 +00:00
Charles Schlosser
ba5183f98c fix warning in EigenSolver::pseudoEigenvalueMatrix() 2024-09-24 17:23:58 +00:00
Charles Schlosser
3ffb4e50df fix implicit conversion in TensorChipping 2024-09-24 16:58:49 +00:00
Sean McBride
b6b8b54e5e Fixed issue #2858: removed unneeded call to _mm_setzero_si128 2024-09-24 16:29:45 +00:00
Frédéric BRIOL
2a3465102a Refactor code to use constexpr for data() functions. 2024-09-23 16:43:53 +00:00
Charles Schlosser
2d4c9b400c make fixed size matrices and arrays trivially_copy_constructible and trivially_move_constructible 2024-09-17 17:43:36 +00:00
Antonio Sánchez
132f281f50 Fix generic ceil for SSE2. 2024-09-14 01:31:21 +00:00
Charles Schlosser
84282c42fc optimize new dot product 2024-09-11 21:40:43 +00:00
Charles Schlosser
fb477b8be1 Better dot products 2024-09-10 21:02:31 +00:00
Sophie Chang
134b526d61 Update NonBlockingThreadPool.h plain asserts to use eigen_plain_assert 2024-09-10 00:18:27 +00:00
qile lin
072ec9d954 Fix a bug for pcmp_lt_or_nan and Add sqrt support for SVE 2024-09-04 21:45:39 +00:00
Rasmus Munk Larsen
9315389795 Fix bug in bug fix for atanh. 2024-09-04 09:37:59 -07:00
Rasmus Munk Larsen
f33af052e0 Fix bug for atanh(-1). 2024-09-03 20:54:01 +00:00
Rasmus Munk Larsen
66927f7807 Fix out-of-range arguments to _mm_permute_pd. 2024-08-30 17:31:52 +00:00
Rasmus Munk Larsen
bbdabebf44 Vectorize atanh<double>. Make atanh(x) standard compliant for |x| >= 1. 2024-08-30 17:27:55 +00:00
Morris Hafner
26e2c4f617 Add nvc++ support 2024-08-30 12:34:48 +00:00
Eugene Zhulenev
c59332d74a Detect "effectively inner/outer" chipping in TensorChipping 2024-08-29 17:49:59 +00:00
Charles Schlosser
648bce6cae SSE/AVX Complex FMA 2024-08-29 17:37:57 +00:00
Charles Schlosser
c21a80be3d BDCSVD: Suppress Wmaybe-uninitialized 2024-08-29 02:45:38 +00:00
Charles Schlosser
9d3d37c5b7 Complex NumTraits::HasSign and nmsub test 2024-08-28 03:02:47 +00:00
Valentin Sarthou
c5189ac656 Fix GeneralizedEigenSolver::eigenvectors() not appearing in documentation 2024-08-24 00:30:06 +00:00
qile lin
3b5a1b4157 SVE intrinsics with "_x" suffix will be faster than "_z" suffix 2024-08-23 12:52:22 +00:00
Rasmus Munk Larsen
98f1ac5e65 Fix breakage in GPU build. 2024-08-23 06:08:37 +00:00
Charles Schlosser
231308f690 TensorVolumePatchOp: Suppress Wmaybe-uninitialized caused by unreachable code 2024-08-23 01:55:12 +00:00
Tobias Wood
2bf8fe1489 NEON Complex Intrinsics 2024-08-22 22:46:16 +00:00
Rasmus Munk Larsen
f91f8e9ab9 Consolidate float and double implementations of patan(). 2024-08-21 20:44:18 +00:00
Charles Schlosser
87239e058a vectorize squaredNorm() for complex types 2024-08-21 10:54:17 +00:00
Rasmus Munk Larsen
32d95bb097 Add vectorized implementation of tanh<double> 2024-08-21 02:29:45 +00:00
Rasmus Munk Larsen
cc240eea2f Speed up and improve accuracy of tanh. 2024-08-16 23:46:28 +00:00
Rasmus Munk Larsen
92e373e6f5 Speed up StableNorm for non-trivial sizes and improve consistency between aligned and unaligned inputs. 2024-08-14 21:42:04 +00:00
Rasmus Munk Larsen
1dbc7581ec Include <thread> for std::this_thread::yield(). 2024-08-14 17:44:14 +00:00
Rasmus Munk Larsen
ab310943d6 Add a yield instruction in the two spinloops of the threaded matmul implementation. 2024-08-09 10:48:24 -07:00
Rasmus Munk Larsen
99ffad1971 A few cleanups to threaded product code and test. 2024-08-09 09:35:23 -07:00
Charles Schlosser
59498c96fe SSE/AVX use fmaddsub for complex products 2024-08-05 21:26:05 +00:00
Rasmus Munk Larsen
1dcae7cefc Revert "BDCSVD fix -Wmaybe-uninitialized" (reverts merge request !1649) 2024-08-05 18:17:01 +00:00
Tyler Veness
d14b0a4e53 Remove C++23 check around has_denorm deprecation suppression 2024-08-03 21:34:27 +00:00