Antonio Sánchez
|
b396a6fbb2
|
Add free-function swap.
|
2024-10-14 15:51:40 +00:00 |
|
Charles Schlosser
|
820e8a45fb
|
add compile time info to reverse in place
|
2024-10-13 17:55:56 +00:00 |
|
Charles Schlosser
|
b55dab7f21
|
Fix DenseBase::tail for Dynamic template argument
|
2024-10-12 21:03:30 +00:00 |
|
Charles Schlosser
|
e0cbc55d92
|
Update README.md
|
2024-10-10 01:54:30 +00:00 |
|
Rasmus Munk Larsen
|
7eea0a9213
|
Vectorize erfc() for float
|
2024-10-09 18:38:05 +00:00 |
|
Rasmus Munk Larsen
|
78f3c654ee
|
Don't use constexpr with half.
|
2024-10-08 16:44:40 +00:00 |
|
Antonio Sánchez
|
6d7af238fa
|
Adjust array_cwise for 32-bit arm.
|
2024-10-07 23:15:24 +00:00 |
|
Rasmus Munk Larsen
|
74dcfbbd0f
|
Use ppolevl for polynomial evaluation in more places.
|
2024-10-07 13:27:28 -07:00 |
|
Rasmus Munk Larsen
|
a097f728fe
|
Avoid producing erf(x) = NaN for large |x|.
|
2024-10-04 12:15:23 -07:00 |
|
Rasmus Munk Larsen
|
44b16f48cb
|
Improve speed and accuracy of erf()
|
2024-10-03 01:52:16 +00:00 |
|
Antonio Sánchez
|
12068cbcdb
|
Fix inverse evaluator for running on CUDA device.
|
2024-10-01 20:59:54 +00:00 |
|
Rasmus Munk Larsen
|
4e8e5e7409
|
Add max_digits10 in NumTraits for mpreal types.
|
2024-10-01 18:57:17 +00:00 |
|
Rasmus Munk Larsen
|
8e8c319087
|
Add missing EIGEN_DEVICE_FUNC annotations.
|
2024-10-01 11:40:58 -07:00 |
|
Charles Schlosser
|
7ad7c1d5c5
|
fix implicit conversion warning (again)
|
2024-09-24 22:07:00 +00:00 |
|
Charles Schlosser
|
d052b7f864
|
add extra debugging info to float_pow_test_impl, clean up array_cwise tests
|
2024-09-24 21:08:22 +00:00 |
|
Charles Schlosser
|
ba5183f98c
|
fix warning in EigenSolver::pseudoEigenvalueMatrix()
|
2024-09-24 17:23:58 +00:00 |
|
Charles Schlosser
|
3ffb4e50df
|
fix implicit conversion in TensorChipping
|
2024-09-24 16:58:49 +00:00 |
|
Sean McBride
|
b6b8b54e5e
|
Fixed issue #2858: removed unneeded call to _mm_setzero_si128
|
2024-09-24 16:29:45 +00:00 |
|
Frédéric BRIOL
|
2a3465102a
|
Refactor code to use constexpr for data() functions.
|
2024-09-23 16:43:53 +00:00 |
|
Charles Schlosser
|
2d4c9b400c
|
make fixed size matrices and arrays trivially_copy_constructible and trivially_move_constructible
|
2024-09-17 17:43:36 +00:00 |
|
Antonio Sánchez
|
132f281f50
|
Fix generic ceil for SSE2.
|
2024-09-14 01:31:21 +00:00 |
|
Charles Schlosser
|
84282c42fc
|
optimize new dot product
|
2024-09-11 21:40:43 +00:00 |
|
Charles Schlosser
|
fb477b8be1
|
Better dot products
|
2024-09-10 21:02:31 +00:00 |
|
Sophie Chang
|
134b526d61
|
Update NonBlockingThreadPool.h plain asserts to use eigen_plain_assert
|
2024-09-10 00:18:27 +00:00 |
|
qile lin
|
072ec9d954
|
Fix a bug for pcmp_lt_or_nan and add sqrt support for SVE
|
2024-09-04 21:45:39 +00:00 |
|
Rasmus Munk Larsen
|
9315389795
|
Fix bug in bug fix for atanh.
|
2024-09-04 09:37:59 -07:00 |
|
Rasmus Munk Larsen
|
f33af052e0
|
Fix bug for atanh(-1).
|
2024-09-03 20:54:01 +00:00 |
|
Rasmus Munk Larsen
|
66927f7807
|
Fix out-of-range arguments to _mm_permute_pd.
|
2024-08-30 17:31:52 +00:00 |
|
Rasmus Munk Larsen
|
bbdabebf44
|
Vectorize atanh<double>. Make atanh(x) standard compliant for |x| >= 1.
|
2024-08-30 17:27:55 +00:00 |
|
Morris Hafner
|
26e2c4f617
|
Add nvc++ support
|
2024-08-30 12:34:48 +00:00 |
|
Eugene Zhulenev
|
c59332d74a
|
Detect "effectively inner/outer" chipping in TensorChipping
|
2024-08-29 17:49:59 +00:00 |
|
Charles Schlosser
|
648bce6cae
|
SSE/AVX Complex FMA
|
2024-08-29 17:37:57 +00:00 |
|
Charles Schlosser
|
c21a80be3d
|
BDCSVD: Suppress Wmaybe-uninitialized
|
2024-08-29 02:45:38 +00:00 |
|
Charles Schlosser
|
9d3d37c5b7
|
Complex Numtraits::HasSign and nmsub test
|
2024-08-28 03:02:47 +00:00 |
|
Valentin Sarthou
|
c5189ac656
|
Fix GeneralizedEigenSolver::eigenvectors() not appearing in documentation
|
2024-08-24 00:30:06 +00:00 |
|
qile lin
|
3b5a1b4157
|
SVE intrinsics with "_x" suffix will be faster than "_z" suffix
|
2024-08-23 12:52:22 +00:00 |
|
Rasmus Munk Larsen
|
98f1ac5e65
|
Fix breakage in GPU build.
|
2024-08-23 06:08:37 +00:00 |
|
Charles Schlosser
|
231308f690
|
TensorVolumePatchOp: Suppress Wmaybe-uninitialized caused by unreachable code
|
2024-08-23 01:55:12 +00:00 |
|
Tobias Wood
|
2bf8fe1489
|
NEON Complex Intrinsics
|
2024-08-22 22:46:16 +00:00 |
|
Rasmus Munk Larsen
|
f91f8e9ab9
|
Consolidate float and double implementations of patan().
|
2024-08-21 20:44:18 +00:00 |
|
Charles Schlosser
|
87239e058a
|
vectorize squaredNorm() for complex types
|
2024-08-21 10:54:17 +00:00 |
|
Rasmus Munk Larsen
|
32d95bb097
|
Add vectorized implementation of tanh<double>
|
2024-08-21 02:29:45 +00:00 |
|
Rasmus Munk Larsen
|
cc240eea2f
|
Speed up and improve accuracy of tanh.
|
2024-08-16 23:46:28 +00:00 |
|
Rasmus Munk Larsen
|
92e373e6f5
|
Speed up StableNorm for non-trivial sizes and improve consistency between aligned and unaligned inputs.
|
2024-08-14 21:42:04 +00:00 |
|
Rasmus Munk Larsen
|
1dbc7581ec
|
Include <thread> for std::this_thread::yield().
|
2024-08-14 17:44:14 +00:00 |
|
Rasmus Munk Larsen
|
ab310943d6
|
Add a yield instruction in the two spinloops of the threaded matmul implementation.
|
2024-08-09 10:48:24 -07:00 |
|
Rasmus Munk Larsen
|
99ffad1971
|
A few cleanups to threaded product code and test.
|
2024-08-09 09:35:23 -07:00 |
|
Charles Schlosser
|
59498c96fe
|
SSE/AVX use fmaddsub for complex products
|
2024-08-05 21:26:05 +00:00 |
|
Rasmus Munk Larsen
|
1dcae7cefc
|
Revert "BDCSVD fix -Wmaybe-uninitialized"
This reverts merge request !1649
|
2024-08-05 18:17:01 +00:00 |
|
Tyler Veness
|
d14b0a4e53
|
Remove C++23 check around has_denorm deprecation suppression
|
2024-08-03 21:34:27 +00:00 |
|