Morris Hafner
|
26e2c4f617
|
Add nvc++ support
|
2024-08-30 12:34:48 +00:00 |
|
Eugene Zhulenev
|
c59332d74a
|
Detect "effectively inner/outer" chipping in TensorChipping
|
2024-08-29 17:49:59 +00:00 |
|
Charles Schlosser
|
648bce6cae
|
SSE/AVX Complex FMA
|
2024-08-29 17:37:57 +00:00 |
|
Charles Schlosser
|
c21a80be3d
|
BDCSVD: Suppress Wmaybe-uninitialized
|
2024-08-29 02:45:38 +00:00 |
|
Charles Schlosser
|
9d3d37c5b7
|
Complex NumTraits::HasSign and nmsub test
|
2024-08-28 03:02:47 +00:00 |
|
Valentin Sarthou
|
c5189ac656
|
Fix GeneralizedEigenSolver::eigenvectors() not appearing in documentation
|
2024-08-24 00:30:06 +00:00 |
|
qile lin
|
3b5a1b4157
|
sve intrinsics with "_x" suffix will be faster than "_z" suffix
|
2024-08-23 12:52:22 +00:00 |
|
Rasmus Munk Larsen
|
98f1ac5e65
|
Fix breakage in GPU build.
|
2024-08-23 06:08:37 +00:00 |
|
Charles Schlosser
|
231308f690
|
TensorVolumePatchOp: Suppress Wmaybe-uninitialized caused by unreachable code
|
2024-08-23 01:55:12 +00:00 |
|
Tobias Wood
|
2bf8fe1489
|
NEON Complex Intrinsics
|
2024-08-22 22:46:16 +00:00 |
|
Rasmus Munk Larsen
|
f91f8e9ab9
|
Consolidate float and double implementations of patan().
|
2024-08-21 20:44:18 +00:00 |
|
Charles Schlosser
|
87239e058a
|
vectorize squaredNorm() for complex types
|
2024-08-21 10:54:17 +00:00 |
|
Rasmus Munk Larsen
|
32d95bb097
|
Add vectorized implementation of tanh<double>
|
2024-08-21 02:29:45 +00:00 |
|
Rasmus Munk Larsen
|
cc240eea2f
|
Speed up and improve accuracy of tanh.
|
2024-08-16 23:46:28 +00:00 |
|
Rasmus Munk Larsen
|
92e373e6f5
|
Speed up StableNorm for non-trivial sizes and improve consistency between aligned and unaligned inputs.
|
2024-08-14 21:42:04 +00:00 |
|
Rasmus Munk Larsen
|
1dbc7581ec
|
Include <thread> for std::this_thread::yield().
|
2024-08-14 17:44:14 +00:00 |
|
Rasmus Munk Larsen
|
ab310943d6
|
Add a yield instruction in the two spinloops of the threaded matmul implementation.
|
2024-08-09 10:48:24 -07:00 |
|
Rasmus Munk Larsen
|
99ffad1971
|
A few cleanups to threaded product code and test.
|
2024-08-09 09:35:23 -07:00 |
|
Charles Schlosser
|
59498c96fe
|
SSE/AVX use fmaddsub for complex products
|
2024-08-05 21:26:05 +00:00 |
|
Rasmus Munk Larsen
|
1dcae7cefc
|
Revert "BDCSVD fix -Wmaybe-uninitialized"
This reverts merge request !1649
|
2024-08-05 18:17:01 +00:00 |
|
Tyler Veness
|
d14b0a4e53
|
Remove C++23 check around has_denorm deprecation suppression
|
2024-08-03 21:34:27 +00:00 |
|
Jatin Chaudhary
|
24db460503
|
hlog symbol lookup should not be restricted to global namespace
|
2024-08-03 03:59:13 +00:00 |
|
Alexander Grund
|
767e60e290
|
Fix Woverflow warnings in PacketMathFP16
|
2024-08-03 03:57:18 +00:00 |
|
Alexander Grund
|
8025683226
|
Fix conversion of Eigen::half to _Float16 in AVX512 code
|
2024-08-03 03:49:51 +00:00 |
|
Alexey Korepanov
|
ec18dd09c8
|
fix pi in kissfft
|
2024-08-02 22:57:47 +00:00 |
|
Rasmus Munk Larsen
|
2b7b7aac57
|
Speed up complex * complex matrix multiplication.
|
2024-08-02 20:40:53 +00:00 |
|
Devon Loehr
|
b3e3b7b0ec
|
Remove implicit this capture in lambdas
|
2024-08-02 20:06:35 +00:00 |
|
Eugene Zhulenev
|
e44db21092
|
Optimize ThreadPool spinning
|
2024-08-02 19:18:34 +00:00 |
|
Mike Taves
|
c593e9e948
|
Fix typos
|
2024-08-02 00:06:24 +00:00 |
|
Eugene Zhulenev
|
fd98cc49f1
|
Avoid atomic false sharing in RunQueue
|
2024-08-01 17:41:16 +00:00 |
|
Charles Schlosser
|
0b646f3f36
|
Update file .clang-format
|
2024-08-01 03:18:50 +00:00 |
|
Charles Schlosser
|
1dcb07bb2a
|
Update file eigen_navtree_hacks.js
|
2024-08-01 02:51:04 +00:00 |
|
Charles Schlosser
|
ddb163ffb1
|
Update file .clang-format
|
2024-08-01 00:29:36 +00:00 |
|
Charles Schlosser
|
3f06651fd6
|
BDCSVD fix -Wmaybe-uninitialized
|
2024-07-30 22:53:06 +00:00 |
|
Frédéric Chapoton
|
6331da95eb
|
fixing a lot of typos
|
2024-07-30 22:15:49 +00:00 |
|
Alexander Hans
|
c29c800126
|
Fix formatting in README.md
|
2024-07-03 19:16:56 +00:00 |
|
adambanas
|
33d0937c6b
|
Add async support for 'chip' and 'extract_volume_patches'
|
2024-06-27 09:56:06 +02:00 |
|
Rasmus Munk Larsen
|
d791d48859
|
Fix AVX512FP16 build failure
|
2024-06-18 22:34:32 +00:00 |
|
Charles Schlosser
|
2fae4d7a77
|
Revert "fix scalar pselect"
|
2024-06-15 20:02:28 +00:00 |
|
Charles Schlosser
|
b430eb31e2
|
AVX512F double->int64_t cast
|
2024-06-15 17:45:02 +00:00 |
|
Charles Schlosser
|
02bcf9b591
|
fix scalar pselect
|
2024-06-10 17:30:22 +00:00 |
|
Louis David
|
392b95bdf1
|
allow pointer_based_stl_iterator to conform to the contiguous_iterator concept if we are in c++20
|
2024-06-06 21:38:09 +00:00 |
|
Victor Ceballos
|
27f8176254
|
fixing warning C5054: operator '==': deprecated between enumerations of different types
|
2024-06-04 16:44:13 +03:00 |
|
Charles Schlosser
|
eac6355df2
|
Fix warnings created by other warnings fix
|
2024-06-01 03:37:04 +00:00 |
|
Rasmus Munk Larsen
|
7029a2e971
|
Vectorize allFinite()
|
2024-06-01 03:24:26 +00:00 |
|
Charles Schlosser
|
e605227030
|
Fix warnings
|
2024-05-31 14:33:37 +00:00 |
|
Rasmus Munk Larsen
|
38b9cc263b
|
Fix warnings about repeated definitions of macros.
|
2024-05-29 13:38:00 -07:00 |
|
Rasmus Munk Larsen
|
f02f89bf2c
|
Don't redefine EIGEN_DEFAULT_IO_FORMAT in main.h.
|
2024-05-29 18:14:32 +00:00 |
|
Rasmus Munk Larsen
|
9148c47d67
|
Vectorize isfinite and isinf.
|
2024-05-29 00:20:12 +00:00 |
|
Tobias Wood
|
5a9f66fb35
|
Fix Thread tests
|
2024-05-24 16:50:14 +00:00 |
|