12718 Commits

Author SHA1 Message Date
Antonio Sanchez
00776d1ba4 Remove branch name from nightly tag job. 2024-12-09 20:18:18 -08:00
Antonio Sanchez
7f23778593 Add tag to commit instead of branch 2024-12-09 07:47:48 -08:00
Antonio Sánchez
c30b35a310 Force tag to update to latest head. 2024-12-08 04:48:21 +00:00
Antonio Sánchez
a26ba67349 Add LICENSE file in correct place so it is picked up by gitlab. 2024-12-08 03:26:43 +00:00
Charles Schlosser
08c31c3ba6 try alpine for formatting 2024-12-08 01:09:33 +00:00
Antonio Sanchez
1ac1af62ef Update deploy job 2024-12-07 09:19:21 -08:00
Antonio Sánchez
7b6623af30 Fix special packetmath erfc flushing for ARM32. 2024-12-07 01:42:30 +00:00
Antonio Sánchez
fd48fbb260 Update rocm docker again again. 2024-12-06 22:13:53 +00:00
Antonio Sánchez
a885340ba5 Update rocm docker again. 2024-12-06 17:19:31 +00:00
Antonio Sanchez
45a8478d09 Update rocm docker image in CI. 2024-12-06 07:14:59 -08:00
Antonio Sánchez
de4afcf414 Add a deploy phase to the CI that tags the latest nightly pipeline if it passes. 2024-12-05 15:28:18 +00:00
Charles Schlosser
5e8916050b move constructor / move assignment doc strings 2024-12-04 17:42:20 +00:00
Charles Schlosser
77a073aaa8 fix checkformat ci stage 2024-12-04 02:45:52 +00:00
Charles Schlosser
41e46ed243 fix IOFormat alignment 2024-12-04 01:13:48 +00:00
Charles Schlosser
a0d32e40d9 fix map fill logic 2024-11-30 13:39:02 +00:00
Charles Schlosser
d34b100c13 Fix UB in setZero 2024-11-27 19:32:14 +00:00
Rasmus Munk Larsen
f19a6803c8 Refactor special case handling in pow(x,y) and revert to repeated squaring for <float,int> 2024-11-27 00:24:21 +00:00
Rasmus Munk Larsen
5064cb7d5e Add test for using pcast on scalars. 2024-11-25 22:27:26 -08:00
Rasmus Munk Larsen
1ea61a5d26 Improve pow(x,y): 25% speedup, increase accuracy for integer exponents. 2024-11-26 06:13:48 +00:00
Charles Schlosser
8ad4344ca7 optimize setConstant, setZero 2024-11-22 03:39:19 +00:00
Rasmus Munk Larsen
5610a13b77 Simplify and speed up pow() by 5-6% 2024-11-20 12:45:00 +00:00
Rasmus Munk Larsen
6c6ce9d06b Enable vectorized erf<double>(x) for SSE and AVX, which was accidentally removed in merge request 1750. 2024-11-19 22:14:29 +00:00
Rasmus Munk Larsen
e7c799b7c9 Prevent premature overflow to infinity in exp(x). The changes also provide a 3-4% speedup. 2024-11-19 13:08:18 -08:00
Rasmus Munk Larsen
00af47102d Revert 040180078d 2024-11-19 10:25:16 -08:00
Rasmus Munk Larsen
8ee6f8475a Speed up exp(x). 2024-11-19 17:50:34 +00:00
Charles Schlosser
93ec5450cb disable fill_n optimization for msvc 2024-11-19 01:38:48 +00:00
Rasmus Munk Larsen
0af6ab4b76 Remove unnecessary check for HasBlend trait. 2024-11-18 21:16:45 +00:00
Rasmus Munk Larsen
d5eec781b7 Get rid of redundant computation for large arguments to erf(x). 2024-11-18 10:51:58 -08:00
Tyler Veness
2fc63808e4 Fix C++20 constexpr test compilation failures 2024-11-18 01:56:55 +00:00
Rasmus Munk Larsen
5133c836c0 Vectorize erf(x) for double. 2024-11-16 19:05:16 +00:00
Conrad Poelman
d6e3b528b2 Update Assign_MKL.h to cast disparate enum type to int, so it can be compared... 2024-11-15 20:00:29 +00:00
breathe1
040180078d Ensure that destructor's needed by lldb make it into binary in non-inlined fashion 2024-11-15 17:15:09 +00:00
Tyler Veness
0fb2ed140d Make element accessors constexpr 2024-11-14 01:05:29 +00:00
Charles Schlosser
8b4efc8ed8 check_size_for_overflow: use numeric limits instead of c99 macro 2024-11-13 00:35:35 +00:00
Charles Schlosser
489dbbc651 make fixed_size matrices conform to std::is_standard_layout 2024-11-12 23:34:26 +00:00
Rasmus Munk Larsen
283d871a3f Add missing EIGEN_DEVICE_FUNCTION decorations. 2024-11-08 14:25:57 -08:00
Rasmus Munk Larsen
0d366f6532 Vectorize erfc(x) for double and improve erfc(x) for float. 2024-11-08 17:21:11 +00:00
Charles Schlosser
8adf43640e more avx predux_any 2024-11-07 19:58:48 +00:00
Charles Schlosser
bc424f617a add missing avx predux_any functions 2024-11-07 19:11:29 +00:00
Charles Schlosser
e52ac76ca3 use EIGEN_CPLUSPLUS instead of checking cpp version 2024-11-06 17:25:22 +00:00
Rasmus Munk Larsen
122be167cd Revert "make fixed-size objects trivially move assignable" 2024-11-06 01:09:38 +00:00
Tobias Wood
d49021212b Tensor Roll / Circular Shift / Rotate 2024-11-05 14:10:19 +00:00
Charles Schlosser
bb73be8a2e make fixed-size objects trivially move assignable 2024-11-04 17:55:27 +00:00
Antonio Sánchez
7fd305ecae Fix GPU builds. 2024-11-01 04:50:03 +00:00
Morris Hafner
c8267654f2 Don't use __builtin_alloca_with_align with nvc++ 2024-10-30 18:02:08 +00:00
Tyler Veness
84c446df2c Fix macro redefinition warning in FFTW test 2024-10-30 17:18:42 +00:00
Antonio Sánchez
a9584d8e3c Fix clang6 failures. 2024-10-30 14:41:50 +00:00
Antonio Sánchez
dd4c2805d9 Fix clang6 failures. 2024-10-29 22:18:30 +00:00
Antonio Sánchez
9e962d9c54 Fix OOB access in triangular matrix multiplication. 2024-10-29 19:07:07 +00:00
Antonio Sánchez
695e49d1bd Fix NVCC builds for CUDA 10+. 2024-10-29 18:38:14 +00:00