190 Commits

Author SHA1 Message Date
Antonio Sánchez
d935916ac6 Add numext::fma and missing pmadd implementations. 2025-03-23 01:05:53 +00:00
Antonio Sánchez
70f2aead9a Use native _Float16 for AVX512FP16 and update vectorization. 2025-03-19 19:55:26 +00:00
Rasmus Munk Larsen
6c6ce9d06b Enable vectorized erf<double>(x) for SSE and AVX, which was accidentally removed in merge request 1750. 2024-11-19 22:14:29 +00:00
Rasmus Munk Larsen
8ee6f8475a Speed up exp(x). 2024-11-19 17:50:34 +00:00
Rasmus Munk Larsen
5133c836c0 Vectorize erf(x) for double. 2024-11-16 19:05:16 +00:00
Rasmus Munk Larsen
0d366f6532 Vectorize erfc(x) for double and improve erfc(x) for float. 2024-11-08 17:21:11 +00:00
Charles Schlosser
8adf43640e more avx predux_any 2024-11-07 19:58:48 +00:00
Charles Schlosser
bc424f617a add missing avx predux_any functions 2024-11-07 19:11:29 +00:00
Rasmus Munk Larsen
bbdabebf44 Vectorize atanh<double>. Make atanh(x) standard compliant for |x| >= 1. 2024-08-30 17:27:55 +00:00
Rasmus Munk Larsen
32d95bb097 Add vectorized implementation of tanh<double> 2024-08-21 02:29:45 +00:00
Frédéric Chapoton
6331da95eb fixing a lot of typos 2024-07-30 22:15:49 +00:00
Charles Schlosser
fb95e90f7f Add truncation op 2024-04-29 23:45:49 +00:00
Charles Schlosser
5635d37f46 more pblend optimizations 2024-04-19 02:02:27 +00:00
Rasmus Munk Larsen
b5feca5d03 Fix build for pblend and psin_double, pcos_double when AVX but not AVX2 is supported. 2024-04-16 16:12:41 +00:00
Damiano Franzò
888fca0e2b Simd sincos double 2024-04-15 21:12:32 +00:00
Charles Schlosser
6ad2ccea4e Eigen pblend 2024-04-15 16:19:53 +00:00
Rasmus Munk Larsen
3c9109238f Add support for Packet8l to AVX512. 2024-04-09 22:58:44 +00:00
Rasmus Munk Larsen
283d69294b Guard AVX2 implementation of psignbit in PacketMath.h 2024-04-03 21:03:26 +00:00
Charles Schlosser
776d86d8df AVX: guard Packet4l definition 2024-04-01 00:31:46 +00:00
Charles Schlosser
f75e2297db AVX2 - double->int64_t casting 2024-03-29 21:35:09 +00:00
Antonio Sánchez
c8d368bdaf More fixes for 32-bit. 2024-03-26 22:53:38 +00:00
Rasmus Munk Larsen
b86641a4c2 Add support for casting between double and int64_t for SSE and AVX2. 2024-03-22 22:32:29 +00:00
Antonio Sánchez
d883932586 Fix Packet*l for 32-bit builds. 2024-03-22 17:16:42 +00:00
Tobias Wood
f38e16c193 Apply clang-format 2023-11-29 11:12:48 +00:00
Antonio Sánchez
6e4d5d4832 Add IWYU private pragmas to internal headers. 2023-08-21 16:25:22 +00:00
Kevin Leonardic
d4b05454a7 Fix argument for _mm256_cvtps_ph imm parameter 2023-07-03 13:44:20 +02:00
Charles Schlosser
969c31eefc Fix AVX pstore 2023-06-15 01:47:38 +00:00
Charles Schlosser
8999525c29 AVX2: Packet4ul has pmul, abs2 2023-04-26 16:22:16 +00:00
Charles Schlosser
f6cf5dca80 Packet4ul does not have Abs2 2023-04-21 19:48:01 +00:00
Charles Schlosser
29c8e3c754 fix pow for uint32_t, disable pmul<Packet4ul> 2023-04-21 05:47:56 +00:00
Pedro Gonnet
17b5b4de58 Add Packet4ui, Packet8ui, and Packet4ul to the SSE/AVX PacketMath.h headers 2023-04-17 23:33:59 +00:00
Rasmus Munk Larsen
df1049ddf4 Small packet math cleanup. 2023-04-04 16:14:32 +00:00
Rasmus Munk Larsen
0488b708b4 Vectorize tensor.isnan() by using typed predicates. 2023-03-16 04:04:22 +00:00
Antonio Sánchez
394aabb0a3 Fix failing MSVC tests due to compiler bugs. 2023-03-10 22:36:57 +00:00
Rasmus Munk Larsen
ce62177b5b Vectorize atanh & add a missing definition and unit test for atan. 2023-02-21 03:14:05 +00:00
Sean McBride
d70b4864d9 issue #2581: review and cleanup of compiler version checks 2023-01-17 18:58:34 +00:00
Charles Schlosser
02805bd56c Fix AVX2 psignbit 2022-11-16 13:43:11 +00:00
Antonio Sánchez
8588d8c74b Correct pnegate for floating-point zero. 2022-11-15 18:07:23 +00:00
Charles Schlosser
82b152dbe7 Add signbit function 2022-11-04 00:31:20 +00:00
Rasmus Munk Larsen
c475228b28 Vectorize atan() for double. 2022-10-01 01:49:30 +00:00
Rasmus Munk Larsen
7b2901e2aa Add vectorized integer division for int32 with AVX512, AVX or SSE. 2022-09-21 00:27:23 +00:00
Rasmus Munk Larsen
f913a40678 Revert "Add AVX int32_t pdiv" (reverts commit ea84e7ad638c259397fc36fe6e3d82b9cb3b89d0) 2022-09-16 22:48:08 +00:00
Charles Schlosser
ea84e7ad63 Add AVX int32_t pdiv 2022-09-16 17:06:29 +00:00
Rasmus Munk Larsen
bd393e15c3 Vectorize acos, asin, and atan for float. 2022-08-29 19:49:33 +00:00
Charles Schlosser
e5af9f87f2 Vectorize pow for integer base / exponent types 2022-08-29 19:23:54 +00:00
Rasmus Munk Larsen
7064ed1345 Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>. 2022-08-26 17:02:37 +00:00
Rasmus Munk Larsen
1a09defce7 Protect new pblend implementation with EIGEN_VECTORIZE_AVX2 2022-08-22 18:28:03 +00:00
Matthew Sterrett
7a3b667c43 Add support for AVX512-FP16 for vectorizing half precision math 2022-08-17 18:15:21 +00:00
Ilya Tokar
e618c4a5e9 Improve pblend AVX implementation 2022-07-29 18:45:33 +00:00
Shi, Brian
fc1d888415 Remove AVX512VL dependency in trsm 2022-04-14 12:44:24 -07:00