Antonio Sánchez
|
d935916ac6
|
Add numext::fma and missing pmadd implementations.
|
2025-03-23 01:05:53 +00:00 |
|
Antonio Sánchez
|
70f2aead9a
|
Use native _Float16 for AVX512FP16 and update vectorization.
|
2025-03-19 19:55:26 +00:00 |
|
Rasmus Munk Larsen
|
6c6ce9d06b
|
Enable vectorized erf<double>(x) for SSE and AVX, which was accidentally removed in merge request 1750.
|
2024-11-19 22:14:29 +00:00 |
|
Rasmus Munk Larsen
|
8ee6f8475a
|
Speed up exp(x).
|
2024-11-19 17:50:34 +00:00 |
|
Rasmus Munk Larsen
|
5133c836c0
|
Vectorize erf(x) for double.
|
2024-11-16 19:05:16 +00:00 |
|
Rasmus Munk Larsen
|
0d366f6532
|
Vectorize erfc(x) for double and improve erfc(x) for float.
|
2024-11-08 17:21:11 +00:00 |
|
Charles Schlosser
|
8adf43640e
|
more avx predux_any
|
2024-11-07 19:58:48 +00:00 |
|
Charles Schlosser
|
bc424f617a
|
add missing avx predux_any functions
|
2024-11-07 19:11:29 +00:00 |
|
Rasmus Munk Larsen
|
bbdabebf44
|
Vectorize atanh<double>. Make atanh(x) standard compliant for |x| >= 1.
|
2024-08-30 17:27:55 +00:00 |
|
Rasmus Munk Larsen
|
32d95bb097
|
Add vectorized implementation of tanh<double>
|
2024-08-21 02:29:45 +00:00 |
|
Frédéric Chapoton
|
6331da95eb
|
fixing a lot of typos
|
2024-07-30 22:15:49 +00:00 |
|
Charles Schlosser
|
fb95e90f7f
|
Add truncation op
|
2024-04-29 23:45:49 +00:00 |
|
Charles Schlosser
|
5635d37f46
|
more pblend optimizations
|
2024-04-19 02:02:27 +00:00 |
|
Rasmus Munk Larsen
|
b5feca5d03
|
Fix build for pblend and psin_double, pcos_double when AVX but not AVX2 is supported.
|
2024-04-16 16:12:41 +00:00 |
|
Damiano Franzò
|
888fca0e2b
|
Simd sincos double
|
2024-04-15 21:12:32 +00:00 |
|
Charles Schlosser
|
6ad2ccea4e
|
Eigen pblend
|
2024-04-15 16:19:53 +00:00 |
|
Rasmus Munk Larsen
|
3c9109238f
|
Add support for Packet8l to AVX512.
|
2024-04-09 22:58:44 +00:00 |
|
Rasmus Munk Larsen
|
283d69294b
|
Guard AVX2 implementation of psignbit in PacketMath.h
|
2024-04-03 21:03:26 +00:00 |
|
Charles Schlosser
|
776d86d8df
|
AVX: guard Packet4l definition
|
2024-04-01 00:31:46 +00:00 |
|
Charles Schlosser
|
f75e2297db
|
AVX2 - double->int64_t casting
|
2024-03-29 21:35:09 +00:00 |
|
Antonio Sánchez
|
c8d368bdaf
|
More fixes for 32-bit.
|
2024-03-26 22:53:38 +00:00 |
|
Rasmus Munk Larsen
|
b86641a4c2
|
Add support for casting between double and int64_t for SSE and AVX2.
|
2024-03-22 22:32:29 +00:00 |
|
Antonio Sánchez
|
d883932586
|
Fix Packet*l for 32-bit builds.
|
2024-03-22 17:16:42 +00:00 |
|
Tobias Wood
|
f38e16c193
|
Apply clang-format
|
2023-11-29 11:12:48 +00:00 |
|
Antonio Sánchez
|
6e4d5d4832
|
Add IWYU private pragmas to internal headers.
|
2023-08-21 16:25:22 +00:00 |
|
Kevin Leonardic
|
d4b05454a7
|
Fix argument for _mm256_cvtps_ph imm parameter
|
2023-07-03 13:44:20 +02:00 |
|
Charles Schlosser
|
969c31eefc
|
Fix AVX pstore
|
2023-06-15 01:47:38 +00:00 |
|
Charles Schlosser
|
8999525c29
|
AVX2: Packet4ul has pmul, abs2
|
2023-04-26 16:22:16 +00:00 |
|
Charles Schlosser
|
f6cf5dca80
|
Packet4ul does not have Abs2
|
2023-04-21 19:48:01 +00:00 |
|
Charles Schlosser
|
29c8e3c754
|
fix pow for uint32_t, disable pmul<Packet4ul>
|
2023-04-21 05:47:56 +00:00 |
|
Pedro Gonnet
|
17b5b4de58
|
Add Packet4ui, Packet8ui, and Packet4ul to the SSE/AVX PacketMath.h headers
|
2023-04-17 23:33:59 +00:00 |
|
Rasmus Munk Larsen
|
df1049ddf4
|
Small packet math cleanup.
|
2023-04-04 16:14:32 +00:00 |
|
Rasmus Munk Larsen
|
0488b708b4
|
Vectorize tensor.isnan() by using typed predicates.
|
2023-03-16 04:04:22 +00:00 |
|
Antonio Sánchez
|
394aabb0a3
|
Fix failing MSVC tests due to compiler bugs.
|
2023-03-10 22:36:57 +00:00 |
|
Rasmus Munk Larsen
|
ce62177b5b
|
Vectorize atanh & add a missing definition and unit test for atan.
|
2023-02-21 03:14:05 +00:00 |
|
Sean McBride
|
d70b4864d9
|
issue #2581: review and cleanup of compiler version checks
|
2023-01-17 18:58:34 +00:00 |
|
Charles Schlosser
|
02805bd56c
|
Fix AVX2 psignbit
|
2022-11-16 13:43:11 +00:00 |
|
Antonio Sánchez
|
8588d8c74b
|
Correct pnegate for floating-point zero.
|
2022-11-15 18:07:23 +00:00 |
|
Charles Schlosser
|
82b152dbe7
|
Add signbit function
|
2022-11-04 00:31:20 +00:00 |
|
Rasmus Munk Larsen
|
c475228b28
|
Vectorize atan() for double.
|
2022-10-01 01:49:30 +00:00 |
|
Rasmus Munk Larsen
|
7b2901e2aa
|
Add vectorized integer division for int32 with AVX512, AVX or SSE.
|
2022-09-21 00:27:23 +00:00 |
|
Rasmus Munk Larsen
|
f913a40678
|
Revert "Add AVX int32_t pdiv"
This reverts commit ea84e7ad638c259397fc36fe6e3d82b9cb3b89d0
|
2022-09-16 22:48:08 +00:00 |
|
Charles Schlosser
|
ea84e7ad63
|
Add AVX int32_t pdiv
|
2022-09-16 17:06:29 +00:00 |
|
Rasmus Munk Larsen
|
bd393e15c3
|
Vectorize acos, asin, and atan for float.
|
2022-08-29 19:49:33 +00:00 |
|
Charles Schlosser
|
e5af9f87f2
|
Vectorize pow for integer base / exponent types
|
2022-08-29 19:23:54 +00:00 |
|
Rasmus Munk Larsen
|
7064ed1345
|
Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>.
|
2022-08-26 17:02:37 +00:00 |
|
Rasmus Munk Larsen
|
1a09defce7
|
Protect new pblend implementation with EIGEN_VECTORIZE_AVX2
|
2022-08-22 18:28:03 +00:00 |
|
Matthew Sterrett
|
7a3b667c43
|
Add support for AVX512-FP16 for vectorizing half precision math
|
2022-08-17 18:15:21 +00:00 |
|
Ilya Tokar
|
e618c4a5e9
|
Improve pblend AVX implementation
|
2022-07-29 18:45:33 +00:00 |
|
Shi, Brian
|
fc1d888415
|
Remove AVX512VL dependency in trsm
|
2022-04-14 12:44:24 -07:00 |
|