Antonio Sánchez | b860042263 | Add postream for ostream-ing packets more reliably. | 2025-04-01 22:12:00 +00:00
Antonio Sanchez | 8e32cbf7da | Reduce flakiness of test for Eigen::half. | 2025-03-23 22:31:25 -07:00
Antonio Sánchez | d935916ac6 | Add numext::fma and missing pmadd implementations. | 2025-03-23 01:05:53 +00:00
Antonio Sánchez | 70f2aead9a | Use native _Float16 for AVX512FP16 and update vectorization. | 2025-03-19 19:55:26 +00:00
Charles Schlosser | 10e62ccd22 | Fix x86 complex vectorized fma | 2025-03-12 17:06:32 +00:00
Antonio Sánchez | d79bac0d3c | Fix boolean scatter and random generation for tensors. | 2025-02-25 21:37:09 +00:00
Rasmus Munk Larsen | 5064cb7d5e | Add test for using pcast on scalars. | 2024-11-25 22:27:26 -08:00
Rasmus Munk Larsen | 3f067c4850 | Add exp2() as a packet op and array method. | 2024-10-22 22:09:34 +00:00
Charles Schlosser | 9d3d37c5b7 | Complex Numtraits::HasSign and nmsub test | 2024-08-28 03:02:47 +00:00
Charles Schlosser | fb95e90f7f | Add truncation op | 2024-04-29 23:45:49 +00:00
Antonio Sánchez | a5e147305b | Fix undefined behavior for generating inputs to the predux_mul test. | 2024-04-29 20:32:09 +00:00
Antonio Sánchez | dcceb9afec | Unbork avx512 preduce_mul on MSVC. | 2024-04-26 15:28:03 +00:00
Charles Schlosser | 122befe54c | Fix "unary minus operator applied to unsigned type, result still unsigned" on MSVC and other stupid warnings | 2024-04-12 19:35:04 +00:00
Antonio Sánchez | 17f3bf8985 | Fix pexp test for ARM. | 2024-03-07 00:19:57 +00:00
Antonio Sánchez | 3e8e63eb46 | Fix packetmath plog test on Windows. | 2024-03-06 23:51:47 +00:00
Antonio Sánchez | 38fcedaf8e | Fix pexp complex test edge-cases. | 2024-03-04 17:44:38 +00:00
Charles Schlosser | 8a4118746e | fix exp complex test: use int instead of index | 2024-02-17 03:55:32 +00:00
Charles Schlosser | 18a161bf17 | fix pexp_complex_test | 2024-02-17 03:08:23 +00:00
Damiano Franzò | be06c9ad51 | Implement float pexp_complex | 2024-02-17 00:26:57 +00:00
Antonio Sánchez | f40ad38fda | Fix failure on ARM with latest compilers. | 2024-02-14 23:00:56 +00:00
Antonio Sánchez | 6ea33f95df | Eliminate warning about writing bytes directly to non-trivial type. | 2024-02-12 23:27:48 +00:00
Antonio Sánchez | 7b87b21910 | Fix UB in bool packetmath test. | 2024-02-09 19:46:45 +00:00
Antonio Sánchez | a9ddab3e06 | Fix a bunch of ODR violations. | 2024-01-30 22:38:43 +00:00
Damiano Franzò | 7fd7a3f946 | Implement plog_complex | 2024-01-30 19:06:05 +00:00
Antonio Sánchez | 46e9cdb7fe | Clang-format tests, examples, libraries, benchmarks, etc. | 2023-12-05 21:22:55 +00:00
Charles Schlosser | 81b48065ea | Fix arm32 float division and related bugs | 2023-08-29 00:36:07 +00:00
Pedro Gonnet | 17b5b4de58 | Add Packet4ui, Packet8ui, and Packet4ul to the SSE/AVX PacketMath.h headers | 2023-04-17 23:33:59 +00:00
Antonio Sánchez | 394aabb0a3 | Fix failing MSVC tests due to compiler bugs. | 2023-03-10 22:36:57 +00:00
Rasmus Munk Larsen | ce62177b5b | Vectorize atanh & add a missing definition and unit test for atan. | 2023-02-21 03:14:05 +00:00
Antonio Sánchez | 8588d8c74b | Correct pnegate for floating-point zero. | 2022-11-15 18:07:23 +00:00
Rasmus Munk Larsen | 97e0784dc6 | Vectorize the sign operator in Eigen. | 2022-08-09 19:54:57 +00:00
Antonio Sánchez | 39d22ef46b | Fix flaky packetmath_1 test. | 2022-08-02 17:42:45 +00:00
Chip Kerchner | 84cf3ff18d | Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial. | 2022-06-27 19:18:00 +00:00
Erik Schultheis | 421cbf0866 | Replace Eigen type metaprogramming with corresponding std types and make use of alias templates | 2022-03-16 16:43:40 +00:00
Antonio Sánchez | 711803c427 | Skip denormal test if Cond is false. | 2022-03-03 04:32:13 +00:00
Antonio Sánchez | 9c07e201ff | Modified sqrt/rsqrt for denormal handling. | 2022-03-02 17:20:47 +00:00
Antonio Sánchez | 2ed4bee78f | Fix frexp packetmath tests for MSVC. | 2022-02-24 22:16:37 +00:00
Antonio Sánchez | 3d7e2d0e3e | Fix packetmath compilation error. | 2022-02-23 23:27:08 +00:00
Antonio Sánchez | 8970719771 | Fix gcc-5 packetmath_12 bug. | 2022-02-23 21:56:25 +00:00
Rasmus Munk Larsen | 8b875dbef1 | Changes to fast SQRT/RSQRT | 2022-02-23 17:32:21 +00:00
Antonio Sánchez | 28e008b99a | Fix sqrt/rsqrt for NEON. | 2022-02-15 21:31:51 +00:00
Rasmus Munk Larsen | 979fdd58a4 | Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments. | 2022-02-05 00:20:13 +00:00
Rasmus Munk Larsen | 7db0ac977a | Remove extraneous ")". | 2022-01-27 02:20:03 +00:00
Rasmus Munk Larsen | 09c0085a57 | Only test pmsub, pnmadd, and pnmsub on signed types. | 2022-01-27 02:09:25 +00:00
Rasmus Munk Larsen | 8f2c6f0aa6 | Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf. | 2022-01-26 20:38:05 +00:00
Rasmus Munk Larsen | 51311ec651 | Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub. | 2022-01-26 04:25:41 +00:00
Rasmus Munk Larsen | ea2c02060c | Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512. | 2022-01-21 23:49:18 +00:00
Rasmus Munk Larsen | 96dc37a03b | Some fixes/cleanups for numeric_limits & fix for related bug in psqrt | 2022-01-07 01:10:17 +00:00
Rasmus Munk Larsen | 7b5a8b6bc5 | Improve plog: 20% speedup for float + handle denormals | 2022-01-05 23:40:31 +00:00
Rasmus Munk Larsen | f04fd8b168 | Make sure exp(-Inf) is zero for vectorized expressions. This fixes #2385. | 2021-12-08 17:57:23 +00:00