264 Commits

Author SHA1 Message Date
Antonio Sánchez
b860042263 Add postream for ostream-ing packets more reliably. 2025-04-01 22:12:00 +00:00
Antonio Sanchez
8e32cbf7da Reduce flakiness of test for Eigen::half. 2025-03-23 22:31:25 -07:00
Antonio Sánchez
d935916ac6 Add numext::fma and missing pmadd implementations. 2025-03-23 01:05:53 +00:00
Antonio Sánchez
70f2aead9a Use native _Float16 for AVX512FP16 and update vectorization. 2025-03-19 19:55:26 +00:00
Charles Schlosser
10e62ccd22 Fix x86 complex vectorized fma 2025-03-12 17:06:32 +00:00
Antonio Sánchez
d79bac0d3c Fix boolean scatter and random generation for tensors. 2025-02-25 21:37:09 +00:00
Rasmus Munk Larsen
5064cb7d5e Add test for using pcast on scalars. 2024-11-25 22:27:26 -08:00
Rasmus Munk Larsen
3f067c4850 Add exp2() as a packet op and array method. 2024-10-22 22:09:34 +00:00
Charles Schlosser
9d3d37c5b7 Complex Numtraits::HasSign and nmsub test 2024-08-28 03:02:47 +00:00
Charles Schlosser
fb95e90f7f Add truncation op 2024-04-29 23:45:49 +00:00
Antonio Sánchez
a5e147305b Fix undefined behavior for generating inputs to the predux_mul test. 2024-04-29 20:32:09 +00:00
Antonio Sánchez
dcceb9afec Unbork avx512 preduce_mul on MSVC. 2024-04-26 15:28:03 +00:00
Charles Schlosser
122befe54c Fix "unary minus operator applied to unsigned type, result still unsigned" on MSVC and other stupid warnings 2024-04-12 19:35:04 +00:00
Antonio Sánchez
17f3bf8985 Fix pexp test for ARM. 2024-03-07 00:19:57 +00:00
Antonio Sánchez
3e8e63eb46 Fix packetmath plog test on Windows. 2024-03-06 23:51:47 +00:00
Antonio Sánchez
38fcedaf8e Fix pexp complex test edge-cases. 2024-03-04 17:44:38 +00:00
Charles Schlosser
8a4118746e fix exp complex test: use int instead of index 2024-02-17 03:55:32 +00:00
Charles Schlosser
18a161bf17 fix pexp_complex_test 2024-02-17 03:08:23 +00:00
Damiano Franzò
be06c9ad51 Implement float pexp_complex 2024-02-17 00:26:57 +00:00
Antonio Sánchez
f40ad38fda Fix failure on ARM with latest compilers. 2024-02-14 23:00:56 +00:00
Antonio Sánchez
6ea33f95df Eliminate warning about writing bytes directly to non-trivial type. 2024-02-12 23:27:48 +00:00
Antonio Sánchez
7b87b21910 Fix UB in bool packetmath test. 2024-02-09 19:46:45 +00:00
Antonio Sánchez
a9ddab3e06 Fix a bunch of ODR violations. 2024-01-30 22:38:43 +00:00
Damiano Franzò
7fd7a3f946 Implement plog_complex 2024-01-30 19:06:05 +00:00
Antonio Sánchez
46e9cdb7fe Clang-format tests, examples, libraries, benchmarks, etc. 2023-12-05 21:22:55 +00:00
Charles Schlosser
81b48065ea Fix arm32 float division and related bugs 2023-08-29 00:36:07 +00:00
Pedro Gonnet
17b5b4de58 Add Packet4ui, Packet8ui, and Packet4ul to the SSE/AVX PacketMath.h headers 2023-04-17 23:33:59 +00:00
Antonio Sánchez
394aabb0a3 Fix failing MSVC tests due to compiler bugs. 2023-03-10 22:36:57 +00:00
Rasmus Munk Larsen
ce62177b5b Vectorize atanh & add a missing definition and unit test for atan. 2023-02-21 03:14:05 +00:00
Antonio Sánchez
8588d8c74b Correct pnegate for floating-point zero. 2022-11-15 18:07:23 +00:00
Rasmus Munk Larsen
97e0784dc6 Vectorize the sign operator in Eigen. 2022-08-09 19:54:57 +00:00
Antonio Sánchez
39d22ef46b Fix flaky packetmath_1 test. 2022-08-02 17:42:45 +00:00
Chip Kerchner
84cf3ff18d Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial. 2022-06-27 19:18:00 +00:00
Erik Schultheis
421cbf0866 Replace Eigen type metaprogramming with corresponding std types and make use of alias templates 2022-03-16 16:43:40 +00:00
Antonio Sánchez
711803c427 Skip denormal test if Cond is false. 2022-03-03 04:32:13 +00:00
Antonio Sánchez
9c07e201ff Modified sqrt/rsqrt for denormal handling. 2022-03-02 17:20:47 +00:00
Antonio Sánchez
2ed4bee78f Fix frexp packetmath tests for MSVC. 2022-02-24 22:16:37 +00:00
Antonio Sánchez
3d7e2d0e3e Fix packetmath compilation error. 2022-02-23 23:27:08 +00:00
Antonio Sánchez
8970719771 Fix gcc-5 packetmath_12 bug. 2022-02-23 21:56:25 +00:00
Rasmus Munk Larsen
8b875dbef1 Changes to fast SQRT/RSQRT 2022-02-23 17:32:21 +00:00
Antonio Sánchez
28e008b99a Fix sqrt/rsqrt for NEON. 2022-02-15 21:31:51 +00:00
Rasmus Munk Larsen
979fdd58a4 Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments. 2022-02-05 00:20:13 +00:00
Rasmus Munk Larsen
7db0ac977a Remove extraneous ")". 2022-01-27 02:20:03 +00:00
Rasmus Munk Larsen
09c0085a57 Only test pmsub, pnmadd, and pnmsub on signed types. 2022-01-27 02:09:25 +00:00
Rasmus Munk Larsen
8f2c6f0aa6 Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf. 2022-01-26 20:38:05 +00:00
Rasmus Munk Larsen
51311ec651 Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub. 2022-01-26 04:25:41 +00:00
Rasmus Munk Larsen
ea2c02060c Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512. 2022-01-21 23:49:18 +00:00
Rasmus Munk Larsen
96dc37a03b Some fixes/cleanups for numeric_limits & fix for related bug in psqrt 2022-01-07 01:10:17 +00:00
Rasmus Munk Larsen
7b5a8b6bc5 Improve plog: 20% speedup for float + handle denormals 2022-01-05 23:40:31 +00:00
Rasmus Munk Larsen
f04fd8b168 Make sure exp(-Inf) is zero for vectorized expressions. This fixes #2385. 2021-12-08 17:57:23 +00:00