Antonio Sanchez
f149e0ebc3
Fix MSVC complex sqrt and packetmath test.
...
MSVC incorrectly handles `inf` cases for `std::sqrt<std::complex<T>>`.
Here we replace it with a custom version (currently used on GPU).
Also fixed the `packetmath` test, which previously skipped several
corner cases since `CHECK_CWISE1` only tests the first `PacketSize`
elements.
2021-01-08 01:17:19 +00:00
Rasmus Munk Larsen
4e4d3f32d1
Clean up packetmath tests and fix various bugs to make bfloat16 pass (almost) all packetmath tests with SSE, AVX, and AVX512.
2020-10-09 20:05:49 +00:00
Pedro Caldeira
35d149e34c
Add missing functions for Packet8bf in Altivec architecture.
...
Including new tests for bfloat16 Packets.
Fix prsqrt on GenericPacketMath.
2020-09-08 09:22:11 -05:00
Teng Lu
386d809bde
Support BFloat16 in Eigen
2020-06-20 19:16:24 +00:00
Rasmus Munk Larsen
c1d944dd91
Remove packet ops pinsertfirst and pinsertlast that are only used in a single place, and can be replaced by other ops when constructing the first/final packet in linspaced_op_impl::packetOp.
...
I cannot measure any performance changes for SSE, AVX, or AVX512.
name                     old time/op  new time/op  delta
BM_LinSpace<float>/1     1.63ns ± 0%  1.63ns ± 0%  ~  (p=0.762 n=5+5)
BM_LinSpace<float>/8     4.92ns ± 3%  4.89ns ± 3%  ~  (p=0.421 n=5+5)
BM_LinSpace<float>/64    34.6ns ± 0%  34.6ns ± 0%  ~  (p=0.841 n=5+5)
BM_LinSpace<float>/512    217ns ± 0%   217ns ± 0%  ~  (p=0.421 n=5+5)
BM_LinSpace<float>/4k    1.68µs ± 0%  1.68µs ± 0%  ~  (p=1.000 n=5+5)
BM_LinSpace<float>/32k   13.3µs ± 0%  13.3µs ± 0%  ~  (p=0.905 n=5+4)
BM_LinSpace<float>/256k   107µs ± 0%   107µs ± 0%  ~  (p=0.841 n=5+5)
BM_LinSpace<float>/1M     427µs ± 0%   427µs ± 0%  ~  (p=0.690 n=5+5)
2020-05-08 15:41:50 -07:00
Christoph Hertzberg
35219cea68
Bug #1790: Make areApprox check numext::isnan instead of bitwise equality (NaNs don't have to be bitwise equal).
2020-01-11 14:57:22 +01:00
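Illustration for the areApprox change above: a minimal sketch of a NaN-aware comparison, assuming plain arrays and an absolute tolerance. `approx_or_both_nan` is a hypothetical name, not Eigen's actual `areApprox`, and `std::isnan` stands in for `numext::isnan`. Two NaNs may differ in sign or payload bits, so comparing bit patterns can reject a pair that should count as matching.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical helper (not Eigen's areApprox): returns true when every pair
// of elements is either "both NaN" or within an absolute tolerance.
template <typename T>
bool approx_or_both_nan(const T* a, const T* b, std::size_t n, T tol) {
  for (std::size_t i = 0; i < n; ++i) {
    const bool a_nan = std::isnan(a[i]);
    const bool b_nan = std::isnan(b[i]);
    if (a_nan || b_nan) {
      // NaNs match only each other; sign and payload bits are ignored.
      if (a_nan != b_nan) return false;
      continue;
    }
    // Exact equality first, so that equal infinities compare as matching
    // (inf - inf would otherwise produce NaN).
    if (a[i] == b[i]) continue;
    if (!(std::abs(a[i] - b[i]) <= tol)) return false;
  }
  return true;
}
```

A bitwise check (for example `memcmp`) would treat `quiet_NaN()` and `-quiet_NaN()` as different values, while the sketch above accepts them as a matching pair.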
Srinivas Vasudevan
2e099e8d8f
Added special_packetmath test and tweaked bounds on tests.
...
Refactor shared packetmath code to header file.
(Squashed from PR !38)
2020-01-11 10:31:21 +00:00