Gael Guennebaud
0f780bb0b4
Fix float-to-double warning
2018-10-16 09:19:45 +02:00
Gael Guennebaud
43633fbaba
Fix warning with AVX512f
2018-10-11 10:13:48 +02:00
Gael Guennebaud
97e2c808e9
Fix avx512 plog(NaN) to return NaN instead of +inf
2018-10-11 10:13:13 +02:00
Gael Guennebaud
b3f66d29a5
Enable avx512 plog with clang
2018-10-11 10:12:21 +02:00
Gael Guennebaud
626942d9dd
fix alignment issue in ploaddup for AVX512
2018-09-28 16:57:32 +02:00
Gael Guennebaud
5a30eed17e
Fix warnings in AVX512
2018-09-20 16:58:51 +02:00
Christoph Hertzberg
ad4a08fb68
Use Intel cast intrinsics, since MSVC does not allow direct casting.
...
Reported by David Winkler.
2018-08-24 19:04:33 +02:00
Mark D Ryan
bc615e4585
Re-enable FMA for fast sqrt functions
2018-07-30 13:21:00 +02:00
Mark D Ryan
e79c5149bf
Fix AVX512 implementations of psqrt
...
This commit fixes the AVX512 implementations of psqrt in the same
way that 3ed67cb0bb4af65fbf243df598604a8c7630bf7d
fixed the AVX2 version of this function. The
AVX512 versions of psqrt incorrectly return -0.0 for negative
values, instead of NaN. Fixing the issues requires adding
some additional instructions that slow down the algorithms. A
similar test to the one used in 3ed67cb0bb4af65fbf243df598604a8c7630bf7d
shows that the
corrected Packet16f code runs at 73% of the speed of the existing code,
while the corrected Packed8d function runs at 68% of the original.
2018-06-25 05:05:02 -07:00
Gael Guennebaud
7134fa7a2e
Fix compilation with MSVC by reverting to char* for _mm_prefetch except for PGI (the later being the one that has the wrong prototype).
2018-06-07 09:33:10 +02:00
Jayaram Bobba
b7b868d1c4
fix AVX512 plog
2018-04-20 13:39:18 -07:00
Gael Guennebaud
40b4bf3d32
AVX512: _mm512_rsqrt28_ps is available for AVX512ER only
2018-04-03 14:36:27 +02:00
Gael Guennebaud
584951ca4d
Rename predux_downto4 to be more accurate on its semantic.
2018-04-03 14:28:38 +02:00
Gael Guennebaud
7b0630315f
AVX512: fix psqrt and prsqrt
2018-04-03 14:12:50 +02:00
Gael Guennebaud
6719409cd9
AVX512: add missing pinsertfirst and pinsertlast, implement pblend for Packet8d, fix compilation without AVX512DQ
2018-04-03 14:11:56 +02:00
Rasmus Munk Larsen
7b6aaa3440
Fix NaN propagation for AVX512.
2017-01-24 13:37:08 -08:00
Benoit Steiner
354baa0fb1
Avoid using horizontal adds since they're not very efficient.
2016-12-21 20:55:07 -08:00
Benoit Steiner
d7825b6707
Use native AVX512 types instead of Eigen Packets whenever possible.
2016-12-21 20:06:18 -08:00
Benoit Steiner
923acadfac
Fixed compilation errors with gcc6 when compiling the AVX512 intrinsics
2016-12-19 13:02:27 -08:00
Benoit Steiner
507b661106
Renamed predux_half into predux_downto4
2016-10-06 17:57:04 -07:00
Benoit Steiner
a498ff7df6
Fixed incorrect comment
2016-10-06 15:27:27 -07:00
Benoit Steiner
a7473d6d5a
Fixed compilation error with gcc >= 5.3
2016-10-06 14:33:22 -07:00
Benoit Steiner
5e64cea896
Silenced a compilation warning
2016-10-06 14:24:17 -07:00
Benoit Steiner
4131074818
Deleted unecessary CMakeLists.txt file
2016-10-05 18:54:35 -07:00
Benoit Steiner
cb5cd69872
Silenced a compilation warning.
2016-10-05 18:50:53 -07:00
Benoit Steiner
9c2b6c049b
Silenced a few compilation warnings
2016-10-05 18:37:31 -07:00
Benoit Steiner
fa5a8f055a
Implemented palign_impl for AVX512
2016-04-29 13:30:13 -07:00
Benoit Steiner
ef3ac9d05a
Fixed the AVX512 packet traits
2016-04-29 13:28:36 -07:00
Benoit Steiner
d7b75e8d86
Added pdiv packet primitives for avx512
2016-04-29 13:26:47 -07:00
Benoit Steiner
5e89ded685
Implemented preduxp for AVX512
2016-04-29 13:00:33 -07:00
Benoit Steiner
5f85662ad8
Implemented the pabs and preverse primitives for avx512.
2016-04-29 12:53:34 -07:00
Benoit Steiner
d37ee89ca8
Disabled some of the AVX512 primitives on compilers that don't support them
2016-04-29 12:50:29 -07:00
Benoit Steiner
8bfe739cd2
Updated the AVX512 PacketMath to properly leverage the AVX512DQ instructions
2016-04-11 18:40:16 -07:00
Benoit Steiner
3ca1ae2bb7
Commented out the version of pexp<Packet8d> since it fails to compile with gcc 5.3
2016-02-04 13:49:06 -08:00
Benoit Steiner
23f69ab936
Added implementations of pexp, plog, psqrt, and prsqrt optimized for AVX512
2016-02-04 10:36:36 -08:00
Benoit Steiner
6c9cf117c1
Fixed indentation
2016-02-04 10:34:10 -08:00
Benoit Steiner
85b6d82b49
Generalized predux4 to support AVX512 packets, and renamed it predux_half.
...
Disabled the implementation of pabs for avx512 since the corresponding intrinsics are not shipped with gcc
2016-02-01 14:35:51 -08:00
Benoit Steiner
0366478df8
Added alignment requirement to the AVX512 packet traits.
2016-01-14 17:02:39 -08:00
Benoit Steiner
3cfd16f3af
Fixed the signature of the plset primitives for AVX512
2016-01-14 16:58:01 -08:00
Benoit Steiner
67f44365ea
Fixed the AVX512 signature of the ptranspose primitives
2016-01-14 16:51:11 -08:00
Benoit Steiner
a282eb1363
pscatter/pgather use Index instead of int to specify the stride
2016-01-14 16:39:39 -08:00
Benoit Steiner
7832485575
Deleted unnecessary commas and semicolons
2016-01-14 16:36:29 -08:00
Benoit Steiner
b74887d5f2
Implemented most of the packet primitives for AVX512
2015-12-21 11:46:36 -08:00
Benoit Steiner
9a415fb1e2
Preliminary support for AVX512
2015-12-10 15:34:57 -08:00