eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-03 03:35:11 +08:00

Author	SHA1	Message	Date
Rasmus Munk Larsen	a3298b22ec	Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments. Depending on instruction set, significant speedups are observed for the vectorized path: log1p wall time is reduced 60-93% (2.5x - 15x speedup); expm1 wall time is reduced 0-85% (1x - 7x speedup). The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly. Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM	2019-08-12 13:53:28 -07:00
Gael Guennebaud	3492a1ca74	fix plog(+inf) with AVX512	2019-01-09 16:53:37 +01:00
Gael Guennebaud	fa87f9d876	Add psin/pcos on AVX512 -> almost for free, at last!	2018-11-30 14:33:13 +01:00
Gael Guennebaud	0f780bb0b4	Fix float-to-double warning	2018-10-16 09:19:45 +02:00
Gael Guennebaud	97e2c808e9	Fix avx512 plog(NaN) to return NaN instead of +inf	2018-10-11 10:13:13 +02:00
Gael Guennebaud	b3f66d29a5	Enable avx512 plog with clang	2018-10-11 10:12:21 +02:00
Mark D Ryan	bc615e4585	Re-enable FMA for fast sqrt functions	2018-07-30 13:21:00 +02:00
Mark D Ryan	e79c5149bf	Fix AVX512 implementations of psqrt. This commit fixes the AVX512 implementations of psqrt in the same way that 3ed67cb0bb4af65fbf243df598604a8c7630bf7d fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values, instead of NaN. Fixing the issues requires adding some additional instructions that slow down the algorithms. A similar test to the one used in 3ed67cb0bb4af65fbf243df598604a8c7630bf7d shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original.	2018-06-25 05:05:02 -07:00
Jayaram Bobba	b7b868d1c4	fix AVX512 plog	2018-04-20 13:39:18 -07:00
Gael Guennebaud	40b4bf3d32	AVX512: _mm512_rsqrt28_ps is available for AVX512ER only	2018-04-03 14:36:27 +02:00
Gael Guennebaud	7b0630315f	AVX512: fix psqrt and prsqrt	2018-04-03 14:12:50 +02:00
Benoit Steiner	d37ee89ca8	Disabled some of the AVX512 primitives on compilers that don't support them	2016-04-29 12:50:29 -07:00
Benoit Steiner	3ca1ae2bb7	Commented out the version of pexp<Packet8d> since it fails to compile with gcc 5.3	2016-02-04 13:49:06 -08:00
Benoit Steiner	23f69ab936	Added implementations of pexp, plog, psqrt, and prsqrt optimized for AVX512	2016-02-04 10:36:36 -08:00

14 Commits
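
For readers unfamiliar with the "Kahan's formulas" mentioned in commit a3298b22ec, the following is a minimal scalar sketch in plain C++, not Eigen's actual code. The function names `log1p_kahan` and `expm1_kahan` are hypothetical, and the explicit +infinity checks are an assumption about what the "extra branch" in that commit message refers to. Without those checks, the correction factors evaluate to inf/inf and yield NaN.

```cpp
#include <cmath>
#include <limits>

// Hypothetical names; a sketch of Kahan's formulas, not Eigen's implementation.

// log1p(x) = log(1 + x), accurate for small |x|.
double log1p_kahan(double x) {
  const double inf = std::numeric_limits<double>::infinity();
  const double u = 1.0 + x;
  if (u == 1.0) return x;   // x is lost in rounding: log1p(x) ~= x
  if (u == inf) return u;   // x = +inf: avoid inf / inf = NaN below
  // log(u) is exact for the rounded u; x / (u - 1) corrects the rounding of 1 + x.
  return std::log(u) * (x / (u - 1.0));
}

// expm1(x) = exp(x) - 1, accurate for small |x|.
double expm1_kahan(double x) {
  const double inf = std::numeric_limits<double>::infinity();
  const double u = std::exp(x);
  if (u == 1.0) return x;         // exp(x) rounded to 1: expm1(x) ~= x
  const double um1 = u - 1.0;
  if (um1 == -1.0) return -1.0;   // exp(x) is negligible next to 1
  if (u == inf) return u;         // overflow or x = +inf: avoid inf / inf = NaN
  // (u - 1) * x / log(u) corrects the rounding error in exp(x).
  return um1 * (x / std::log(u));
}
```

The vectorized path described in the commit would presumably replace these branches with per-lane selects, which is consistent with the message attributing the slowdown to the scalar path only.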