Rasmus Munk Larsen
3ed67cb0bb
Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments.
...
Benchmark speed in Giga-sqrts/s
Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
-----------------------------------------
              SSE      AVX
Fast=1        2.529G   4.380G
Fast=0        1.944G   1.898G
Fast=1 fixed  2.214G   3.739G
This table illustrates the worst case in terms of speed impact: it was measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the versions become almost negligible.
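The -0.0 result for negative arguments can be illustrated with a scalar sketch of computing sqrt(x) as x * rsqrt(x). This is not Eigen's code: `approx_rsqrt`, `newton_step`, `fast_sqrt_buggy` and `fast_sqrt_fixed` are hypothetical names, and the flush-to-zero comparison is an assumption about how such a result can arise, not a description of the actual patch.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar stand-in for a hardware reciprocal-sqrt estimate.
static float approx_rsqrt(float x) { return 1.0f / std::sqrt(x); }

// One Newton-Raphson refinement step for r ~ 1/sqrt(x).
static float newton_step(float x, float r) {
  return r * (1.5f - 0.5f * x * r * r);
}

// Assumed bug: the flush test also catches negative inputs, so the
// reciprocal estimate is zeroed and x * 0 evaluates to -0.0.
float fast_sqrt_buggy(float x) {
  const bool flush = x < std::numeric_limits<float>::min();
  const float r = flush ? 0.0f : approx_rsqrt(x);
  return x * newton_step(x, r);
}

// Flushing only on |x| lets negative inputs reach the rsqrt estimate,
// which is NaN for them, so NaN propagates to the result.
float fast_sqrt_fixed(float x) {
  const bool flush = std::fabs(x) < std::numeric_limits<float>::min();
  const float r = flush ? 0.0f : approx_rsqrt(x);
  return x * newton_step(x, r);
}
```

Under these assumptions, fast_sqrt_buggy(-4.0f) yields a zero with the sign bit set, while fast_sqrt_fixed(-4.0f) yields NaN. The sketch does not handle infinite inputs.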
2016-10-04 14:22:56 -07:00
Gael Guennebaud
a4c266f827
Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only.
2016-08-23 14:23:08 +02:00
Benoit Steiner
8ce46f9d89
Improved implementation of ptanh for SSE and AVX
2016-02-18 13:24:34 -08:00
Benoit Steiner
6d8b1dce06
Avoid implicit cast from double to float.
2016-02-10 18:07:11 -08:00
Benoit Steiner
2d523332b3
Optimized implementation of the hyperbolic tangent function for AVX
2016-02-10 08:48:05 -08:00
Gael Guennebaud
d2e288ae50
Workaround compilers that do not even define _mm256_set_m128.
2015-12-24 16:53:43 +01:00
Benoit Steiner
6d777e1bc7
Fixed a typo.
2015-12-18 19:25:50 -08:00
Gael Guennebaud
3abd8470ca
bug #1140 : remove custom definition and use of _mm256_setr_m128
2015-12-18 14:18:59 +01:00
Gael Guennebaud
75861f6650
bug #1069 : fix AVX support on MSVC (use of non portable C-style cast)
2015-09-28 10:08:26 +02:00
Benoit Steiner
1dded10cb7
Added a double-precision implementation of the exp() function for AVX.
2015-05-04 10:42:51 -07:00
Benoit Steiner
0196141938
Fixed the optimized AVX implementation of the fast rsqrt function
2015-03-02 13:49:39 -08:00
Benoit Steiner
4fd7f47692
Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined.
2015-03-02 09:38:47 -08:00
Benoit Steiner
f41b1f1666
Added support for fast reciprocal square root computation.
2015-02-26 09:42:41 -08:00
Benoit Steiner
0927801a84
Optimized versions of the sin(), exp(), log() and sqrt() functions for AVX
2015-02-13 16:07:08 -08:00