eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-08-02 10:10:37 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	b477d60bc6	Extend the generic psin_float code to handle cosine and make SSE and AVX use it (-> this adds pcos for AVX)	2018-11-30 11:26:30 +01:00
Gael Guennebaud	fa7fd61eda	Unify SSE/AVX psin functions. It is based on the SSE version which is much more accurate, though very slightly slower. This changeset also includes the following required changes: - add packet-float to packet-int type traits - add packet float<->int reinterpret casts - add faster pselect for AVX based on blendv	2018-11-27 22:41:51 +01:00
Gael Guennebaud	7655a8af6e	cleanup	2018-11-26 23:21:29 +01:00
Gael Guennebaud	502f92fa10	Unify SSE and AVX pexp for double.	2018-11-26 23:12:44 +01:00
Gael Guennebaud	cf8b85d5c5	Unify SSE and AVX implementation of pexp	2018-11-26 16:36:19 +01:00
Gael Guennebaud	2c44c40114	First step toward a unification of packet log implementation, currently only SSE and AVX are unified. To this end, I added the following functions: pzero, pcmp_*, pfrexp, pset1frombits functions.	2018-11-26 14:21:24 +01:00
Rasmus Munk Larsen	47150af1c8	Fix copy-paste error: Must use _mm256_cmp_ps for AVX.	2016-10-12 08:34:39 -07:00
Rasmus Munk Larsen	7f67e6dfdb	Update comment for fast sqrt.	2016-10-04 15:09:11 -07:00
Rasmus Munk Larsen	3ed67cb0bb	Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments. Benchmark speed in Giga-sqrts/s Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz ----------------------------------------- SSE AVX Fast=1 2.529G 4.380G Fast=0 1.944G 1.898G Fast=1 fixed 2.214G 3.739G This table illustrates the worst case in terms speed impact: It was measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the different versions almost negligible.	2016-10-04 14:22:56 -07:00
Gael Guennebaud	a4c266f827	Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only.	2016-08-23 14:23:08 +02:00
Benoit Steiner	8ce46f9d89	Improved implementation of ptanh for SSE and AVX	2016-02-18 13:24:34 -08:00
Benoit Steiner	6d8b1dce06	Avoid implicit cast from double to float.	2016-02-10 18:07:11 -08:00
Benoit Steiner	2d523332b3	Optimized implementation of the hyperbolic tangent function for AVX	2016-02-10 08:48:05 -08:00
Gael Guennebaud	d2e288ae50	Workaround compilers that do not even define _mm256_set_m128.	2015-12-24 16:53:43 +01:00
Benoit Steiner	6d777e1bc7	Fixed a typo.	2015-12-18 19:25:50 -08:00
Gael Guennebaud	3abd8470ca	bug #1140 : remove custom definition and use of _mm256_setr_m128	2015-12-18 14:18:59 +01:00
Gael Guennebaud	75861f6650	bug #1069 : fix AVX support on MSVC (use of non portable C-style cast)	2015-09-28 10:08:26 +02:00
Benoit Steiner	1dded10cb7	Added a double-precision implementation of the exp() function for AVX.	2015-05-04 10:42:51 -07:00
Benoit Steiner	0196141938	Fixed the optimized AVX implementation of the fast rsqrt function	2015-03-02 13:49:39 -08:00
Benoit Steiner	4fd7f47692	Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined.	2015-03-02 09:38:47 -08:00
Benoit Steiner	f41b1f1666	Added support for fast reciprocal square root computation.	2015-02-26 09:42:41 -08:00
Benoit Steiner	0927801a84	Optimized version of the sin(), exp(), log() and sqrt() function for AVX	2015-02-13 16:07:08 -08:00

22 Commits