eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-08-03 10:40:39 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	b477d60bc6	Extend the generic psin_float code to handle cosine and make SSE and AVX use it (-> this adds pcos for AVX)	2018-11-30 11:26:30 +01:00
Gael Guennebaud	fa7fd61eda	Unify SSE/AVX psin functions. It is based on the SSE version which is much more accurate, though very slightly slower. This changeset also includes the following required changes: - add packet-float to packet-int type traits - add packet float<->int reinterpret casts - add faster pselect for AVX based on blendv	2018-11-27 22:41:51 +01:00
Gael Guennebaud	7655a8af6e	cleanup	2018-11-26 23:21:29 +01:00
Gael Guennebaud	502f92fa10	Unify SSE and AVX pexp for double.	2018-11-26 23:12:44 +01:00
Gael Guennebaud	cf8b85d5c5	Unify SSE and AVX implementation of pexp	2018-11-26 16:36:19 +01:00
Gael Guennebaud	2c44c40114	First step toward a unification of packet log implementation, currently only SSE and AVX are unified. To this end, I added the following functions: pzero, pcmp_*, pfrexp, pset1frombits functions.	2018-11-26 14:21:24 +01:00
luz.paz	e3912f5e63	Misc. source and comment typos. Found using `codespell` and `grep` from downstream FreeCAD	2018-03-11 10:01:44 -04:00
Rasmus Munk Larsen	765615609d	Update comment for fast sqrt.	2016-10-04 15:08:41 -07:00
Rasmus Munk Larsen	3ed67cb0bb	Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments. Benchmark speed in Giga-sqrts/s on an Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz (SSE / AVX): Fast=1: 2.529G / 4.380G; Fast=0: 1.944G / 1.898G; Fast=1 fixed: 2.214G / 3.739G. These figures illustrate the worst case in terms of speed impact: they were measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the versions become almost negligible.	2016-10-04 14:22:56 -07:00
Gael Guennebaud	a4c266f827	Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only.	2016-08-23 14:23:08 +02:00
Benoit Steiner	8ce46f9d89	Improved implementation of ptanh for SSE and AVX	2016-02-18 13:24:34 -08:00
Benoit Steiner	6d8b1dce06	Avoid implicit cast from double to float.	2016-02-10 18:07:11 -08:00
Benoit Steiner	bfb3fcd94f	Optimized implementation of the tanh function for SSE	2016-02-10 08:52:30 -08:00
Benoit Jacob	e6ee18d6b4	Make the GCC workaround for sqrt GCC-only; detect Emscripten as non-GCC	2016-02-10 11:11:49 -05:00
Benoit Jacob	964a95bf5e	Work around Emscripten bug - https://github.com/kripken/emscripten/issues/4088	2016-02-10 10:37:22 -05:00
Gael Guennebaud	7cae8918c0	Fix compilation on old gcc+AVX	2016-01-21 20:30:32 +01:00
Gael Guennebaud	8dca9f97e3	Add numext::sqrt function to enable custom optimized implementation. This changeset adds two specializations for float/double on SSE. Those are mostly useful with GCC, for which std::sqrt adds an extra and costly check on the result of _mm_sqrt_*. Clang does not add this burden. In this changeset, only DenseBase::norm() makes use of it.	2016-01-21 20:18:51 +01:00
Benoit Steiner	4fd7f47692	Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined.	2015-03-02 09:38:47 -08:00
Benoit Steiner	f41b1f1666	Added support for fast reciprocal square root computation.	2015-02-26 09:42:41 -08:00
Gael Guennebaud	eb563049f7	Remove some dead stores.	2015-02-18 11:26:48 +01:00
Christoph Hertzberg	84aaa03182	Addendum to bug #859 : pexp(NaN) for double did not return NaN, also, plog(NaN) did not return NaN. psqrt(NaN) and psqrt(-1) shall return NaN if EIGEN_FAST_MATH==0	2014-10-20 13:13:43 +02:00
Gael Guennebaud	aa5f79206f	Fix bug #859 : pexp(NaN) returned Inf instead of NaN	2014-10-20 11:38:51 +02:00
Gael Guennebaud	5c5231ab71	Workaround gcc's default ABI not being able to distinguish between vector types of different sizes.	2014-04-22 16:03:19 +02:00
Gael Guennebaud	9f3f42d66a	fix a few "dead stores" warnings	2013-10-26 13:59:02 +02:00
Gael Guennebaud	c47010e3d2	typo	2013-08-19 16:10:00 +02:00
Gael Guennebaud	d4dd6aaed2	Fix bug #642 : add vectorization of sqrt for doubles, and make sqrt really safe if EIGEN_FAST_MATH is disabled	2013-08-19 16:02:27 +02:00
Gael Guennebaud	9f11f80db1	Make psqrt work with numeric_limits<float>::min	2013-06-14 10:55:05 +02:00
Jeff Dean	d5fa5001a7	Fix bug #613 : psqrt was incorrect for small numbers	2013-06-13 18:17:27 +02:00
Gael Guennebaud	8745da14d8	Fix SSE plog<float> to return -INF on 0	2013-02-14 23:34:05 +01:00
Gael Guennebaud	7d98c864ff	fix warning	2012-08-01 10:44:59 +02:00
Gael Guennebaud	22e0ebbc2c	fix lower acceptable bound of SSE pexp for double	2012-07-31 23:11:04 +02:00
Gael Guennebaud	e8aa1f00c5	add SSE pexp function for double, make use of _mm_floor_p* for pexp with SSE4.1	2012-07-27 23:40:04 +02:00
Benoit Jacob	69124cfca2	Automatic relicensing to MPL2 using Keirs script. Manual fixup follows.	2012-07-13 14:42:47 -04:00
Gael Guennebaud	a3e700db72	fix bug #475 : .exp() now returns +inf when overflow occurs (SSE)	2012-06-14 10:38:39 +02:00
Jitse Niesen	3c412183b2	Get rid of include directives inside namespace blocks (bug #339 ).	2012-04-15 11:06:28 +01:00
Benoit Jacob	4716040703	bug #86 : use internal:: namespace instead of ei_ prefix	2010-10-25 10:15:22 -04:00
Gael Guennebaud	aa2b46aa91	allow vectorization of mat44.col() by adding a InnerPanel boolean template parameter to Block	2010-07-23 16:29:29 +02:00
Benoit Jacob	97ced33b33	Backed out changeset 40f6e26a247976ba1868520a4747e49e0739a42a See thread on mailing list: "InnerPanel change mis-detects alignment?"	2010-08-11 00:04:06 -04:00
Gael Guennebaud	40f6e26a24	allow vectorization of mat44.col() by adding a InnerPanel boolean template parameter to Block	2010-07-23 16:29:29 +02:00
Gael Guennebaud	ff96c94043	mixing types in product step 2: * pload* and pset1 are now templated on the packet type * gemv routines are now embedded into a structure with a consistent API with respect to gemm * some configurations of vector * matrix and matrix * matrix work fine, some need more work...	2010-07-11 15:48:30 +02:00
Gael Guennebaud	28e64b0da3	email change	2010-06-24 23:21:58 +02:00
Benoit Jacob	f0a6d56f07	fix linking errors with multiply defined functions	2010-06-18 09:01:34 -04:00
Benoit Jacob	134ca4acb3	packet math functions: - take const Packet& args like the other packet funcs - SSE specializations: make them be actual template specializations	2010-06-15 08:29:21 -04:00
Hauke Heibel	9d6afdeb22	ei_psqrt fix for zero input	2010-04-01 15:10:52 +02:00
Hauke Heibel	3ea1f97f69	Suppressed the warning for missing assignment generators (forgot that in the last submission). Commented Quake3's fast inverse sqrt in SSE's MathFunction header.	2009-12-15 08:09:14 +01:00
Benoit Jacob	6347b1db5b	remove sentence "Eigen itself is part of the KDE project." it never made very precise sense. but now does it still make any?	2009-05-22 20:25:33 +02:00
Gael Guennebaud	1e286464ab	* compilation fixes for gcc 3.3 * test Part::swap	2009-05-06 08:43:38 +00:00
Benoit Jacob	b60571a193	fix warnings with unused static functions	2009-05-04 12:49:56 +00:00
Gael Guennebaud	c7bb7436f9	make the ei_p* math functions overloads instead of template specializations	2009-04-22 21:35:50 +00:00
Benoit Jacob	0c99de5a17	more patches from Hauke Heibel: compilation/warning fixes from VC++	2009-04-09 17:19:17 +00:00


53 Commits