Gael Guennebaud
b477d60bc6
Extend the generic psin_float code to handle cosine and make SSE and AVX use it (-> this adds pcos for AVX)
2018-11-30 11:26:30 +01:00
Gael Guennebaud
fa7fd61eda
Unify SSE/AVX psin functions.
...
It is based on the SSE version which is much more accurate, though very slightly slower.
This changeset also includes the following required changes:
- add packet-float to packet-int type traits
- add packet float<->int reinterpret casts
- add faster pselect for AVX based on blendv
2018-11-27 22:41:51 +01:00
Gael Guennebaud
7655a8af6e
cleanup
2018-11-26 23:21:29 +01:00
Gael Guennebaud
502f92fa10
Unify SSE and AVX pexp for double.
2018-11-26 23:12:44 +01:00
Gael Guennebaud
cf8b85d5c5
Unify SSE and AVX implementation of pexp
2018-11-26 16:36:19 +01:00
Gael Guennebaud
2c44c40114
First step toward a unification of packet log implementation, currently only SSE and AVX are unified.
...
To this end, I added the following functions: pzero, pcmp_*, pfrexp, pset1frombits.
2018-11-26 14:21:24 +01:00
luz.paz
e3912f5e63
Misc. source and comment typos
...
Found using `codespell` and `grep` from downstream FreeCAD
2018-03-11 10:01:44 -04:00
Rasmus Munk Larsen
765615609d
Update comment for fast sqrt.
2016-10-04 15:08:41 -07:00
Rasmus Munk Larsen
3ed67cb0bb
Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments.
...
Benchmark speed in Giga-sqrts/s
Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
-----------------------------------------
               SSE       AVX
Fast=1         2.529G    4.380G
Fast=0         1.944G    1.898G
Fast=1 fixed   2.214G    3.739G
This table illustrates the worst case in terms of speed impact: it was measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the versions are almost negligible.
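As a rough illustration of the behaviour this fix is about, here is a scalar sketch of a reciprocal-sqrt-based square root that flushes tiny non-negative inputs to zero while letting negative inputs produce NaN. It is not the code from this changeset: `rsqrt_estimate` and `fast_sqrt_sketch` are hypothetical names, and the exact `1.0f / std::sqrt(x)` merely stands in for a hardware estimate such as `_mm_rsqrt_ps`.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for a hardware reciprocal-sqrt estimate.
// Like the hardware instruction, it yields +inf for 0 and NaN for negative inputs.
inline float rsqrt_estimate(float x) { return 1.0f / std::sqrt(x); }

// Sketch of sqrt(a) computed as a * rsqrt(a), refined by one Newton step.
inline float fast_sqrt_sketch(float a) {
  const float min_normal = std::numeric_limits<float>::min();
  // Flush only non-negative inputs below the smallest normal float.
  // Negative inputs are excluded on purpose so that their NaN survives;
  // flushing them as well would return a zero instead of NaN.
  const bool flush_to_zero = (a < min_normal) && !(a < 0.0f);
  float y = rsqrt_estimate(a);
  // One Newton iteration for f(y) = 1/y^2 - a.
  y = y * (1.5f - 0.5f * a * y * y);
  return flush_to_zero ? 0.0f : a * y;
}
```

With this masking, a zero input yields zero (the `0 * inf` product is discarded), while a negative or NaN input yields NaN. Note that compiling such code with aggressive floating-point options may change NaN handling.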
2016-10-04 14:22:56 -07:00
Gael Guennebaud
a4c266f827
Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only.
2016-08-23 14:23:08 +02:00
Benoit Steiner
8ce46f9d89
Improved implementation of ptanh for SSE and AVX
2016-02-18 13:24:34 -08:00
Benoit Steiner
6d8b1dce06
Avoid implicit cast from double to float.
2016-02-10 18:07:11 -08:00
Benoit Steiner
bfb3fcd94f
Optimized implementation of the tanh function for SSE
2016-02-10 08:52:30 -08:00
Benoit Jacob
e6ee18d6b4
Make the GCC workaround for sqrt GCC-only; detect Emscripten as non-GCC
2016-02-10 11:11:49 -05:00
Benoit Jacob
964a95bf5e
Work around Emscripten bug - https://github.com/kripken/emscripten/issues/4088
2016-02-10 10:37:22 -05:00
Gael Guennebaud
7cae8918c0
Fix compilation on old gcc+AVX
2016-01-21 20:30:32 +01:00
Gael Guennebaud
8dca9f97e3
Add numext::sqrt function to enable custom optimized implementation.
...
This changeset adds two specializations for float/double on SSE. Those
are mostly useful with GCC, for which std::sqrt adds an extra and costly
check on the result of _mm_sqrt_*. Clang does not add this burden.
In this changeset, only DenseBase::norm() makes use of it.
2016-01-21 20:18:51 +01:00
Benoit Steiner
4fd7f47692
Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined.
2015-03-02 09:38:47 -08:00
Benoit Steiner
f41b1f1666
Added support for fast reciprocal square root computation.
2015-02-26 09:42:41 -08:00
Gael Guennebaud
eb563049f7
Remove some dead stores.
2015-02-18 11:26:48 +01:00
Christoph Hertzberg
84aaa03182
Addendum to bug #859 : pexp(NaN) for double did not return NaN, also, plog(NaN) did not return NaN.
...
psqrt(NaN) and psqrt(-1) shall return NaN if EIGEN_FAST_MATH==0
2014-10-20 13:13:43 +02:00
Gael Guennebaud
aa5f79206f
Fix bug #859 : pexp(NaN) returned Inf instead of NaN
2014-10-20 11:38:51 +02:00
Gael Guennebaud
5c5231ab71
Work around gcc's default ABI not being able to distinguish between vector types of different sizes.
2014-04-22 16:03:19 +02:00
Gael Guennebaud
9f3f42d66a
fix a few "dead stores" warnings
2013-10-26 13:59:02 +02:00
Gael Guennebaud
c47010e3d2
typo
2013-08-19 16:10:00 +02:00
Gael Guennebaud
d4dd6aaed2
Fix bug #642 : add vectorization of sqrt for doubles, and make sqrt really safe if EIGEN_FAST_MATH is disabled
2013-08-19 16:02:27 +02:00
Gael Guennebaud
9f11f80db1
Make psqrt work with numeric_limits<float>::min
2013-06-14 10:55:05 +02:00
Jeff Dean
d5fa5001a7
Fix bug #613 : psqrt was incorrect for small numbers
2013-06-13 18:17:27 +02:00
Gael Guennebaud
8745da14d8
Fix SSE plog<float> to return -INF on 0
2013-02-14 23:34:05 +01:00
Gael Guennebaud
7d98c864ff
fix warning
2012-08-01 10:44:59 +02:00
Gael Guennebaud
22e0ebbc2c
fix lower acceptable bound of SSE pexp for double
2012-07-31 23:11:04 +02:00
Gael Guennebaud
e8aa1f00c5
add SSE pexp function for double, make use of _mm_floor_p* for pexp with SSE4.1
2012-07-27 23:40:04 +02:00
Benoit Jacob
69124cfca2
Automatic relicensing to MPL2 using Keirs script. Manual fixup follows.
2012-07-13 14:42:47 -04:00
Gael Guennebaud
a3e700db72
fix bug #475 : .exp() now returns +inf when overflow occurs (SSE)
2012-06-14 10:38:39 +02:00
Jitse Niesen
3c412183b2
Get rid of include directives inside namespace blocks (bug #339 ).
2012-04-15 11:06:28 +01:00
Benoit Jacob
4716040703
bug #86 : use internal:: namespace instead of ei_ prefix
2010-10-25 10:15:22 -04:00
Gael Guennebaud
aa2b46aa91
allow vectorization of mat44.col() by adding an InnerPanel boolean
...
template parameter to Block
2010-07-23 16:29:29 +02:00
Benoit Jacob
97ced33b33
Backed out changeset 40f6e26a247976ba1868520a4747e49e0739a42a
...
See thread on mailing list: "InnerPanel change mis-detects alignment?"
2010-08-11 00:04:06 -04:00
Gael Guennebaud
40f6e26a24
allow vectorization of mat44.col() by adding an InnerPanel boolean
...
template parameter to Block
2010-07-23 16:29:29 +02:00
Gael Guennebaud
ff96c94043
mixing types in product step 2:
...
* pload* and pset1 are now templated on the packet type
* gemv routines are now embedded into a structure with
a consistent API with respect to gemm
* some configurations of vector * matrix and matrix * matrix work fine,
some need more work...
2010-07-11 15:48:30 +02:00
Gael Guennebaud
28e64b0da3
email change
2010-06-24 23:21:58 +02:00
Benoit Jacob
f0a6d56f07
fix linking errors with multiply defined functions
2010-06-18 09:01:34 -04:00
Benoit Jacob
134ca4acb3
packet math functions:
...
- take const Packet& args like the other packet funcs
- SSE specializations: make them be actual template specializations
2010-06-15 08:29:21 -04:00
Hauke Heibel
9d6afdeb22
ei_psqrt fix for zero input
2010-04-01 15:10:52 +02:00
Hauke Heibel
3ea1f97f69
Suppressed the warning for missing assignment generators (forgot that in the last submission).
...
Commented Quake3's fast inverse sqrt in SSE's MathFunction header.
2009-12-15 08:09:14 +01:00
Benoit Jacob
6347b1db5b
remove sentence "Eigen itself is part of the KDE project."
...
It never made very precise sense, but does it still make any now?
2009-05-22 20:25:33 +02:00
Gael Guennebaud
1e286464ab
* compilation fixes for gcc 3.3
...
* test Part::swap
2009-05-06 08:43:38 +00:00
Benoit Jacob
b60571a193
fix warnings with unused static functions
2009-05-04 12:49:56 +00:00
Gael Guennebaud
c7bb7436f9
make the ei_p* math functions overloads instead of template
...
specializations
2009-04-22 21:35:50 +00:00
Benoit Jacob
0c99de5a17
more patches from Hauke Heibel: compilation/warning fixes from VC++
2009-04-09 17:19:17 +00:00