66 Commits

Author SHA1 Message Date
Hauke Heibel
4365a48748 Added an ei_linspaced_op to create linearly spaced vectors.
Added setLinSpaced/LinSpaced functionality to DenseBase.
Improved vectorized assignment - overcomes MSVC optimization issues.
CwiseNullaryOp is now requiring functors to offer 1D and 2D operators.
Adapted existing functors to the new CwiseNullaryOp requirements.
Added ei_plset to create packages as [a, a+1, ..., a+size].
Added more nullary unit tests.
2010-01-26 19:42:17 +01:00
Hauke Heibel
325da2ea3c Fixed conservativeResize.
Fixed multiple overloads for operator=.
Removed debug output.
2010-01-11 13:57:50 +01:00
Gael Guennebaud
eaaba30cac merge with default branch 2009-12-22 22:51:08 +01:00
Gael Guennebaud
6db6774c46 * fix aliasing checks when the lhs is also transposed. At the same time,
significantly simplify the code of these checks while extending them
  to catch much more expressions!
* move the enabling/disabling of vectorized sin/cos to the architecture traits
2009-12-16 11:41:16 +01:00
Hauke Heibel
3ea1f97f69 Suppressed the warning for missing assignment generators (forgot that in the last submission).
Commented Quake3's fast inverse sqrt in SSE's MathFunction header.
2009-12-15 08:09:14 +01:00
Benoit Jacob
684d76eba3 add SSE4 support, start with integer multiplication 2009-11-24 15:12:43 -05:00
Gael Guennebaud
eb8f450071 Hey, finally the copyCoeff stuff is not only used to implement swap anymore :)
Add an internal pseudo expression allowing to optimize operators like +=, *= using
the copyCoeff stuff.
This allows to easily enforce aligned load for the destination matrix everywhere.
2009-11-20 15:39:38 +01:00
Benoit Jacob
92749eed11 * merge
* remove a ctor in QuaternionBase as it gives a strange error with GCC 4.4.2.
2009-11-09 09:08:03 -05:00
Hauke Heibel
3979f6d8aa Let's try to stick to the original code, thus activate the fix of #62 only for 64 bit builds. 2009-11-04 15:49:22 +01:00
Hauke Heibel
e2170b9f7e Direct access of the packet structs fixes bug #62 and does not seem to
influence compiler optimization.
2009-11-04 15:38:11 +01:00
Benoit Jacob
d41577819b we were already aligning to 16 byte boundary fixed-size objects that are multiple of 16 bytes;
now we also align to 8byte boundary fixed-size objects that are multiple of 8 bytes.
That's only useful for now for double, not e.g. for Vector2f, but that didn't seem to hurt. Am I missing something? Do you prefer that we don't align Vector2f at all?
Also, improvements in test_unalignedassert.
2009-10-05 10:11:11 -04:00
Gael Guennebaud
5ba7fe3bee clean the commented asm instructions because now I'm sure
the previous fix is ok
2009-09-17 23:34:00 +02:00
Gael Guennebaud
9395326e44 fix #53: performance regression, hopefully I did not resurrect another
perf. issue...
2009-09-17 23:18:21 +02:00
Gael Guennebaud
ef55e7f4ce make custom asm directive volatile 2009-08-09 23:09:46 +02:00
Gael Guennebaud
d1dc088ef0 * implement a second level of micro blocking (faster for small sizes)
* workaround GCC bad implementation of _mm_set1_p*
2009-08-07 11:09:34 +02:00
Gael Guennebaud
1a1b2e9f27 finally directly calling the low-level products is faster 2009-07-10 10:41:26 +02:00
Benoit Jacob
fc9000f23e only disable the inline ASM if we're NEITHER gcc nor icc. right ?? 2009-06-26 05:32:21 +02:00
Gael Guennebaud
a44f7cf440 re-enable the fast unaligned loads for gcc and icc using inline assembly
(this allows to avoid incompatible pointer casts and to specify the dependency to the data explicitly)
2009-06-24 10:48:36 +02:00
Gael Guennebaud
aa17b5b514 use the slower unaligned load intrinsics in ei_ploadu because GCC messes up my tricks 2009-06-23 23:28:34 +02:00
Benoit Jacob
6347b1db5b remove sentence "Eigen itself is part of the KDE project."
it never made very precise sense. but now does it still make any?
2009-05-22 20:25:33 +02:00
Gael Guennebaud
1e286464ab * compilation fixes for gcc 3.3
* test Part::swap
2009-05-06 08:43:38 +00:00
Benoit Jacob
b60571a193 fix warnings with unused static functions 2009-05-04 12:49:56 +00:00
Gael Guennebaud
c7bb7436f9 make the ei_p* math functions overloads instead of template
specializations
2009-04-22 21:35:50 +00:00
Benoit Jacob
0c99de5a17 more patches from Hauke Heibel: compilation/warning fixes from VC++ 2009-04-09 17:19:17 +00:00
Gael Guennebaud
e8329f9f45 relicence Julien Pommier's SSE code to Eigen's licenses 2009-04-09 06:03:51 +00:00
Benoit Jacob
502bf4a81d * fix the binary bloat issue, Rohit's idea was the good one
* a few dox fixes (alloc routines do return 0 on error) and forgot to update version number in CMakeLists
2009-04-06 13:33:42 +00:00
Gael Guennebaud
49fc1e3e84 add vectorization of sqrt for float 2009-03-27 14:41:46 +00:00
Gael Guennebaud
a22ef7e1f3 for some reason passing the argument by const reference killed the perf
(in the packet version of sin, cos, exp, log), so let's pass them by
value. Also, improve the perf of ei_plog by reducing dependencies.
2009-03-25 18:33:36 +00:00
Gael Guennebaud
17860e578c add SSE2 versions of sin, cos, log, exp using code from Julien
Pommier. They are for float only, and they return exactly the same
result as the standard versions in about 90% of the cases. Otherwise the max error
is below 1e-7. However, for very large values (>1e3) the accuracy of sin and cos
slightly decreases. They are about 3 or 4 times faster than 4 calls to their respective
standard versions. So, is it ok to enable them by default in their respective functors ?
2009-03-25 12:26:13 +00:00
Konstantinos A. Margaritis
fe00e864a1 ei_pnegate implemented for AltiVec 2009-03-20 17:26:50 +00:00
Gael Guennebaud
fbf415c547 add vectorization of unary operator-() (the AltiVec version is probably
broken)
2009-03-20 10:03:24 +00:00
Gael Guennebaud
3f80c68be5 add the vectorization of abs 2009-03-09 18:40:09 +00:00
Gael Guennebaud
7718a8ed83 slight optimization of SSE base integer mul (thanks to Rohit Garg) 2009-03-08 10:14:07 +00:00
Gael Guennebaud
3288e9e168 add much faster versions of unaligned stores (and slightly faster
unaligned loads)
2009-03-03 14:01:30 +00:00
Laurent Montel
2d6d14a3d3 Add COMPONENT Devel 2009-02-23 07:50:56 +00:00
Konstantinos A. Margaritis
349557db9a no reason for 3 vec_mins, 2 are enough apparently in ei_predux_min 2009-02-12 22:03:30 +00:00
Konstantinos A. Margaritis
ad2bf14dbb modified ei_predux_min/max to actually use altivec instructions 2009-02-12 21:58:44 +00:00
Gael Guennebaud
51c991af45 * exit Sum.h, exit Prod.h, welcome vectorization of redux() !
* add vectorization for minCoeff and maxCoeff
2009-02-12 15:18:59 +00:00
Gael Guennebaud
7954f7709a add ei_predux_mul for AltiVec 2009-02-10 18:26:59 +00:00
Gael Guennebaud
cbbc6d940b * add ei_predux_mul internal function
* apply Ricard Marxer's prod() patch with fixes for the vectorized path
2009-02-10 18:06:05 +00:00
Konstantinos A. Margaritis
15e40b1099 fixed preserve_mask definition for AltiVec (needed __vector keyword) 2009-02-08 18:43:57 +00:00
Gael Guennebaud
cc90495e30 add bench_reverse, draft of a reverse vectorization for AltiVec, make
global Scaling function static
2009-02-06 13:28:55 +00:00
Gael Guennebaud
f5d96df800 Add vectorization of Reverse (was more tricky than I thought) and
simplify the index based functions
2009-02-06 12:40:38 +00:00
Gael Guennebaud
13d0a310fd fix MSVC internal compilation error 2009-01-29 22:49:24 +00:00
Benoit Jacob
9e3c73110a fix a bunch of warnings (actual issues) reported by Frank 2009-01-22 00:09:34 +00:00
Gael Guennebaud
5f6fbaa0e7 * fix a vectorization issue in Product
* use _mm_malloc/_mm_free on platforms other than linux or MSVC (e.g., cygwin, OSX)
* replace a lot of inline keywords by EIGEN_STRONG_INLINE to compensate for
  poor MSVC inlining
2008-12-19 15:38:39 +00:00
Benoit Jacob
50105c3ed6 Hopefully fix compilation of SSE Packetmath with MSVC.
The reason why we didn't realize until now that it didn't compile at all
with MSVC is that before today with MSVC the SSE2 detection didn't work.
2008-12-16 03:48:49 +00:00
Benoit Jacob
f7de12de69 Missing inline keywords in AltiVec/PacketMath were making Avogadro fail
to compile (duplicate symbols).
2008-08-27 20:06:15 +00:00
Benoit Jacob
a0cfe6ebdc remove double ; 2008-08-27 02:58:04 +00:00
Benoit Jacob
12c6b45ae5 replace vector by __vector to prevent conflict with std::vector 2008-08-26 23:25:10 +00:00