Gael Guennebaud 01fd880424 Revert previous change and introduce a new workaround regarding gcc generating a shufps instruction instead of the more efficient pshufd instruction.
The trick is to introduce a new pload1 function, to be used in low-level product kernels to which bug #203 does not apply.
Indeed, it turned out that using inline assembly prevents gcc from doing a good job at instruction reordering.
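
To make the idea concrete, here is a minimal sketch of what a broadcast-load helper of this kind might look like with plain SSE intrinsics. It is not the code from this commit: the names `pload1_sketch`, `broadcast_shufps`, `broadcast_pshufd` and `self_check` are hypothetical, and the actual Eigen implementation may differ. The sketch assumes an x86 target with SSE2. It shows the two shuffle forms the commit message refers to, plus a dedicated load that leaves the instruction choice and scheduling to the compiler instead of pinning it with inline assembly.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE ones)

// Broadcast lane 0 with a float shuffle; compilers typically emit shufps here.
static inline __m128 broadcast_shufps(__m128 v) {
  return _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0));
}

// Broadcast lane 0 with an integer shuffle; compilers typically emit pshufd here.
static inline __m128 broadcast_pshufd(__m128 v) {
  return _mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128(v), 0));
}

// Hypothetical pload1-style helper: load one float and replicate it
// into all four lanes, with no inline assembly involved.
static inline __m128 pload1_sketch(const float* from) {
  return _mm_load_ps1(from);
}

// Returns true if all four lanes of v equal the expected value.
static inline bool all_lanes_equal(__m128 v, float expected) {
  float out[4];
  _mm_storeu_ps(out, v);
  return out[0] == expected && out[1] == expected &&
         out[2] == expected && out[3] == expected;
}

// Small self-check: the three routes produce the same broadcast vector.
static inline void self_check() {
  const float x = 3.5f;
  const __m128 v = _mm_load_ss(&x);  // x in lane 0, zeros elsewhere
  assert(all_lanes_equal(broadcast_shufps(v), x));
  assert(all_lanes_equal(broadcast_pshufd(v), x));
  assert(all_lanes_equal(pload1_sketch(&x), x));
}
```

Which instruction a given compiler actually emits for each helper depends on its version and flags, so inspecting the generated assembly is the only reliable way to confirm the effect described above.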
2014-03-20 16:03:46 +01:00
MPL-2.0 147 MiB
Languages
C++ 85.1%
Fortran 8.5%
C 2.8%
CMake 1.9%
Cuda 1.2%
Other 0.4%