Gael Guennebaud 01fd880424 Revert previous change and introduce a new workaround for gcc generating a shufps instruction instead of the more efficient pshufd instruction.
The trick consists of introducing a new pload1 function to be used in low-level product kernels to which bug #203 does not apply.
Indeed, it turned out that using inline assembly prevents gcc from doing a good job at instruction reordering.
2014-03-20 16:03:46 +01:00
..
2013-11-07 16:38:14 +01:00
2010-07-23 19:00:02 +02:00
QR
2013-12-31 18:06:28 +00:00
2011-02-22 09:31:22 -05:00
2013-07-02 14:08:12 +01:00
2013-05-29 10:15:40 +02:00
2013-05-29 10:15:40 +02:00