Gael Guennebaud 3e4a68cc60 optimize vectorized reductions by peeling the loop:
- x2 speedup for squaredNorm() on double
- peeling the loop with a peeling factor of 4 leads to even better performance
  for large vectors (e.g., >64), but it makes it more difficult to keep good performance on smaller ones.
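
The idea in the commit message can be illustrated with a minimal scalar sketch. It is not the library's actual code: the function name `squared_norm_unrolled2` is hypothetical, and the real change applies to vectorized (packet) reductions rather than plain doubles. The sketch assumes a peeling factor of 2, meaning each loop iteration handles two elements with two independent accumulators, so the two additions do not wait on each other.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not the library's implementation):
// sum of squares with the loop body duplicated twice and two accumulators.
double squared_norm_unrolled2(const double* x, std::size_t n) {
    double acc0 = 0.0;  // accumulates elements at even indices
    double acc1 = 0.0;  // accumulates elements at odd indices

    // Largest even count not exceeding n; the main loop covers [0, even_end).
    const std::size_t even_end = n - (n % 2);
    for (std::size_t i = 0; i < even_end; i += 2) {
        acc0 += x[i] * x[i];
        acc1 += x[i + 1] * x[i + 1];
    }

    // Combine the partial sums, then handle the leftover element if n is odd.
    double sum = acc0 + acc1;
    if (even_end < n) {
        sum += x[even_end] * x[even_end];
    }
    return sum;
}
```

A factor of 4 would use four accumulators in the same way. The leftover handling then covers up to three elements, which is one reason short vectors benefit less from a larger factor.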
2011-11-12 09:19:48 +01:00
MPL-2.0 148 MiB
Languages
C++ 85.1%
Fortran 8.5%
C 2.8%
CMake 1.9%
Cuda 1.2%
Other 0.4%