19 Commits

Author SHA1 Message Date
Gael Guennebaud
c5c8efa575 workaround gcc 4.2 and 4.3 compilation issue with NEON 2011-02-07 16:41:21 +01:00
Jitse Niesen
e2d46eac42 Remove all references to EIGEN_TUNE_CPU_CACHE_SIZE.
This macro is no longer used as of revision 0212eec23f4cb64e8426bf32568156df302f8fcf.
2011-02-04 22:33:53 +01:00
Konstantinos Margaritis
e05c79cbd8 Fixed NEON compilation errors, changed float-abi back to softfp (which is the most used right now).
Some complex tests appear to segfault, needs a more careful look.
2010-12-10 20:27:46 +02:00
Benoit Jacob
4716040703 bug #86 : use internal:: namespace instead of ei_ prefix 2010-10-25 10:15:22 -04:00
Gael Guennebaud
ced1a45f82 add NEON ploaddup and pcplxflip functions 2010-07-20 14:24:01 +02:00
Gael Guennebaud
ff96c94043 mixing types in product step 2:
* pload* and pset1 are now templated on the packet type
* gemv routines are now embedded into a structure with
  a consistent API with respect to gemm
* some configurations of vector * matrix and matrix * matrix work fine,
  some need more work...
2010-07-11 15:48:30 +02:00
Gael Guennebaud
4161b8be67 sync 2010-07-10 22:58:51 +02:00
Konstantinos Margaritis
6ad3f1ab1f Added NEON/Complex.h, ~3.5x faster than scalar std::complex<float>
minor fix in AltiVec Complex.h
2010-07-10 00:09:29 +03:00
Gael Guennebaud
300a226ffa scalars fitting in a single packet requires more work, step 1
* add an Alignable trait
* update LinearVectorization assignment
2010-07-08 14:27:47 +02:00
Gael Guennebaud
b0896382a3 s/IsVectorized/Vectorizable 2010-07-07 11:10:46 +02:00
Gael Guennebaud
bfa606d16f * add an IsVectorized mechanism (instead of packet-size>1...)
* vectorize complex<double>
2010-07-06 23:36:00 +02:00
Gael Guennebaud
28e64b0da3 email change 2010-06-24 23:21:58 +02:00
Gael Guennebaud
88cd6885be Add a proof-of-concept API to configure the blocking parameters at runtime.
After validation of the final API I'll update the other products to use it.
2010-06-07 16:35:25 +02:00
Konstantinos Margaritis
9337f371d2 (proper commit this time)
replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function.
Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h.
Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch().
NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
2010-04-24 00:58:44 +03:00
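For readers unfamiliar with the pattern described in the commit above, the following is a minimal sketch of a portable prefetch wrapper. It is not the code from the commit: the names `prefetch_hint` and `sum_with_prefetch` and the look-ahead distance of 16 elements are hypothetical, the MSVC branch assumes an x86 target where `_mm_prefetch` is available, and the NEON- and AltiVec-specific versions mentioned in the message are not reproduced here.

```cpp
#include <cstddef>

#if defined(_MSC_VER)
#include <xmmintrin.h>  // _mm_prefetch, _MM_HINT_T0 (assumes an x86 MSVC target)
#endif

// Hypothetical wrapper: hides the compiler-specific prefetch call behind
// one inline function, in the spirit of the ei_prefetch() described above.
template <typename T>
inline void prefetch_hint(const T* addr) {
#if defined(_MSC_VER)
  _mm_prefetch(reinterpret_cast<const char*>(addr), _MM_HINT_T0);
#else
  __builtin_prefetch(addr);  // GCC/Clang builtin; only a hint, never faults
#endif
}

// Illustrative use: request data a few elements ahead while summing.
// The look-ahead distance (16) is an arbitrary assumption, not a tuned value.
inline float sum_with_prefetch(const float* data, std::size_t n) {
  float acc = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    if (i + 16 < n) {
      prefetch_hint(data + i + 16);
    }
    acc += data[i];
  }
  return acc;
}
```

A prefetch is only a hint to the hardware, so the wrapper never changes results; any speedup, such as the one reported in the message, depends on the target and the access pattern.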
Konstantinos Margaritis
5acf46bd12 Backed out changeset 6972c140f737874d88da0e225c7c27b4563a4518 2010-04-24 00:57:10 +03:00
oem
6972c140f7 replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function.
Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h.
Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch().
NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
2010-04-24 00:44:14 +03:00
Gael Guennebaud
ea8cad5151 make the number of registers easier to configure per architectures 2010-03-04 18:58:12 +01:00
Gael Guennebaud
8ed1ef4469 add a minor FIXME 2010-03-04 18:30:28 +01:00
Konstantinos Margaritis
112c550b4a Added initial NEON support, most tests pass however we had to use some hackish workarounds
as gcc on ARM (both the CodeSourcery 4.4.1 used and the experimental 4.5) fails to
ensure proper alignment with __attribute__((aligned(16))). This has to be
fixed upstream to remove the workarounds.
2010-03-03 11:25:41 -06:00