eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-10-21 20:41:06 +08:00

Author	SHA1	Message	Date
Thomas Capricelli	883219041f	better fix for gcc 4.6.0 / ptrdiff_t, as suggested by Benoit	2011-05-05 18:48:18 +02:00
Thomas Capricelli	a18a1be42d	Fix compilation with gcc-4.6.0, patch provided by Anton Gladky <gladky.anton@gmail.com>, working on debian packaging.	2011-05-05 00:44:24 +02:00
Gael Guennebaud	bb9a465c5a	fix AltiVec ploaddup	2011-02-24 00:23:50 +03:00
Gael Guennebaud	23aae0d63e	fix pset1 for complex	2011-02-23 21:24:47 +03:00
Gael Guennebaud	955c099eb5	implement ploaddup for altivec and add respective unit test	2011-02-23 18:20:55 +03:00
Gael Guennebaud	6e01780541	fix a couple of issues with pcplxflip	2011-02-23 17:51:40 +03:00
Gael Guennebaud	78e1a62c54	implement pcplxflip for altivec	2011-02-23 14:20:58 +01:00
Gael Guennebaud	32e7dae776	Altivec: fix infinite loop (ei_ -> internal:: change)	2011-02-23 09:41:02 +01:00
Gael Guennebaud	2fb5567e08	add missing AlignedOnScalar	2011-02-22 21:25:47 +01:00
Gael Guennebaud	39b27fb656	altivec compilation fix	2011-02-22 15:26:28 +01:00
Gael Guennebaud	51da67f211	more compilation fixes for altivec	2011-02-21 20:36:20 +01:00
Gael Guennebaud	05545d0197	fix compilation	2011-02-21 17:47:31 +01:00
Jitse Niesen	e2d46eac42	Remove all references to EIGEN_TUNE_CPU_CACHE_SIZE. This macro is no longer used as of revision 0212eec23f4cb64e8426bf32568156df302f8fcf .	2011-02-04 22:33:53 +01:00
Benoit Jacob	4716040703	bug #86 : use internal:: namespace instead of ei_ prefix	2010-10-25 10:15:22 -04:00
Gael Guennebaud	ff96c94043	mixing types in product step 2: * pload* and pset1 are now templated on the packet type * gemv routines are now embeded into a structure with a consistent API with respect to gemm * some configurations of vector * matrix and matrix * matrix works fine, some need more work...	2010-07-11 15:48:30 +02:00
Gael Guennebaud	4161b8be67	sync	2010-07-10 22:58:51 +02:00
Konstantinos Margaritis	6ad3f1ab1f	Added NEON/Complex.h, ~3.5x faster than scalar std::complex<float> minor fix in AltiVec Complex.h	2010-07-10 00:09:29 +03:00
Konstantinos Margaritis	642cc27eb1	forgot to commit ei_p4f_FORWARD;	2010-07-09 18:08:18 +03:00
Konstantinos Margaritis	d9e134c73c	Altivec port of Complex.h. Note: For some reason g++ 4.4 is >200% slower than g++ 4.3 on altivec code. The same benchmark (bench_gemm) was tested, on the same hardware/OS (G4/Debian testing), with same CFLAGS. With some code reorganizing I managed to get some minor gain on 4.4, but I just could not reach 4.3 speed. This is most likely a bug, but I'm waiting to see if it's fixed on 4.5. I'll look into this a bit more.	2010-07-09 17:54:41 +03:00
Gael Guennebaud	300a226ffa	scalars fitting in a single packet requires more work, step 1 * add a, Alignable trait * update LinearVectorization assignment	2010-07-08 14:27:47 +02:00
Gael Guennebaud	b0896382a3	s/IsVectorized/Vectorizable	2010-07-07 11:10:46 +02:00
Gael Guennebaud	bfa606d16f	* add a IsVectorized mechanism (instead of packet-size>1...) * vectorize complex<double>	2010-07-06 23:36:00 +02:00
Konstantinos Margaritis	cf3616b2c0	AltiVec signed integer pmadd removed, proved to be 2x slower than the scalar trait(!).	2010-06-28 21:24:55 +03:00
Gael Guennebaud	88cd6885be	Add a proof concept API to configure the blocking parameters at runtime. After validation of the final API I'll update the other products to use it.	2010-06-07 16:35:25 +02:00
Konstantinos Margaritis	9337f371d2	(proper commit this time) replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function. Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h. Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch(). NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.	2010-04-24 00:58:44 +03:00
Konstantinos Margaritis	5acf46bd12	Backed out changeset 6972c140f737874d88da0e225c7c27b4563a4518	2010-04-24 00:57:10 +03:00
oem	6972c140f7	replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function. Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h. Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch(). NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.	2010-04-24 00:44:14 +03:00
Gael Guennebaud	afd7ee759b	fix copy pasted comment	2010-03-05 21:35:11 +01:00
Konstantinos Margaritis	273b236f72	Altivec brought up to date. Most tests pass and performance is better than before too!	2010-03-05 22:28:49 +02:00
Konstantinos Margaritis	112c550b4a	Added initial NEON support, most tests pass however we had to use some hackish workarounds as gcc on ARM (both CodeSourcery 4.4.1 used and experimental 4.5) fail to ensure proper alignment with __attribute__((aligned(16))). This has to be fixed upstream to remove the workarounds.	2010-03-03 11:25:41 -06:00
Benoit Jacob	d41577819b	we were already aligning to 16 byte boundary fixed-size objects that are multiple of 16 bytes; now we also align to 8byte boundary fixed-size objects that are multiple of 8 bytes. That's only useful for now for double, not e.g. for Vector2f, but that didn't seem to hurt. Am I missing something? Do you prefer that we don't align Vector2f at all? Also, improvements in test_unalignedassert.	2009-10-05 10:11:11 -04:00
Benoit Jacob	6347b1db5b	remove sentence "Eigen itself is part of the KDE project." it never made very precise sense. but now does it still make any?	2009-05-22 20:25:33 +02:00
Gael Guennebaud	17860e578c	add SSE2 versions of sin, cos, log, exp using code from Julien Pommier. They are for float only, and they return exactly the same result as the standard versions in about 90% of the cases. Otherwise the max error is below 1e-7. However, for very large values (>1e3) the accuracy of sin and cos slighlty decrease. They are about 3 or 4 times faster than 4 calls to their respective standard versions. So, is it ok to enable them by default in their respective functors ?	2009-03-25 12:26:13 +00:00
Konstantinos A. Margaritis	fe00e864a1	ei_pnegate implemented for AltiVec	2009-03-20 17:26:50 +00:00
Gael Guennebaud	fbf415c547	add vectorization of unary operator-() (the AltiVec version is probably broken)	2009-03-20 10:03:24 +00:00
Gael Guennebaud	3f80c68be5	add the vectorization of abs	2009-03-09 18:40:09 +00:00
Laurent Montel	2d6d14a3d3	Add COMPONENT Devel	2009-02-23 07:50:56 +00:00
Konstantinos A. Margaritis	349557db9a	no reason for 3 vec_mins, 2 are enough apparently in ei_predux_min	2009-02-12 22:03:30 +00:00
Konstantinos A. Margaritis	ad2bf14dbb	modified ei_predux_min/max to actually use altivec instructions	2009-02-12 21:58:44 +00:00
Gael Guennebaud	51c991af45	* exit Sum.h, exit Prod.h, welcome vectorization of redux() ! * add vectorization for minCoeff and maxCoeff	2009-02-12 15:18:59 +00:00
Gael Guennebaud	7954f7709a	add ei_predux_mul for AltiVec	2009-02-10 18:26:59 +00:00
Konstantinos A. Margaritis	15e40b1099	fixed preserve_mask definition for AltiVec (needed __vector keyword)	2009-02-08 18:43:57 +00:00
Gael Guennebaud	cc90495e30	add bench_reverse, draft of a reverse vectorization for AltiVec, make global Scaling function static	2009-02-06 13:28:55 +00:00
Benoit Jacob	f7de12de69	Missing inline keywords in AltiVec/PacketMath were making Avogadro fail to compile (duplicate symbols).	2008-08-27 20:06:15 +00:00
Benoit Jacob	a0cfe6ebdc	remove double ;	2008-08-27 02:58:04 +00:00
Benoit Jacob	12c6b45ae5	replace vector by __vector to prevent conflict with std::vector	2008-08-26 23:25:10 +00:00
Gael Guennebaud	8f9d30cb20	* patch from Konstantinos Margaritis: bugfix in Altivec version of ei_pdiv and various cleaning in Altivec code. Altivec vectorization have been re-enabled in CoreDeclaration * added copy constructors in non empty functors because I observed weird behavior with std::complex<>	2008-08-25 16:22:56 +00:00
Gael Guennebaud	a95c1e190b	patch from Konstantinos Margaritis: Altivec vectorization is resurrected !	2008-08-22 13:19:35 +00:00
Benoit Jacob	54137f1ca7	* fix bug found by Boudewijn Rempt: no CMakeLists in arch/ subdir * fix warning in SolveTriangular	2008-08-19 13:15:13 +00:00
Benoit Jacob	e27b2b95cf	* rework Map, allow vectorization * rework PacketMath and DummyPacketMath, make these actual template specializations instead of just overriding by non-template inline functions * introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix * remove Matrix::map() methods, use Map constructors instead.	2008-06-27 01:22:35 +00:00

53 Commits