eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-12 08:01:49 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	772e59d475	bug #1360 : fix sign issue with pmull on altivec (grafted from 8c0e70150433e8fe50c980ff629a9f80162eaf92 )	2016-12-18 22:13:19 +00:00
Konstantinos Margaritis	9f7caa7e7d	minor fixes for big endian altivec/vsx	2016-07-10 07:05:10 -03:00
Konstantinos Margaritis	be107e387b	fix compilation with clang 3.9, fix performance with pset1, use vector operators instead of intrinsics in some cases	2016-06-23 10:19:05 -03:00
Konstantinos Margaritis	b410d46482	mostly cleanups and modernizing code	2016-06-19 16:12:52 -03:00
Konstantinos Margaritis	8ed26120c8	bring Altivec/VSX to a better state, implement some of the missing functions	2016-04-28 14:32:42 -03:00
Doug Kwan	5c9ee73eb9	Implement plog and pexp for AltiVec.	2015-07-30 11:12:42 -07:00
Gael Guennebaud	6245591349	Fix prototype of plset and generalize linspace functor.	2015-08-07 19:27:59 +02:00
Gael Guennebaud	ce57dbd937	Let unpacket_traits<> expose the required alignment and make use of it everywhere	2015-08-07 10:44:01 +02:00
Gael Guennebaud	45cbb0bbb1	The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index	2015-02-16 15:05:41 +01:00
Benoit Jacob	0f21613698	bug #936 , patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with EIGEN_HAS_SINGLE_INSTRUCTION_MADD	2015-01-30 17:44:26 -05:00
Benoit Jacob	340b8afb14	bug #936 , patch 1.5/3: rename _FUSED_ macros to _SINGLE_INSTRUCTION_, because this is what they are about. "Fused" means "no intermediate rounding between the mul and the add, only one rounding at the end". Instead, what we are concerned about here is whether a temporary register is needed, i.e. whether the MUL and ADD are separate instructions. Concretely, on ARM NEON, a single-instruction mul-add is always available: VMLA. But a true fused mul-add is only available on VFPv4: VFMA.	2015-01-31 14:15:57 -05:00
Benoit Jacob	9f99f61e69	bug #936 , patch 1/3: some cleanup and renaming for consistency.	2015-01-30 17:43:56 -05:00
Konstantinos Margaritis	9d3c69952b	fixed to make big-endian VSX work as well	2014-10-01 09:43:56 +00:00
Konstantinos Margaritis	de38ff2499	prefetch are noops on VSX, actually disable the prefetch trait	2014-09-21 11:56:07 +00:00
Konstantinos Margaritis	56408504e4	fix compile error on big endian altivec	2014-09-21 13:59:30 +03:00
Konstantinos Margaritis	974fe38ca3	prefetch are noops on VSX	2014-09-21 11:24:30 +00:00
Konstantinos Margaritis	c0205ca4af	VSX supports vec_div, implement where appropriate (float/doubles)	2014-09-21 08:12:22 +00:00
Konstantinos Margaritis	10f8aabb61	VSX port passes packetmath_[1-5] tests!	2014-09-20 22:31:31 +00:00
Konstantinos Margaritis	60663a510a	32-bit floats/ints, 64-bit doubles pass packetmath tests, complex 32/64-bit remaining	2014-09-19 21:05:01 +00:00
Konstantinos Margaritis	470aa15c35	First time it compiles, but fails to pass the tests.	2014-09-09 16:58:48 +00:00
Konstantinos Margaritis	7ff266e3ce	Initial VSX commit	2014-08-29 20:03:49 +00:00
Konstantinos Margaritis	0a945687b7	Added HasDiv=1 to Altivec PacketMath.h, now vectorization_logic test passes. Added comments to the constants, indicative of the actual values	2014-07-15 11:02:51 +00:00
Gael Guennebaud	b47ef1431f	Fix many long to int implicit conversions	2014-07-08 16:47:11 +02:00
Gael Guennebaud	2dbfd83424	Implement pbroadcast4 on altivec	2014-04-25 02:46:57 -07:00
Gael Guennebaud	3d8d0f6269	Enable vectorization of pack_rhs with a column-major RHS. Rename and generalize Kernel<> to PacketBlock<,N>.	2014-04-25 10:56:18 +02:00
Gael Guennebaud	b0e19db1cf	Enable fused madd for Altivec	2014-04-24 23:17:18 +02:00
Gael Guennebaud	8d85ce88e1	Implement ptranspose on altivec and fix pgather/pscatter	2014-04-24 05:47:53 -07:00
Gael Guennebaud	82b09fcb91	Add Altivec implementation of pgather/pscatter (not tested)	2014-04-23 13:09:26 +02:00
Gael Guennebaud	d5a795f673	New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speedup on Haswell. This changeset also introduces new vector functions: ploadquad and predux4.	2014-04-16 17:05:11 +02:00
Gael Guennebaud	10aa14592a	Add a mechanism to recursively access to half-size packet types	2014-03-28 10:18:04 +01:00
Gael Guennebaud	4612a1cd87	Fix ploaddup and lin-spaced with AltiVec.	2013-09-10 16:13:59 +02:00
Gael Guennebaud	b3adc4face	Add missing pconj specializations	2013-05-17 17:25:29 +02:00
Benoit Jacob	69124cfca2	Automatic relicensing to MPL2 using Keirs script. Manual fixup follows.	2012-07-13 14:42:47 -04:00
Jitse Niesen	3c412183b2	Get rid of include directives inside namespace blocks (bug #339 ).	2012-04-15 11:06:28 +01:00
Gael Guennebaud	9c86ee2695	fix static inline versus inline static issues (the former is the correct order)	2012-01-31 12:58:52 +01:00
Thomas Capricelli	883219041f	better fix for gcc 4.6.0 / ptrdiff_t, as suggested by Benoit	2011-05-05 18:48:18 +02:00
Thomas Capricelli	a18a1be42d	Fix compilation with gcc-4.6.0, patch provided by Anton Gladky <gladky.anton@gmail.com>, working on debian packaging.	2011-05-05 00:44:24 +02:00
Gael Guennebaud	bb9a465c5a	fix AltiVec ploaddup	2011-02-24 00:23:50 +03:00
Gael Guennebaud	955c099eb5	implement ploaddup for altivec and add respective unit test	2011-02-23 18:20:55 +03:00
Jitse Niesen	e2d46eac42	Remove all references to EIGEN_TUNE_CPU_CACHE_SIZE. This macro is no longer used as of revision 0212eec23f4cb64e8426bf32568156df302f8fcf .	2011-02-04 22:33:53 +01:00
Benoit Jacob	4716040703	bug #86 : use internal:: namespace instead of ei_ prefix	2010-10-25 10:15:22 -04:00
Gael Guennebaud	ff96c94043	mixing types in product step 2: * pload* and pset1 are now templated on the packet type * gemv routines are now embedded into a structure with a consistent API with respect to gemm * some configurations of vector * matrix and matrix * matrix work fine, some need more work...	2010-07-11 15:48:30 +02:00
Gael Guennebaud	4161b8be67	sync	2010-07-10 22:58:51 +02:00
Konstantinos Margaritis	642cc27eb1	forgot to commit ei_p4f_FORWARD;	2010-07-09 18:08:18 +03:00
Gael Guennebaud	300a226ffa	scalars fitting in a single packet require more work, step 1 * add an Alignable trait * update LinearVectorization assignment	2010-07-08 14:27:47 +02:00
Gael Guennebaud	b0896382a3	s/IsVectorized/Vectorizable	2010-07-07 11:10:46 +02:00
Gael Guennebaud	bfa606d16f	* add a IsVectorized mechanism (instead of packet-size>1...) * vectorize complex<double>	2010-07-06 23:36:00 +02:00
Konstantinos Margaritis	cf3616b2c0	AltiVec signed integer pmadd removed, proved to be 2x slower than the scalar trait(!).	2010-06-28 21:24:55 +03:00
Gael Guennebaud	88cd6885be	Add a proof-of-concept API to configure the blocking parameters at runtime. After validation of the final API I'll update the other products to use it.	2010-06-07 16:35:25 +02:00
Konstantinos Margaritis	9337f371d2	(proper commit this time) replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function. Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h. Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch(). NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.	2010-04-24 00:58:44 +03:00

76 Commits