eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-09-13 18:03:13 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	d936ddc3d1	Fallback to lazy products for very small ones.	2014-04-16 23:15:42 +02:00
Gael Guennebaud	de8336a9bc	Enable alloca on MAC OSX	2014-04-16 23:14:58 +02:00
Jitse Niesen	ffc995c9e4	Implement evaluator<ReturnByValue>. All supported tests pass apart from Sparse and Geometry, except test in adjoint_4 that a = a.transpose() raises an assert.	2014-04-16 18:16:36 +01:00
Gael Guennebaud	d5a795f673	New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speedup on Haswell. This changeset also introduces new vector functions: ploadquad and predux4.	2014-04-16 17:05:11 +02:00
Jitse Niesen	b30706bd5c	Fix typo in Inverse.h	2014-04-15 22:51:46 +01:00
Mark Borgerding	e0dbb68c2f	Check IMKL version for compatibility with Eigen	2014-04-15 13:57:03 -04:00
Jitse Niesen	59f5f155c2	Port products with permutation matrices to evaluators.	2014-04-15 15:21:38 +01:00
Gael Guennebaud	3c66bb136b	bug #793 : detect NaN and INF in EigenSolver instead of aborting with an assert.	2014-04-14 22:00:27 +02:00
Gael Guennebaud	7098e6d976	Add isfinite overload for complexes.	2014-04-14 21:57:49 +02:00
Benoit Steiner	feaf7c7e6d	Optimized SSE unaligned loads and stores when compiling a 64bit target with a recent version of gcc (i.e., gcc 4.8).	2014-04-14 10:44:17 -07:00
Gael Guennebaud	148acf8e4f	bug #790 : fix overflow in real_2x2_jacobi_svd	2014-04-14 13:52:16 +02:00
Gael Guennebaud	0587db8bf5	bug #793 : fix overflow in EigenSolver and add respective regression unit test	2014-04-14 11:43:08 +02:00
Benoit Steiner	1b333c89c9	Updated my previous fix to avoid introducing a compilation warning on ARM platforms.	2014-04-10 17:43:13 -07:00
Benoit Steiner	a1fcf599fa	Silenced a compilation warning produced by nvcc.	2014-04-10 11:19:37 -07:00
Jitse Niesen	a91a7a1964	doc: Add references to Cholesky methods in SelfAdjointView.	2014-04-07 14:14:48 +01:00
Benoit Steiner	b446ff037e	Deleted some dead code.	2014-04-04 14:12:24 -07:00
Christoph Hertzberg	096af59799	Fix bug #784 : Assert if assigning a product to a triangularView does not match the size.	2014-04-04 17:48:37 +02:00
Benoit Steiner	8044b00a7f	bug #782 : Workaround for gcc <= 4.4 compilation error on the NEON PacketMath code.	2014-04-03 23:41:47 +02:00
Gael Guennebaud	8d0441052e	Finally, prefetching seems to help getting more stable performance	2014-03-31 10:42:19 +02:00
Gael Guennebaud	1c0728043a	Workaround alignment warnings	2014-03-30 22:43:47 +02:00
Gael Guennebaud	e497a27ddc	Optimize gebp kernel: 1 - increase peeling level along the depth dimension (+5% for large matrices, i.e., >1000) 2 - improve pipelining when dealing with latest rows of the lhs	2014-03-30 21:57:05 +02:00
Benoit Steiner	ad59ade116	Vectorized the loop peeling of the inner loop of the block-panel matrix multiplication code. This speeds up the multiplication of matrices whose size is not a multiple of the packet size.	2014-03-28 12:11:23 -07:00
Gael Guennebaud	10aa14592a	Add a mechanism to recursively access half-size packet types	2014-03-28 10:18:04 +01:00
Gael Guennebaud	8d2bb2c20d	merge with default branch	2014-03-28 09:24:18 +01:00
Gael Guennebaud	c94fde118a	Enable vectorization of gemv for PacketSize>4 through unaligned loads (still better than no vectorization)	2014-03-28 09:11:06 +01:00
Benoit Steiner	51e85c936d	Merged latest changes from parent.	2014-03-27 18:32:15 -07:00
Benoit Steiner	8a94cb3edd	Implemented the SSE version of the gather and scatter packet primitives.	2014-03-27 18:29:01 -07:00
Benoit Steiner	7f3162f707	Implemented the AVX version of the gather and scatter packet primitives.	2014-03-27 17:42:25 -07:00
Benoit Steiner	ee86679096	Introduced pscatter/pgather packet primitives. They will be used to optimize the loop peeling code of the block-panel matrix multiplication kernel.	2014-03-27 16:03:03 -07:00
Gael Guennebaud	58fe2fc2b2	enforce the use of vfmadd231ps for pmadd (gcc and clang stupidly generate the other fmadd variants plus some register moves...)	2014-03-27 23:38:50 +01:00
Benoit Steiner	729363114f	Fixed compilation error when FMA instructions are enabled.	2014-03-27 11:20:41 -07:00
Benoit Steiner	1697d7a179	Silenced "unused variable" warnings when compiling with FMA.	2014-03-27 11:00:47 -07:00
Benoit Steiner	3e1fe8e416	Vectorized the packing of a col-major matrix used as the right hand side argument in a matrix-matrix product when AVX instructions are used. No vectorization takes place when SSE instructions are used, however this doesn't seem to impact performance.	2014-03-27 10:38:41 -07:00
Benoit Steiner	b776458ccb	Vectorized the packing of a row-major matrix used as the left hand side argument in a matrix-matrix product.	2014-03-27 10:02:24 -07:00
Benoit Steiner	c4902a3d01	Implemented the AVX version of the ptranspose packet primitive.	2014-03-27 09:34:51 -07:00
Gael Guennebaud	052aedd394	Implement pcplflip, palign, predux and the likes from AVX/complexes	2014-03-27 14:47:00 +01:00
Gael Guennebaud	fb03b56647	Fix warning	2014-03-27 11:38:35 +01:00
Benoit Steiner	a419cea4a0	Created the ptranspose packet primitive that can transpose an array of N packets, where N is the number of words in each packet. This primitive will be used to complete the vectorization of the gemm_pack_lhs and gemm_pack_rhs functions. Implemented the primitive using SSE instructions.	2014-03-26 19:03:07 -07:00
Benoit Steiner	14bc4b9704	Made sure that the version of gemm_pack_rhs specialized for row major matrices is vectorized when nr == 2*PacketSize (which is the case for SSE when compiling in 64bit mode).	2014-03-26 17:35:18 -07:00
Benoit Steiner	e45a6bed45	Specialized the pload1 packet primitive for Packet8f and Packet4d in order to take advantage of the vbroadcastss and vbroadcastsd instructions whenever possible.	2014-03-26 15:58:13 -07:00
Benoit Steiner	cc73164aa8	Merged latest updates from the parent branch	2014-03-26 15:23:59 -07:00
Gael Guennebaud	f0a4c9d5ab	Update gebp kernel to process a panel of 4 columns at once for the remaining ones.	2014-03-26 23:22:36 +01:00
Gael Guennebaud	8be011e776	Remove remaining bits of the dead working buffer	2014-03-26 23:14:44 +01:00
Benoit Steiner	a078f442a3	Vectorized the multiplication and division of complex numbers using AVX instructions.	2014-03-26 15:11:18 -07:00
Benoit Steiner	cf1a7bfbe1	Used AVX instructions to vectorize the complex version of the pfirst and ploaddup packet primitives. Silenced a few compilation warnings.	2014-03-26 12:03:31 -07:00
Gael Guennebaud	bc401eb6fa	Implement new 1 packet x 8 gebp kernel	2014-03-26 18:53:00 +01:00
Gael Guennebaud	b286a1e75c	add pbroadcast2/4 generic intrinsics	2014-03-26 16:46:36 +01:00
Benoit Steiner	6bf3cc2732	Use AVX instructions to vectorize pset1<Packet2cd>, pset1<Packet4cf>, preverse<Packet2cd>, and preverse<Packet4cf>	2014-03-25 09:00:43 -07:00
Benoit Steiner	7ae9b0805d	Used AVX instructions to vectorize the predux_min<Packet8f>, predux_min<Packet4d>, predux_max<Packet8f>, and predux_max<Packet4d> packet primitives.	2014-03-24 13:33:40 -07:00
Benoit Steiner	72707a8664	Made sure that EIGEN_ALIGN is defined when EIGEN_DONT_VECTORIZE is set to true to prevent build failures when vectorization is disabled.	2014-03-21 11:40:29 -07:00

7291 Commits