eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-05-05 02:04:07 +08:00

Author	SHA1	Message	Date
Benoit Steiner	6811e6cf49	Merged in srvasude/eigen/fix_cuda_exp (pull request PR-268) Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).	2016-12-08 05:14:11 -08:00
Angelos Mantzaflaris	7694684992	Remove superfluous const's (can cause warnings on some Intel compilers) (grafted from e236d3443c79f38aa721d95e64c275abbb5df10f )	2016-12-07 00:37:48 +01:00
Gael Guennebaud	f2f9df8aa5	Remove MSVC warning 4127 - conditional expression is constant from the disabled list as we now have a local workaround.	2016-12-20 22:53:19 +01:00
Gael Guennebaud	2b3fc981b8	bug #1362 : workaround constant conditional warning produced by MSVC	2016-12-20 22:52:27 +01:00
Gael Guennebaud	94e8d8902f	Fix bug #1367 : compilation fix for gcc 4.1!	2016-12-20 22:17:01 +01:00
Gael Guennebaud	684cfc762d	Add transpose, adjoint, conjugate methods to SelfAdjointView (useful to write generic code)	2016-12-20 16:33:53 +01:00
Gael Guennebaud	11f55b2979	Optimize storage layout of Cwise* and PlainObjectBase evaluator to remove the functor or outer-stride if they are empty. For instance, sizeof("(A-B).cwiseAbs2()") with A,B Vector4f is now 16 bytes, instead of 48 before this optimization. In theory, evaluators should be completely optimized away by the compiler, but this might help in some cases.	2016-12-20 15:55:40 +01:00
Gael Guennebaud	5271474b15	Remove common "noncopyable" base class from evaluator_base to get a chance to get EBO (Empty Base Optimization). Note: we should probably get rid of this class and define a macro instead.	2016-12-20 15:51:30 +01:00
Gael Guennebaud	316673bbde	Clean-up usage of ExpressionTraits in all/any implementation.	2016-12-20 14:38:05 +01:00
Christoph Hertzberg	10c6bcdc2e	Add support for long indexes and for (real-valued) row-major matrices to CholmodSupport module	2016-12-19 14:07:42 +01:00
Gael Guennebaud	f5d644b415	Make sure that HyperPlane::transform maintains a unit normal vector in the Affine case.	2016-12-20 09:35:00 +01:00
Benoit Steiner	923acadfac	Fixed compilation errors with gcc6 when compiling the AVX512 intrinsics	2016-12-19 13:02:27 -08:00
Benoit Jacob	751e097c57	Use 32 registers on ARM64	2016-12-19 13:44:46 -05:00
Gael Guennebaud	eb621413c1	Revert vec/y to vec*(1/y) in row-major TRSM: - div is extremely costly - this is consistent with the column-major case - this is consistent with all other BLAS implementations	2016-12-06 15:04:50 +01:00
Gael Guennebaud	8365c2c941	Fix BLAS backend for symmetric rank K updates.	2016-12-06 14:47:09 +01:00
Srinivas Vasudevan	e6c8b5500c	Change comparisons to use Scalar instead of RealScalar.	2016-12-05 14:01:45 -08:00
Srinivas Vasudevan	f7d7c33a28	Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).	2016-12-05 12:19:01 -08:00
Srinivas Vasudevan	09ee7f0c80	Fix small nit where I changed name of plog1p to pexpm1.	2016-12-02 15:30:12 -08:00
Srinivas Vasudevan	a0d3ac760f	Sync from Head.	2016-12-02 14:14:45 -08:00
Srinivas Vasudevan	218764ee1f	Added support for expm1 in Eigen.	2016-12-02 14:13:01 -08:00
Gael Guennebaud	66f65ccc36	Ease compiler job to generate clean and efficient code in mat*vec.	2016-12-02 22:41:26 +01:00
Gael Guennebaud	fe696022ec	Operators += and -= do not resize!	2016-12-02 22:40:25 +01:00
Angelos Mantzaflaris	18de92329e	use numext::abs (grafted from 0a08d4c60b652d1f24b2fa062c818c4b93890c59 )	2016-12-02 11:48:06 +01:00
Angelos Mantzaflaris	e8a6aa518e	1. Add explicit template to abs2 (resolves deduction for some arithmetic types) 2. Avoid signed-unsigned conversion in comparison (warning in case Scalar is unsigned) (grafted from 4086187e49760d4bde72750dfa20ae9451263417 )	2016-12-02 11:39:18 +01:00
Gael Guennebaud	a6b971e291	Fix memory leak in Ref<Sparse>	2016-12-05 16:59:30 +01:00
Gael Guennebaud	8640ffac65	Optimize SparseLU::solve for rhs vectors	2016-12-05 15:41:14 +01:00
Gael Guennebaud	62acd67903	remove temporary in SparseLU::solve	2016-12-05 15:11:57 +01:00
Gael Guennebaud	0db6d5b3f4	bug #1356 : fix calls to evaluator::coeffRef(0,0) to get the address of the destination by adding a dstDataPtr() member to the kernel. This fixes undefined behavior if dst is empty (nullptr).	2016-12-05 15:08:09 +01:00
Gael Guennebaud	91003f3b86	typo	2016-12-05 13:51:07 +01:00
Gael Guennebaud	e3f613cbd4	Improve performance of row-major-dense-matrix * vector products for recent CPUs. This revised version does not bother about aligned loads/stores, and rather processes 8 rows at once for better instruction pipelining.	2016-12-05 13:02:01 +01:00
Gael Guennebaud	3abc827354	Clean debugging code	2016-12-05 12:59:32 +01:00
Benoit Steiner	462c28e77a	Merged in srvasude/eigen (pull request PR-265) Add Expm1 support to Eigen.	2016-12-05 02:31:11 +00:00
Gael Guennebaud	6a5fe86098	Complete rewrite of column-major-matrix * vector product to deliver higher performance on modern CPUs. The previous code had been optimized for Intel core2 for which unaligned loads/stores were prohibitively expensive. This new version exhibits much higher instruction independence (better pipelining) and explicitly leverages FMA. According to my benchmark, on Haswell this new kernel is always faster than the previous one, and sometimes even twice as fast. Even higher performance could be achieved with a better blocking size heuristic and, perhaps, with explicit prefetching. We should also check triangular product/solve to optimally exploit this new kernel (working on vertical panel of 4 columns is probably not optimal anymore).	2016-12-03 21:14:14 +01:00
Christoph Hertzberg	22f7d398e2	bug #1355 : Fixed wrong line-endings on two files	2016-12-02 11:22:05 +01:00
Gael Guennebaud	27873008d4	Clean up SparseCore module regarding ReverseInnerIterator	2016-12-01 21:55:10 +01:00
Angelos Mantzaflaris	8c24723a09	typo UIntPtr (grafted from b6f04a2dd4d68fe1858524709813a5df5b9a085b )	2016-12-01 21:25:58 +01:00
Angelos Mantzaflaris	aeba0d8655	fix two warnings(unused typedef, unused variable) and a typo (grafted from a9aa3bcf50d55b63c8adb493a06c903ec34251c6 )	2016-12-01 21:23:43 +01:00
Gael Guennebaud	181138a1cb	fix member order	2016-12-01 17:06:20 +01:00
Gael Guennebaud	9f297d57ae	Merged in rmlarsen/eigen (pull request PR-256) Add a default constructor for the "fake" __half class when not using the __half class provided by CUDA.	2016-12-01 15:27:33 +00:00
Benoit Steiner	7ff26ddcbb	Merged eigen/eigen into default	2016-12-01 07:13:17 -08:00
Gael Guennebaud	037b46762d	Fix misleading-indentation warnings.	2016-12-01 16:05:42 +01:00
Mehdi Goli	79aa2b784e	Adding sycl backend for TensorPadding.h; disabling __unit128 for sycl in TensorIntDiv.h; disabling cashsize for sycl in tensorDeviceDefault.h; adding sycl backend for StrideSliceOP; removing sycl compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the sycl backend code.	2016-12-01 13:02:27 +00:00
Benoit Steiner	fd1dc3363e	Merged eigen/eigen into default	2016-11-30 20:16:17 -08:00
Gael Guennebaud	8df272af88	Fix selection of product implementation for dynamic size matrices with fixed max size.	2016-11-30 22:21:33 +01:00
Gael Guennebaud	c927af60ed	Fix a performance regression in (matmat)vec for which mat*mat was evaluated multiple times.	2016-11-30 17:59:13 +01:00
Gael Guennebaud	ab4ef5e66e	bug #1351 : fix compilation of random with old compilers	2016-11-30 17:37:53 +01:00
Rasmus Munk Larsen	a0329f64fb	Add a default constructor for the "fake" __half class when not using the __half class provided by CUDA.	2016-11-29 13:18:09 -08:00
Benoit Steiner	9f8fbd9434	Merged eigen/eigen into default	2016-11-26 11:28:25 -08:00
Mehdi Goli	7318daf887	Fixing LLVM error on TensorMorphingSycl.h on GPU; fixing int64_t crash for tensor_broadcast_sycl on GPU; adding get_sycl_supported_devices() on syclDevice.h.	2016-11-25 16:19:07 +00:00
Benoit Steiner	3be1afca11	Disabled the "remove the call to 'std::abs' since unsigned values cannot be negative" warning introduced in clang 3.5	2016-11-23 18:49:51 -08:00

5136 Commits