eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-10-20 03:51:06 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	fe696022ec	Operators += and -= do not resize!	2016-12-02 22:40:25 +01:00
Angelos Mantzaflaris	18de92329e	use numext::abs (grafted from 0a08d4c60b652d1f24b2fa062c818c4b93890c59 )	2016-12-02 11:48:06 +01:00
Angelos Mantzaflaris	e8a6aa518e	1. Add explicit template to abs2 (resolves deduction for some arithmetic types) 2. Avoid signed-unsigned conversion in comparison (warning in case Scalar is unsigned) (grafted from 4086187e49760d4bde72750dfa20ae9451263417 )	2016-12-02 11:39:18 +01:00
Gael Guennebaud	0db6d5b3f4	bug #1356 : fix calls to evaluator::coeffRef(0,0) to get the address of the destination by adding a dstDataPtr() member to the kernel. This fixes undefined behavior if dst is empty (nullptr).	2016-12-05 15:08:09 +01:00
Gael Guennebaud	91003f3b86	typo	2016-12-05 13:51:07 +01:00
Gael Guennebaud	e3f613cbd4	Improve performance of row-major-dense-matrix * vector products for recent CPUs. This revised version does not bother about aligned loads/stores, and rather processes 8 rows at once for better instruction pipelining.	2016-12-05 13:02:01 +01:00
Gael Guennebaud	3abc827354	Clean debugging code	2016-12-05 12:59:32 +01:00
Benoit Steiner	462c28e77a	Merged in srvasude/eigen (pull request PR-265) Add Expm1 support to Eigen.	2016-12-05 02:31:11 +00:00
Gael Guennebaud	6a5fe86098	Complete rewrite of column-major-matrix * vector product to deliver higher performance on modern CPUs. The previous code was optimized for Intel Core2, for which unaligned loads/stores were prohibitively expensive. This new version exhibits much higher instruction independence (better pipelining) and explicitly leverages FMA. According to my benchmark, on Haswell this new kernel is always faster than the previous one, and sometimes even twice as fast. Even higher performance could be achieved with a better blocking size heuristic and, perhaps, with explicit prefetching. We should also check triangular product/solve to optimally exploit this new kernel (working on a vertical panel of 4 columns is probably not optimal anymore).	2016-12-03 21:14:14 +01:00
Angelos Mantzaflaris	8c24723a09	typo UIntPtr (grafted from b6f04a2dd4d68fe1858524709813a5df5b9a085b )	2016-12-01 21:25:58 +01:00
Angelos Mantzaflaris	aeba0d8655	fix two warnings(unused typedef, unused variable) and a typo (grafted from a9aa3bcf50d55b63c8adb493a06c903ec34251c6 )	2016-12-01 21:23:43 +01:00
Gael Guennebaud	9f297d57ae	Merged in rmlarsen/eigen (pull request PR-256) Add a default constructor for the "fake" __half class when not using the __half class provided by CUDA.	2016-12-01 15:27:33 +00:00
Benoit Steiner	7ff26ddcbb	Merged eigen/eigen into default	2016-12-01 07:13:17 -08:00
Gael Guennebaud	037b46762d	Fix misleading-indentation warnings.	2016-12-01 16:05:42 +01:00
Mehdi Goli	79aa2b784e	Adding sycl backend for TensorPadding.h; disabling __uint128 for sycl in TensorIntDiv.h; disabling cache size for sycl in tensorDeviceDefault.h; adding sycl backend for StrideSliceOP; removing sycl compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the sycl backend code.	2016-12-01 13:02:27 +00:00
Benoit Steiner	fd1dc3363e	Merged eigen/eigen into default	2016-11-30 20:16:17 -08:00
Gael Guennebaud	8df272af88	Fix selection of product implementation for dynamic size matrices with fixed max size.	2016-11-30 22:21:33 +01:00
Gael Guennebaud	c927af60ed	Fix a performance regression in (mat*mat)*vec for which mat*mat was evaluated multiple times.	2016-11-30 17:59:13 +01:00
Gael Guennebaud	ab4ef5e66e	bug #1351 : fix compilation of random with old compilers	2016-11-30 17:37:53 +01:00
Rasmus Munk Larsen	a0329f64fb	Add a default constructor for the "fake" __half class when not using the __half class provided by CUDA.	2016-11-29 13:18:09 -08:00
Benoit Steiner	9f8fbd9434	Merged eigen/eigen into default	2016-11-26 11:28:25 -08:00
Mehdi Goli	7318daf887	Fixing LLVM error on TensorMorphingSycl.h on GPU; fixing int64_t crash for tensor_broadcast_sycl on GPU; adding get_sycl_supported_devices() on syclDevice.h.	2016-11-25 16:19:07 +00:00
Benoit Steiner	3be1afca11	Disabled the "remove the call to 'std::abs' since unsigned values cannot be negative" warning introduced in clang 3.5	2016-11-23 18:49:51 -08:00
Mehdi Goli	b8cc5635d5	Removing unsupported device from test case; cleaning the tensor device sycl.	2016-11-23 16:30:41 +00:00
Gael Guennebaud	e340866c81	Fix compilation with gcc and old ABI version	2016-11-23 14:04:57 +01:00
Gael Guennebaud	74637fa4e3	Optimize predux<Packet8f> (AVX)	2016-11-22 21:57:52 +01:00
Gael Guennebaud	178c084856	Disable usage of SSE3 _mm_hadd_ps that is extremely slow.	2016-11-22 21:53:14 +01:00
Gael Guennebaud	7dd894e40e	Optimize predux<Packet4d> (AVX)	2016-11-22 21:41:30 +01:00
Gael Guennebaud	f3fb0a1940	Disable usage of SSE3 haddpd that is extremely slow.	2016-11-22 16:58:31 +01:00
Benoit Steiner	ed839c5851	Enable the use of constant expressions with clang >= 3.6	2016-11-20 10:34:49 -08:00
Gael Guennebaud	465ede0f20	Fix compilation issue in mat = permutation (regression introduced in 8193ffb3d38b56c9295f204dc57dc6bac74f58aa )	2016-11-20 09:41:37 +01:00
Benoit Steiner	1bdf1b9ce0	Merged in benoitsteiner/opencl (pull request PR-253) OpenCL improvements	2016-11-19 04:44:43 +00:00
Benoit Steiner	8649e16c2a	Enable EIGEN_HAS_C99_MATH when building with the latest version of Visual Studio	2016-11-18 14:18:34 -08:00
Gael Guennebaud	164414c563	Merged in ChunW/eigen (pull request PR-252) Workaround for error in VS2012 with /clr	2016-11-18 21:07:29 +00:00
Luke Iwanski	5159675c33	Added isnan, isfinite and isinf for SYCL device. Plus test for that.	2016-11-18 16:01:48 +00:00
Gael Guennebaud	8193ffb3d3	bug #1343 : fix compilation regression in mat+=selfadjoint_view. Generic EigenBase2EigenBase assignment was incomplete.	2016-11-18 10:17:34 +01:00
Gael Guennebaud	cebff7e3a2	bug #1343 : fix compilation regression in array = matrix_product	2016-11-18 10:09:33 +01:00
Benoit Steiner	7c30078b9f	Merged eigen/eigen into default	2016-11-17 22:53:37 -08:00
Chun Wang	0d0948c3b9	Workaround for error in VS2012 with /clr	2016-11-17 17:54:27 -05:00
Konstantinos Margaritis	672aa97d4d	implement float/std::complex<float> for ZVector as well, minor fixes to ZVector	2016-11-17 13:27:33 -05:00
Luke Iwanski	c5130dedbe	Specialised basic math functions for SYCL device.	2016-11-17 11:47:13 +00:00
Gael Guennebaud	7b09e4dd8c	bump default branch to 3.3.90	2016-11-16 22:20:58 +01:00
Benoit Steiner	dff9a049c4	Optimized the computation of exp, sqrt, ceil and floor for fp16 on Pascal GPUs	2016-11-16 09:01:51 -08:00
Gael Guennebaud	0ee92aa38e	Optimize sparse<bool> && sparse<bool> to use the same path as for coeff-wise products.	2016-11-14 18:47:41 +01:00
Gael Guennebaud	eeac81b8c0	bump to 3.3.0	2016-11-10 13:55:14 +01:00
Gael Guennebaud	ba05572dcb	bump to 3.3-rc2	2016-11-04 09:09:06 +01:00
Benoit Steiner	ca0ba0d9a4	Improved AVX512 support	2016-11-03 04:00:49 -07:00
Benoit Steiner	c80587c92b	Merged eigen/eigen into default	2016-11-03 03:55:11 -07:00
Gael Guennebaud	3f1d0cdc22	bug #1337 : improve doc of homogeneous() and hnormalized()	2016-11-03 11:03:08 +01:00
Gael Guennebaud	78e93ac1ad	bug #1330 : Cholmod supports double precision only, so let's trigger a static assertion if the scalar type does not match this requirement.	2016-11-03 10:21:59 +01:00

4166 Commits