221 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
e1ecfc162d Explicitly call ::rint and ::rintf for targets without C++11. Without this, the Windows build breaks when trying to compile numext::rint<double>. 2020-01-10 21:14:08 +00:00
Rasmus Munk Larsen
9254974115 Don't add EIGEN_DEVICE_FUNC to random(), since ::rand is not available in CUDA. 2020-01-09 21:23:09 +00:00
Rasmus Munk Larsen
a3ec89b5bd Add missing EIGEN_DEVICE_FUNC annotations in MathFunctions.h. 2020-01-09 21:06:34 +00:00
Ilya Tokar
19876ced76 Bug #1785: Introduce numext::rint.
This provides a new op that matches std::rint and the previous behavior of
pround. Also adds the corresponding unsupported/../Tensor op.
Performance is the same as, e.g., floor (tested on SSE/AVX).
2020-01-07 21:22:44 +00:00
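For readers unfamiliar with the distinction the commit relies on: std::rint honors the current rounding mode, unlike round(). A minimal standalone sketch using the plain std:: functions (not Eigen's code):

```
// rint rounds per the current rounding mode (round-to-nearest-even by
// default), while round() rounds halfway cases away from zero.
#include <cmath>
#include <cstdio>

int main() {
  std::printf("rint(2.5)  = %.1f\n", std::rint(2.5));   // 2.0 (ties to even)
  std::printf("round(2.5) = %.1f\n", std::round(2.5));  // 3.0 (ties away from zero)
  std::printf("rint(3.5)  = %.1f\n", std::rint(3.5));   // 4.0 (4 is even)
  return 0;
}
```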
Christoph Hertzberg
8e5da71466 Resolve double-promotion warnings when compiling with clang.
`sin` was calling `sin(double)` instead of `std::sin(float)`.
2019-12-13 22:46:40 +01:00
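A minimal repro of the pitfall behind these warnings (an assumed illustration, not the Eigen code; whether the unqualified call actually promotes varies by standard library):

```
#include <cmath>

// An unqualified sin on a float can bind to the C function ::sin(double),
// silently promoting the argument and computing in double precision.
float bad(float x)  { return sin(x); }       // may trigger -Wdouble-promotion
float good(float x) { return std::sin(x); }  // picks the std::sin(float) overload
```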
Srinivas Vasudevan
88062b7fed Fix the implementation of complex expm1. Add tests that fail with the previous implementation but pass with the current one. 2019-12-12 01:56:54 +00:00
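For context, an accurate complex expm1 has to avoid the cancellation in e^x·cos(y) − 1 near zero. A hedged sketch of one standard rewrite (illustrative only, not necessarily the committed code):

```
// For z = x + iy:  e^z - 1 = (e^x*cos(y) - 1) + i*e^x*sin(y).
// Rewriting the real part as expm1(x)*cos(y) - 2*sin(y/2)^2 avoids the
// catastrophic cancellation in e^x*cos(y) - 1 for small |z|.
#include <cmath>
#include <complex>

template <typename T>
std::complex<T> expm1_sketch(const std::complex<T>& z) {
  const T x = z.real(), y = z.imag();
  const T s = std::sin(y / T(2));
  const T re = std::expm1(x) * std::cos(y) - T(2) * s * s;
  const T im = std::exp(x) * std::sin(y);
  return std::complex<T>(re, im);
}
```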
Gael Guennebaud
87427d2eaa PR 719: fix real/imag namespace conflict 2019-10-08 09:15:17 +02:00
Rasmus Munk Larsen
13ef08e5ac Move implementation of vectorized error function erf() to SpecialFunctionsImpl.h. 2019-09-27 13:56:04 -07:00
Eugene Zhulenev
0c845e28c9 Fix erf in C++03 2019-09-25 11:31:45 -07:00
Deven Desai
5e186b1987 Fix for the HIP build+test errors.
The errors were introduced by commit d38e6fbc27.


After the above-mentioned commit, some of the tests started failing with the following errors:


```
Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:70:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsHalf.h:28:22: error: call to 'erf' is ambiguous
  return Eigen::half(Eigen::numext::erf(static_cast<float>(a)));
                     ^~~~~~~~~~~~~~~~~~
/home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1600:7: note: candidate function [with T = float]
float erf(const float &x) { return ::erff(x); }
      ^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = float]
    erf(const Scalar& x) {
    ^
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:75:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/GPU/GpuSpecialFunctions.h:87:23: error: call to 'erf' is ambiguous
  return make_double2(erf(a.x), erf(a.y));
                      ^~~
/home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1603:8: note: candidate function [with T = double]
double erf(const double &x) { return ::erf(x); }
       ^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = double]
    erf(const Scalar& x) {
    ^
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:75:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/GPU/GpuSpecialFunctions.h:87:33: error: call to 'erf' is ambiguous
  return make_double2(erf(a.x), erf(a.y));
                                ^~~
/home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1603:8: note: candidate function [with T = double]
double erf(const double &x) { return ::erf(x); }
       ^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = double]
    erf(const Scalar& x) {
    ^
3 errors generated.
```


This PR fixes the compile error by removing the "old" implementation of "erf", assuming that the "new" implementation is what we want going forward (from a GPU point of view, both implementations are the same).

This PR also fixes what looks like a cut-and-paste error in the aforementioned commit.
2019-09-25 15:39:13 +00:00
Rasmus Munk Larsen
6de5ed08d8 Add generic PacketMath implementation of the Error Function (erf). 2019-09-19 12:48:30 -07:00
Rasmus Munk Larsen
1187bb65ad Add more tests for corner cases of log1p and expm1. Add handling of infinite arguments to log1p such that log1p(inf) = inf. 2019-08-28 12:20:21 -07:00
Rasmus Munk Larsen
9aba527405 Revert changes to std_fallback::log1p that broke handling of arguments less than -1. Fix the packet op accordingly. 2019-08-27 15:35:29 -07:00
Rasmus Munk Larsen
a3298b22ec Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments.
Depending on instruction set, significant speedups are observed for the vectorized path:
log1p wall time is reduced 60-93% (2.5x - 15x speedup)
expm1 wall time is reduced 0-85% (1x - 7x speedup)

The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly.

Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM
2019-08-12 13:53:28 -07:00
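Kahan's rewrites referenced above take roughly the following shape, including the extra +infinity branch the message mentions (a sketch under those assumptions, not Eigen's exact code):

```
#include <cmath>

// log1p(x) = log(1+x): letting u = 1+x, the factor x/(u-1) compensates
// for the rounding error committed when forming u.
double log1p_kahan(double x) {
  if (std::isinf(x) && x > 0) return x;  // the extra branch for +infinity
  const double u = 1.0 + x;
  if (u == 1.0) return x;                // 1+x rounded to 1: log(1+x) ~= x
  return std::log(u) * (x / (u - 1.0));  // error-compensated form
}

// expm1(x) = exp(x) - 1, by the dual trick.
double expm1_kahan(double x) {
  if (std::isinf(x) && x > 0) return x;  // exp(+inf) would give inf/inf below
  const double u = std::exp(x);
  if (u == 1.0) return x;                // exp(x) rounded to 1
  const double y = u - 1.0;
  if (y == -1.0) return -1.0;            // exp(x) underflowed to 0
  return y * (x / std::log(u));          // error-compensated form
}
```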
Rasmus Munk Larsen
d55d392e7b Fix bugs in log1p and expm1 where repeated using statements would clobber each other.
Add specializations for complex types, since std::log1p and std::expm1 do not support complex arguments.
2019-08-08 16:27:32 -07:00
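The clobbering pattern was of roughly this shape (a reconstructed illustration with hypothetical names, not the exact Eigen code):

```
#include <cmath>

namespace std_fallback {
// Hypothetical generic fallback, standing in for Eigen's internal one.
template <typename T> T log1p(const T& x) { return std::log(T(1) + x); }
}

template <typename T>
T call_log1p(const T& x) {
  using std::log1p;           // pulls in the float/double overloads
  using std_fallback::log1p;  // a second using-declaration for the same
                              // name: the unqualified call below may now
                              // be ambiguous or bind to the wrong function
  return log1p(x);
}
```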
Mehdi Goli
16a56b2ddd [SYCL] This PR adds the minimum modifications to Eigen core required to run Eigen unsupported modules on devices supporting SYCL.
* Adding SYCL memory model
* Enabling/disabling SYCL backend in Core
* Supporting vectorization
2019-06-27 12:25:09 +01:00
Gael Guennebaud
774bb9d6f7 fix a doxygen issue 2018-10-08 09:30:15 +02:00
Mehdi Goli
7ec8b40ad9 Collapsed revision
* Separating SYCL math functions.
* Converting function overloads to function specialisations.
* Applying the suggested design.
2018-08-28 14:20:48 +01:00
Alexey Frunze
050bcf6126 bug #1584: Improve random (avoid undefined behavior). 2018-08-08 20:19:32 -07:00
Deven Desai
876f392c39 Updates corresponding to the latest round of PR feedback
The major changes are

1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h
3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
    The above three changes effectively enable the Eigen "Packet" layer for the HIP platform

4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
5. Updating the "EIGEN_DEVICE_FUNC" marking in some places

The change has been tested on the HIP and CUDA platforms.
2018-07-11 10:39:54 -04:00
Deven Desai
38807a2575 merging updates from upstream 2018-07-11 09:17:33 -04:00
Deven Desai
b6cc0961b1 updates based on PR feedback
There are two major changes (and a few minor ones which are not listed here...see PR discussion for details)

1. Eigen::half implementations for HIP and CUDA have been merged.
This means that
- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`

After this change, the `HIP/hcc` directory contains only one file, `math_constants.h`. That too will go away once it becomes part of the HIP install.

2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added, and the code has been updated to use them where appropriate.
- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
- `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 || EIGEN_HAS_HIP_FP16)`
2018-06-14 10:21:54 -04:00
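In preprocessor terms, the equivalences listed above amount to something like the following (a sketch of the stated relationships, not the verbatim Eigen definitions):

```
#if defined(EIGEN_CUDACC) || defined(EIGEN_HIPCC)
#define EIGEN_GPUCC
#endif

#if defined(EIGEN_CUDA_ARCH) || defined(EIGEN_HIP_DEVICE_COMPILE)
#define EIGEN_GPU_COMPILE_PHASE
#endif

#if defined(EIGEN_HAS_CUDA_FP16) || defined(EIGEN_HAS_HIP_FP16)
#define EIGEN_HAS_GPU_FP16
#endif
```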
Deven Desai
d1d22ef0f4 syncing this fork with upstream 2018-06-13 12:09:52 -04:00
Andrea Bocci
f7124b3e46 Extend CUDA support to matrix inversion and selfadjointeigensolver 2018-06-11 18:33:24 +02:00
Deven Desai
8fbd47052b Adding support for using Eigen in HIP kernels.
This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.

Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during an Eigen compile, irrespective of whether or not the underlying compiler is CUDACC/NVCC (e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor).


Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
2018-06-06 10:12:58 -04:00
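A minimal usage sketch per the description above (a hypothetical kernel, assuming a working hipcc toolchain):

```
// EIGEN_USE_HIP must be defined before any Eigen header is included.
#define EIGEN_USE_HIP
#include <hip/hip_runtime.h>
#include <Eigen/Core>

// Hypothetical device kernel using an Eigen fixed-size type.
__global__ void sum_pairs(const float* in, float* out, int n) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    Eigen::Vector2f v(in[2 * i], in[2 * i + 1]);
    out[i] = v.sum();
  }
}
```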
Christoph Hertzberg
e5f9f4768f Avoid unnecessary C++11 dependency 2018-06-07 15:03:50 +02:00
nicolov
39c2cba810 Add a specialization of Eigen::numext::conj for std::complex<T> to be used when compiling a CUDA kernel. This fixes the compilation of TensorFlow 1.4 with clang 6.0 used as the CUDA compiler with libc++.
This follows the previous change in 2a69290ddb, which mentions OSX (presumably because it uses libc++ too).
2018-04-13 22:29:10 +00:00
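The shape of such a specialization, for reference (an illustrative stand-in, not the committed code):

```
#include <complex>

// Stand-in for Eigen's EIGEN_DEVICE_FUNC when sketching outside Eigen.
#if defined(__CUDACC__)
#define DEVICE_FUNC __host__ __device__
#else
#define DEVICE_FUNC
#endif

namespace numext_sketch {
// std::conj is not device-callable under libc++, so build the conjugate
// directly from the real()/imag() accessors.
template <typename T>
DEVICE_FUNC std::complex<T> conj(const std::complex<T>& x) {
  return std::complex<T>(x.real(), -x.imag());
}
}  // namespace numext_sketch
```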
Gael Guennebaud
e43ca0320d bug #1520: work around some -Wfloat-equal warnings by calling std::equal_to 2018-04-11 15:24:13 +02:00
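The workaround pattern, for reference (a minimal sketch):

```
#include <functional>

// An intentional exact floating-point comparison, expressed via
// std::equal_to: the actual == happens inside a system header, which
// -Wfloat-equal does not flag.
template <typename Scalar>
bool exactly_equal(const Scalar& a, const Scalar& b) {
  return std::equal_to<Scalar>()(a, b);  // same semantics as a == b
}
```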
Gael Guennebaud
e116f6847e bug #1521: avoid signalling NaN in hypot and make it std::complex<> friendly. 2018-04-04 13:47:23 +02:00
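For context, the usual overflow- and NaN-safe hypot shape factors out the larger magnitude before squaring (a sketch, not the committed code):

```
#include <algorithm>
#include <cmath>

double hypot_sketch(double x, double y) {
  const double ax = std::abs(x), ay = std::abs(y);
  const double big = std::max(ax, ay);
  if (big == 0.0) return 0.0;            // avoids 0/0 below for hypot(0, 0)
  const double r = std::min(ax, ay) / big;
  return big * std::sqrt(1.0 + r * r);   // r <= 1, so r*r cannot overflow
}
```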
luz.paz
e3912f5e63 Misc. source and comment typos
Found using `codespell` and `grep` from downstream FreeCAD
2018-03-11 10:01:44 -04:00
Yan Facai (颜发才)
42a8334668 ENH: exp supports complex types for CUDA 2018-01-04 16:01:01 +08:00
Gael Guennebaud
cda47c42c2 Fix compilation in C++98 mode. 2017-07-17 21:08:20 +02:00
Gael Guennebaud
bbd97b4095 Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases 2017-07-17 01:02:51 +02:00
Benoit Steiner
c5a241ab9b Merged in benoitsteiner/opencl (pull request PR-323)
Improved support for OpenCL
2017-07-07 16:27:33 +00:00
Benoit Steiner
c92faf9d84 Merged in mehdi_goli/upstr_benoit/HiperbolicOP (pull request PR-13)
Adding hyperbolic operations for SYCL.

* Adding hyperbolic operations.

* Adding the hyperbolic operations for CPU as well.
2017-07-06 05:05:57 +00:00
Gael Guennebaud
561f777075 Fix a gcc7 warning about bool * bool in the abs2 default implementation. 2017-06-27 12:05:17 +02:00
Gael Guennebaud
498aa95a8b bug #1424: add numext::abs specialization for unsigned integer types. 2017-06-09 11:53:49 +02:00
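The idea behind the bug #1424 fix, sketched with a hypothetical free function (Eigen implements this inside numext::abs itself):

```
// Generic case: works for signed arithmetic types.
template <typename T>
T abs_sketch(const T& x) { return x < T(0) ? -x : x; }

// Unsigned specializations: the value is already non-negative, and
// negating an unsigned type would silently wrap around.
template <> unsigned int abs_sketch(const unsigned int& x) { return x; }
template <> unsigned long abs_sketch(const unsigned long& x) { return x; }
```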
Ilya Biryukov
1c03d43a5c Fixed compilation with cuda-clang 2017-03-06 12:01:12 +01:00
Srinivas Vasudevan
e6c8b5500c Change comparisons to use Scalar instead of RealScalar. 2016-12-05 14:01:45 -08:00
Srinivas Vasudevan
218764ee1f Added support for expm1 in Eigen. 2016-12-02 14:13:01 -08:00
Mehdi Goli
79aa2b784e Adding SYCL backend for TensorPadding.h; disabling __uint128 for SYCL in TensorIntDiv.h; disabling cache size for SYCL in TensorDeviceDefault.h; adding SYCL backend for StrideSliceOP; removing SYCL compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the SYCL backend code. 2016-12-01 13:02:27 +00:00
Luke Iwanski
5159675c33 Added isnan, isfinite and isinf for SYCL devices, plus a test for them. 2016-11-18 16:01:48 +00:00
Luke Iwanski
c5130dedbe Specialised basic math functions for SYCL device. 2016-11-17 11:47:13 +00:00
Benoit Steiner
2a69290ddb Added a specialization of Eigen::numext::real and Eigen::numext::imag for std::complex<T> to be used when compiling a cuda kernel. This is unfortunately necessary to be able to process complex numbers from a CUDA kernel on MacOS. 2016-09-22 15:52:23 -07:00
Benoit Steiner
50e3bbfc90 Calls x.imag() instead of imag(x) when x is a complex number, since the former
is constexpr while the latter isn't. This fixes compilation errors triggered by nvcc on Mac.
2016-09-22 13:17:25 -07:00
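The distinction at play, for reference (a sketch; the constexpr-ness cited is that of the standard library's complex<float> specialization):

```
#include <complex>

constexpr std::complex<float> z(1.0f, 2.0f);
constexpr float a = z.imag();        // member accessor: constexpr since C++11
// constexpr float b = std::imag(z); // free function: constexpr only since C++14
```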
Benoit Steiner
c0d56a543e Added several missing EIGEN_DEVICE_FUNC qualifiers 2016-09-14 14:06:21 -07:00
Benoit Steiner
5f50f12d2c Added the ability to compute the absolute value of a complex number on GPU, as well as a test to catch the problem. 2016-09-12 13:46:13 -07:00
Gael Guennebaud
68d1897e8a Make sure that our log1p implementation is called as a last resort only. 2016-08-26 15:30:55 +02:00
Gael Guennebaud
fe60856fed Add overload of numext::log1p for float/double in CUDA 2016-08-26 15:28:59 +02:00
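Taken together, the two log1p commits above amount to a pattern like the following (illustrative namespace and names, not the verbatim Eigen code):

```
#include <cmath>

#if defined(__CUDACC__)
#define DEVICE_FUNC __host__ __device__
#else
#define DEVICE_FUNC
#endif

namespace numext_sketch {
// Generic last-resort implementation.
template <typename T>
DEVICE_FUNC T log1p(const T& x) { return std::log1p(x); }

#if defined(__CUDACC__)
// On CUDA, prefer the device math library over the generic fallback.
template <> DEVICE_FUNC float log1p(const float& x) { return ::log1pf(x); }
template <> DEVICE_FUNC double log1p(const double& x) { return ::log1p(x); }
#endif
}  // namespace numext_sketch
```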
Gael Guennebaud
a4c266f827 Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only. 2016-08-23 14:23:08 +02:00