eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-08-02 02:00:37 +08:00

Author	SHA1	Message	Date
Erik Schultheis	421cbf0866	Replace Eigen type metaprogramming with corresponding std types and make use of alias templates	2022-03-16 16:43:40 +00:00
Erik Schultheis	c20e908ebc	turn some macros intro constexpr functions	2021-12-10 19:27:01 +00:00
Erik Schultheis	ec2fd0f7ed	Require recent GCC and MSCV and removed `EIGEN_HAS_CXX14` and some other feature test macros	2021-12-01 00:48:34 +00:00
Rasmus Munk Larsen	6cadab6896	Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert.	2021-09-16 20:43:54 +00:00
Rasmus Munk Larsen	d7d0bf832d	Issue an error in case of direct inclusion of internal headers.	2021-09-10 19:12:26 +00:00
Antonio Sanchez	2b410ecbef	Workaround VS 2017 arg bug. In VS 2017, `std::arg` for real inputs always returns 0, even for negative inputs. It should return `PI` for negative real values. This seems to be fixed in VS 2019 (MSVC 1920).	2021-08-18 18:39:18 +00:00
Nathan Luehr	7e6a1c129c	Device implementation of log for std::complex types.	2021-05-11 22:02:21 +00:00
Rohit Santhanam	39ec31c0ad	Fix for issue where numext::imag and numext::real are used before they are defined.	2021-05-10 19:48:32 +00:00
Antonio Sanchez	c0eb5f89a4	Restore ABI compatibility for conj with 3.3, fix conflict with boost. The boost library unfortunately specializes `conj` for various types and assumes the original two-template-parameter version. This changes restores the second parameter. This also restores ABI compatibility. The specialization for `std::complex` is because `std::conj` is not a device function. For custom complex scalar types, users should provide their own `conj` implementation. We may consider removing the unnecessary second parameter in the future - but this will require modifying boost as well. Fixes #2112.	2021-05-07 18:14:00 +00:00
Antonio Sanchez	90e9a33e1c	Fix numext::arg return type. The cxx11 path for `numext::arg` incorrectly returned the complex type instead of the real type, leading to compile errors. Fixed this and added tests. Related to !477, which uncovered the issue.	2021-05-07 16:26:57 +00:00
Antonio Sanchez	87729ea39f	Eliminate `round_impl` double-promotion warnings for c++03.	2021-03-25 16:52:19 +00:00
Steve Bronder	e7b8643d70	Revert "Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()"" This reverts commit 5f0b4a4010af4cbf6161a0d1a03a747addc44a5d.	2021-03-24 18:14:56 +00:00
Antonio Sanchez	14b7ebea11	Fix numext::round pre c++11 for large inputs. This is to resolve an issue for large inputs when +0.5 can actually lead to +1 if the input doesn't have enough precision to resolve the addition - leading to an off-by-one error. See discussion on 9a663973.	2021-03-15 19:08:04 +00:00
David Tellenbach	5f0b4a4010	Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()" This reverts commit 6cbb3038ac48cb5fe17eba4dfbf26e3e798041f1 because it breaks clang-10 builds on x86 and aarch64 when C++11 is enabled.	2021-03-05 13:16:43 +01:00
Steve Bronder	6cbb3038ac	Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()	2021-03-04 18:58:08 +00:00
Antonio Sanchez	f0e46ed5d4	Fix pow and other cwise ops for half/bfloat16. The new `generic_pow` implementation was failing for half/bfloat16 since their construction from int/float is not `constexpr`. Modified in `GenericPacketMathFunctions` to remove `constexpr`. While adding tests for half/bfloat16, found other issues related to implicit conversions. Also needed to implement `numext::arg` for non-integer, non-complex, non-float/double/long double types. These seem to be implicitly converted to `std::complex<T>`, which then fails for half/bfloat16.	2021-01-22 11:10:54 -08:00
Antonio Sanchez	bde6741641	Improved std::complex sqrt and rsqrt. Replaces `std::sqrt` with `complex_sqrt` for all platforms (previously `complex_sqrt` was only used for CUDA and MSVC), and implements custom `complex_rsqrt`. Also introduces `numext::rsqrt` to simplify implementation, and modified `numext::hypot` to adhere to IEEE IEC 6059 for special cases. The `complex_sqrt` and `complex_rsqrt` implementations were found to be significantly faster than `std::sqrt<std::complex<T>>` and `1/numext::sqrt<std::complex<T>>`. Benchmark file attached. ``` GCC 10, Intel Xeon, x86_64: --------------------------------------------------------------------------- Benchmark Time CPU Iterations --------------------------------------------------------------------------- BM_Sqrt<std::complex<float>> 9.21 ns 9.21 ns 73225448 BM_StdSqrt<std::complex<float>> 17.1 ns 17.1 ns 40966545 BM_Sqrt<std::complex<double>> 8.53 ns 8.53 ns 81111062 BM_StdSqrt<std::complex<double>> 21.5 ns 21.5 ns 32757248 BM_Rsqrt<std::complex<float>> 10.3 ns 10.3 ns 68047474 BM_DivSqrt<std::complex<float>> 16.3 ns 16.3 ns 42770127 BM_Rsqrt<std::complex<double>> 11.3 ns 11.3 ns 61322028 BM_DivSqrt<std::complex<double>> 16.5 ns 16.5 ns 42200711 Clang 11, Intel Xeon, x86_64: --------------------------------------------------------------------------- Benchmark Time CPU Iterations --------------------------------------------------------------------------- BM_Sqrt<std::complex<float>> 7.46 ns 7.45 ns 90742042 BM_StdSqrt<std::complex<float>> 16.6 ns 16.6 ns 42369878 BM_Sqrt<std::complex<double>> 8.49 ns 8.49 ns 81629030 BM_StdSqrt<std::complex<double>> 21.8 ns 21.7 ns 31809588 BM_Rsqrt<std::complex<float>> 8.39 ns 8.39 ns 82933666 BM_DivSqrt<std::complex<float>> 14.4 ns 14.4 ns 48638676 BM_Rsqrt<std::complex<double>> 9.83 ns 9.82 ns 70068956 BM_DivSqrt<std::complex<double>> 15.7 ns 15.7 ns 44487798 Clang 9, Pixel 2, aarch64: --------------------------------------------------------------------------- Benchmark Time CPU Iterations --------------------------------------------------------------------------- BM_Sqrt<std::complex<float>> 24.2 ns 24.1 ns 28616031 BM_StdSqrt<std::complex<float>> 104 ns 103 ns 6826926 BM_Sqrt<std::complex<double>> 31.8 ns 31.8 ns 22157591 BM_StdSqrt<std::complex<double>> 128 ns 128 ns 5437375 BM_Rsqrt<std::complex<float>> 31.9 ns 31.8 ns 22384383 BM_DivSqrt<std::complex<float>> 99.2 ns 98.9 ns 7250438 BM_Rsqrt<std::complex<double>> 46.0 ns 45.8 ns 15338689 BM_DivSqrt<std::complex<double>> 119 ns 119 ns 5898944 ```	2021-01-17 08:50:57 -08:00
Deven Desai	2a6addb4f9	Fix for breakage in ROCm support - 210108 The following commit breaks ROCm support for Eigen `f149e0ebc3` All unit tests fail with the following error ``` Building HIPCC object test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o In file included from /home/rocm-user/eigen/test/gpu_basic.cu:19: In file included from /home/rocm-user/eigen/test/main.h:356: In file included from /home/rocm-user/eigen/Eigen/QR:11: In file included from /home/rocm-user/eigen/Eigen/Core:166: /home/rocm-user/eigen/Eigen/src/Core/MathFunctionsImpl.h:105:35: error: __host__ __device__ function 'complex_sqrt' cannot overload __host__ function 'complex_sqrt' EIGEN_DEVICE_FUNC std::complex<T> complex_sqrt(const std::complex<T>& z) { ^ /home/rocm-user/eigen/Eigen/src/Core/MathFunctions.h:342:38: note: previous declaration is here template<typename T> std::complex<T> complex_sqrt(const std::complex<T>& a_x); ^ 1 error generated when compiling for gfx900. CMake Error at gpu_basic_generated_gpu_basic.cu.o.cmake:192 (message): Error generating file /home/rocm-user/eigen/build/test/CMakeFiles/gpu_basic.dir//./gpu_basic_generated_gpu_basic.cu.o test/CMakeFiles/gpu_basic.dir/build.make:63: recipe for target 'test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o' failed make[3]: * [test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o] Error 1 CMakeFiles/Makefile2:16618: recipe for target 'test/CMakeFiles/gpu_basic.dir/all' failed make[2]: * [test/CMakeFiles/gpu_basic.dir/all] Error 2 CMakeFiles/Makefile2:16625: recipe for target 'test/CMakeFiles/gpu_basic.dir/rule' failed make[1]: * [test/CMakeFiles/gpu_basic.dir/rule] Error 2 Makefile:5401: recipe for target 'gpu_basic' failed make: * [gpu_basic] Error 2 ``` The error message is accurate, and the fix (provided in thsi commit) is trivial.	2021-01-08 18:04:40 +00:00
Antonio Sanchez	f149e0ebc3	Fix MSVC complex sqrt and packetmath test. MSVC incorrectly handles `inf` cases for `std::sqrt<std::complex<T>>`. Here we replace it with a custom version (currently used on GPU). Also fixed the `packetmath` test, which previously skipped several corner cases since `CHECK_CWISE1` only tests the first `PacketSize` elements.	2021-01-08 01:17:19 +00:00
Antonio Sanchez	070d303d56	Add CUDA complex sqrt. This is to support scalar `sqrt` of complex numbers `std::complex<T>` on device, requested by Tensorflow folks. Technically `std::complex` is not supported by NVCC on device (though it is by clang), so the default `sqrt(std::complex<T>)` function only works on the host. Here we create an overload to add back the functionality. Also modified the CMake file to add `--relaxed-constexpr` (or equivalent) flag for NVCC to allow calling constexpr functions from device functions, and added support for specifying compute architecture for NVCC (was already available for clang).	2020-12-22 23:25:23 -08:00
Antonio Sanchez	c6efc4e0ba	Replace M_LOG2E and M_LN2 with custom macros. For these to exist we would need to define `_USE_MATH_DEFINES` before `cmath` or `math.h` is first included. However, we don't control the include order for projects outside Eigen, so even defining the macro in `Eigen/Core` does not fix the issue for projects that end up including `<cmath>` before Eigen does (explicitly or transitively). To fix this, we define `EIGEN_LOG2E` and `EIGEN_LN2` ourselves.	2020-12-11 14:34:31 -08:00
Antonio Sanchez	22f67b5958	Fix boolean float conversion and product warnings. This fixes some gcc warnings such as: ``` Eigen/src/Core/GenericPacketMath.h:655:63: warning: implicit conversion turns floating-point number into bool: 'typename __gnu_cxx::__enable_if<__is_integer<bool>::__value, double>::__type' (aka 'double') to 'bool' [-Wimplicit-conversion-floating-point-to-bool] Packet psqrt(const Packet& a) { EIGEN_USING_STD(sqrt); return sqrt(a); } ``` Details: - Added `scalar_sqrt_op<bool>` (`-Wimplicit-conversion-floating-point-to-bool`). - Added `scalar_square_op<bool>` and `scalar_cube_op<bool>` specializations (`-Wint-in-bool-context`) - Deprecated above specialized ops for bool. - Modified `cxx11_tensor_block_eval` to specialize generator for booleans (`-Wint-in-bool-context`) and to use `abs` instead of `square` to avoid deprecated bool ops.	2020-11-24 20:20:36 +00:00
Antonio Sanchez	bb69a8db5d	Explicit casts of S -> std::complex<T> When calling `internal::cast<S, std::complex<T>>(x)`, clang often generates an implicit conversion warning due to an implicit cast from type `S` to `T`. This currently affects the following tests: - `basicstuff` - `bfloat16_float` - `cxx11_tensor_casts` The implicit cast leads to widening/narrowing float conversions. Widening warnings only seem to be generated by clang (`-Wdouble-promotion`). To eliminate the warning, we explicitly cast the real-component first from `S` to `T`. We also adjust tests to use `internal::cast` instead of `static_cast` when a complex type may be involved.	2020-11-14 05:50:42 +00:00
David Tellenbach	e265f7ed8e	Add support for Armv8.2-a __fp16 Armv8.2-a provides a native half-precision floating point (__fp16 aka. float16_t). This patch introduces * __fp16 as underlying type of Eigen::half if this type is available * the packet types Packet4hf and Packet8hf representing float16x4_t and float16x8_t respectively * packet-math for the above packets with corresponding scalar type Eigen::half The packet-math functionality has been implemented by Ashutosh Sharma <ashutosh.sharma@amperecomputing.com>. This closes #1940.	2020-10-28 20:15:09 +00:00
David Tellenbach	4091f6b25c	Drop EIGEN_USING_STD_MATH in favour of EIGEN_USING_STD	2020-10-09 02:05:05 +02:00
Rasmus Munk Larsen	ee714f79f7	Fix CUDA build breakage and incorrect result for absdiff on HIP with long double arguments.	2020-10-02 21:05:35 +00:00
Pedro Caldeira	65e400896b	Fix bfloat16 round on gcc 4.8	2020-09-14 10:43:59 -03:00
Alexander Neumann	5272106826	remove semi triggering -Wextra-semi-stmt	2020-09-07 11:42:30 +02:00
Joel Holdsworth	d5c665742b	Add absolute_difference coefficient-wise binary Array function	2020-03-19 17:45:20 +00:00
Rasmus Munk Larsen	e1ecfc162d	call Explicitly ::rint and ::rintf for targets without c++11. Without this, the Windows build breaks when trying to compile numext::rint<double>.	2020-01-10 21:14:08 +00:00
Rasmus Munk Larsen	9254974115	Don't add EIGEN_DEVICE_FUNC to random() since ::rand is not available in Cuda.	2020-01-09 21:23:09 +00:00
Rasmus Munk Larsen	a3ec89b5bd	Add missing EIGEN_DEVICE_FUNC annotations in MathFunctions.h.	2020-01-09 21:06:34 +00:00
Ilya Tokar	19876ced76	Bug #1785 : Introduce numext::rint. This provides a new op that matches std::rint and previous behavior of pround. Also adds corresponding unsupported/../Tensor op. Performance is the same as e. g. floor (tested SSE/AVX).	2020-01-07 21:22:44 +00:00
Christoph Hertzberg	8e5da71466	Resolve double-promotion warnings when compiling with clang. `sin` was calling `sin(double)` instead of `std::sin(float)`	2019-12-13 22:46:40 +01:00
Srinivas Vasudevan	88062b7fed	Fix implementation of complex expm1. Add tests that fail with previous implementation, but pass with the current one.	2019-12-12 01:56:54 +00:00
Gael Guennebaud	87427d2eaa	PR 719: fix real/imag namespace conflict	2019-10-08 09:15:17 +02:00
Rasmus Munk Larsen	13ef08e5ac	Move implementation of vectorized error function erf() to SpecialFunctionsImpl.h.	2019-09-27 13:56:04 -07:00
Eugene Zhulenev	0c845e28c9	Fix erf in c++03	2019-09-25 11:31:45 -07:00
Deven Desai	5e186b1987	Fix for the HIP build+test errors. The errors were introduced by this commit : `d38e6fbc27` After the above mentioned commit, some of the tests started failing with the following error ``` Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:70: /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsHalf.h:28:22: error: call to 'erf' is ambiguous return Eigen::half(Eigen::numext::erf(static_cast<float>(a))); ^~~~~~~~~~~~~~~~~~ /home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1600:7: note: candidate function [with T = float] float erf(const float &x) { return ::erff(x); } ^ /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = float] erf(const Scalar& x) { ^ In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:75: /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/GPU/GpuSpecialFunctions.h:87:23: error: call to 'erf' is ambiguous return make_double2(erf(a.x), erf(a.y)); ^~~ /home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1603:8: note: candidate function [with T = double] double erf(const double &x) { return ::erf(x); } ^ /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = double] erf(const Scalar& x) { ^ In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:75: /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/GPU/GpuSpecialFunctions.h:87:33: error: call to 'erf' is ambiguous return make_double2(erf(a.x), erf(a.y)); ^~~ /home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1603:8: note: candidate function [with T = double] double erf(const double &x) { return ::erf(x); } ^ /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = double] erf(const Scalar& x) { ^ 3 errors generated. ``` This PR fixes the compile error by removing the "old" implementation for "erf" (assuming that the "new" implementation is what we want going forward. from a GPU point-of-view both implementations are the same). This PR also fixes what seems like a cut-n-paste error in the aforementioned commit	2019-09-25 15:39:13 +00:00
Rasmus Munk Larsen	6de5ed08d8	Add generic PacketMath implementation of the Error Function (erf).	2019-09-19 12:48:30 -07:00
Rasmus Munk Larsen	1187bb65ad	Add more tests for corner cases of log1p and expm1. Add handling of infinite arguments to log1p such that log1p(inf) = inf.	2019-08-28 12:20:21 -07:00
Rasmus Munk Larsen	9aba527405	Revert changes to std_falback::log1p that broke handling of arguments less than -1. Fix packet op accordingly.	2019-08-27 15:35:29 -07:00
Rasmus Munk Larsen	a3298b22ec	Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments. Depending on instruction set, significant speedups are observed for the vectorized path: log1p wall time is reduced 60-93% (2.5x - 15x speedup) expm1 wall time is reduced 0-85% (1x - 7x speedup) The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly. Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM	2019-08-12 13:53:28 -07:00
Rasmus Munk Larsen	d55d392e7b	Fix bugs in log1p and expm1 where repeated using statements would clobber each other. Add specializations for complex types since std::log1p and std::exp1m do not support complex.	2019-08-08 16:27:32 -07:00
Mehdi Goli	16a56b2ddd	[SYCL] This PR adds the minimum modifications to Eigen core required to run Eigen unsupported modules on devices supporting SYCL. * Adding SYCL memory model * Enabling/Disabling SYCL backend in Core * Supporting Vectorization	2019-06-27 12:25:09 +01:00
Gael Guennebaud	774bb9d6f7	fix a doxygen issue	2018-10-08 09:30:15 +02:00
Mehdi Goli	7ec8b40ad9	Collapsed revision * Separating SYCL math function. * Converting function overload to function specialisation. * Applying the suggested design.	2018-08-28 14:20:48 +01:00
Alexey Frunze	050bcf6126	bug #1584 : Improve random (avoid undefined behavior).	2018-08-08 20:19:32 -07:00
Deven Desai	876f392c39	Updates corresponding to the latest round of PR feedback The major changes are 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h 2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h The above three changes effectively enable the Eigen "Packet" layer for the HIP platform 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic") 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places The change has been tested on the HIP and CUDA platforms.	2018-07-11 10:39:54 -04:00
Deven Desai	38807a2575	merging updates from upstream	2018-07-11 09:17:33 -04:00

1 2 3 4 5

250 Commits