eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-08-05 03:30:37 +08:00

Author	SHA1	Message	Date
Rasmus Munk Larsen	ce62177b5b	Vectorize atanh & add a missing definition and unit test for atan.	2023-02-21 03:14:05 +00:00
Charles Schlosser	049a144798	Add typed logicals	2023-02-18 01:23:47 +00:00
Charles Schlosser	94b19dc5f2	Add CArg	2023-02-15 21:33:06 +00:00
Rasmus Munk Larsen	77b48c440e	Fix compiler warnings.	2023-02-10 20:46:23 +00:00
Rasmus Munk Larsen	e4f58816d9	Get rid of custom implementation of equal_to and not_equal_to. No longer needed with C++14.	2023-02-07 21:36:44 -08:00
Charles Schlosser	6d9f662a70	Tweak atan2	2023-01-26 17:38:21 +00:00
Sean McBride	d70b4864d9	issue #2581 : review and cleanup of compiler version checks	2023-01-17 18:58:34 +00:00
Alexander Richardson	37de432907	Avoid using std::raise() for divide by zero	2022-12-14 20:06:16 +00:00
Charles Schlosser	6d3e3678b4	optimize equalspace packetop	2022-12-13 01:22:25 +00:00
Charles Schlosser	2004831941	add EqualSpaced / setEqualSpaced	2022-12-13 00:54:57 +00:00
Rasmus Munk Larsen	3bb6a48d8c	Fix bug atan2	2022-10-12 23:49:32 +00:00
Rasmus Munk Larsen	14c847dc0e	Refactor special values test for pow, and add a similar test for atan2	2022-10-12 20:12:08 +00:00
Rasmus Munk Larsen	462758e8a3	Don't use generic sign function for sign(complex) unless it is vectorizable	2022-10-12 16:03:29 +00:00
Rasmus Munk Larsen	c0d6a72611	Use pnegate(pzero(x)) as a generic way to generate -0.0. Some compilers do not handle the literal -0.0 properly in fastmath mode.	2022-10-12 01:57:05 +00:00
Rasmus Munk Larsen	3167544873	Handle NaN inputs to atan2.	2022-10-10 19:36:36 -07:00
Rasmus Munk Larsen	72db3f0fa5	Remove references to M_PI_2 and M_PI_4.	2022-10-11 00:27:16 +00:00
Antonio Sánchez	80efbfdeda	Unconditionally enable CXX11 math.	2022-10-04 17:37:47 +00:00
Rasmus Munk Larsen	1e1848fdb1	Add a vectorized implementation of atan2 to Eigen.	2022-09-28 20:46:49 +00:00
Rasmus Munk Larsen	7b2901e2aa	Add vectorized integer division for int32 with AVX512, AVX or SSE.	2022-09-21 00:27:23 +00:00
Rasmus Munk Larsen	273e0c884e	Revert "Add constexpr, test for C++14 constexpr."	2022-09-16 21:14:29 +00:00
Rasmus Munk Larsen	afc014f1b5	Allow mixed types for pow(), as long as the exponent is exactly representable in the base type.	2022-09-12 21:55:30 +00:00
Rasmus Munk Larsen	e8a2aa24a2	Fix a couple of issues with unary pow():	2022-09-09 17:21:11 +00:00
Tobias Schlüter	133498c329	Add constexpr, test for C++14 constexpr.	2022-09-07 03:42:34 +00:00
Antonio Sánchez	30c42222a6	Fix some test build errors in new unary pow.	2022-08-30 17:24:14 +00:00
Charles Schlosser	e5af9f87f2	Vectorize pow for integer base / exponent types	2022-08-29 19:23:54 +00:00
chuckyschluz	8acbf5c11c	re-enable pow for complex types	2022-08-26 17:29:02 -04:00
Charles Schlosser	76a669fb45	add fixed power unary operation	2022-08-16 21:32:36 +00:00
Rasmus Munk Larsen	97e0784dc6	Vectorize the sign operator in Eigen.	2022-08-09 19:54:57 +00:00
Tobias Schlüter	f3ba220c5d	Remove EIGEN_EMPTY_STRUCT_CTOR	2022-04-08 18:27:26 +00:00
Rasmus Munk Larsen	ea2c02060c	Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512.	2022-01-21 23:49:18 +00:00
Rasmus Munk Larsen	a30ecb7221	Don't use the fast implementation if EIGEN_GPU_CC, since integer_packet is not defined for float4 used by the GPU compiler (even on host).	2022-01-12 20:16:16 +00:00
Rasmus Munk Larsen	0b58738938	Fix two corner cases in the new implementation of logistic sigmoid.	2022-01-12 00:41:29 +00:00
Rasmus Munk Larsen	80ccacc717	Fix accuracy of logistic sigmoid	2022-01-08 00:15:14 +00:00
Rasmus Munk Larsen	8b8125c574	Make sure the scalar and vectorized path for array.exp() return consistent values.	2022-01-07 23:31:35 +00:00
Erik Schultheis	ec2fd0f7ed	Require recent GCC and MSVC and removed `EIGEN_HAS_CXX14` and some other feature test macros	2021-12-01 00:48:34 +00:00
Erik Schultheis	f33a31b823	removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks	2021-11-29 19:18:57 +00:00
sciencewhiz	4b6036e276	fix various typos	2021-09-22 16:15:06 +00:00
Rasmus Munk Larsen	d7d0bf832d	Issue an error in case of direct inclusion of internal headers.	2021-09-10 19:12:26 +00:00
Antonio Sanchez	7880f10526	Enable equality comparisons on GPU. Since `std::equal_to::operator()` is not a device function, it fails on GPU. On my device, I seem to get a silent crash in the kernel (no reported error, but the kernel does not complete). Replacing this with a portable version enables comparisons on device. Addresses #2292 - would need to be cherry-picked. The 3.3 branch also requires adding `EIGEN_DEVICE_FUNC` in `BooleanRedux.h` to get fully working.	2021-08-03 01:53:31 +00:00
Antonio Sanchez	de2e62c62d	Disable vectorization of comparisons except for bool. Packet input/output types must currently be the same, and since these have a return type of `bool`, vectorization will only work if input is bool.	2021-07-25 13:39:50 -07:00
derekjchow	66ca41bd47	Add support for vectorizing logical comparisons.	2021-07-23 20:07:48 +00:00
Rasmus Munk Larsen	f64b2954c7	Fix c++20 warnings about using enums in arithmetic expressions.	2021-06-10 17:17:39 -07:00
Nathan Luehr	6753f0f197	Fix ambiguity due to argument dependent lookup.	2021-05-11 15:41:11 -05:00
Antonio Sanchez	2468253c9a	Define EIGEN_CPLUSPLUS and replace most __cplusplus checks. The macro `__cplusplus` is not defined correctly in MSVC unless building with the `/Zc:__cplusplus` flag. Instead, it defines `_MSVC_LANG` to the specified c++ standard version number. Here we introduce `EIGEN_CPLUSPLUS` which will contain the c++ version number both for MSVC and otherwise. This simplifies checks for supported features. Also replaced most instances of standard version checking via `__cplusplus` with the existing `EIGEN_COMP_CXXVER` macro for better clarity. Fixes: #2170	2021-03-05 18:33:18 +00:00
Rasmus Munk Larsen	113e61f364	Remove unused function scalar_cmp_with_cast.	2021-02-24 23:59:35 +00:00
Antonio Sanchez	abcde69a79	Disable vectorized pow for half/bfloat16. We are potentially seeing some accuracy issues with these. Ideally we would hand off to `float`, but that's not trivial with the current setup. We may want to consider adding `ppow<Packet>` and `HasPow`, so implementations can more easily specialize this.	2021-02-05 12:17:34 -08:00
Antonio Sanchez	f0e46ed5d4	Fix pow and other cwise ops for half/bfloat16. The new `generic_pow` implementation was failing for half/bfloat16 since their construction from int/float is not `constexpr`. Modified in `GenericPacketMathFunctions` to remove `constexpr`. While adding tests for half/bfloat16, found other issues related to implicit conversions. Also needed to implement `numext::arg` for non-integer, non-complex, non-float/double/long double types. These seem to be implicitly converted to `std::complex<T>`, which then fails for half/bfloat16.	2021-01-22 11:10:54 -08:00
Rasmus Munk Larsen	cdd8fdc32e	Vectorize `pow(x, y)`. This closes https://gitlab.com/libeigen/eigen/-/issues/2085 , which also contains a description of the algorithm. I ran some testing (comparing to `std::pow(double(x), double(y))`) for `x` in the set of all (positive) floats in the interval `[std::sqrt(std::numeric_limits<float>::min()), std::sqrt(std::numeric_limits<float>::max())]`, and `y` in `{2, sqrt(2), -sqrt(2)}`. I get the following error statistics:
```
max_rel_error = 8.34405e-07
rms_rel_error = 2.76654e-07
```
If I widen the range to all normal floats, I see lower accuracy for arguments where the result is subnormal, e.g. for `y = sqrt(2)`:
```
max_rel_error = 0.666667
rms = 6.8727e-05
count = 1335165689
argmax = 2.56049e-32, 2.10195e-45 != 1.4013e-45
```
which seems reasonable, since these results are subnormals with only a couple of significant bits left.	2021-01-18 13:25:16 +00:00
Antonio Sanchez	bde6741641	Improved std::complex sqrt and rsqrt. Replaces `std::sqrt` with `complex_sqrt` for all platforms (previously `complex_sqrt` was only used for CUDA and MSVC), and implements custom `complex_rsqrt`. Also introduces `numext::rsqrt` to simplify implementation, and modified `numext::hypot` to adhere to IEC 60559 for special cases. The `complex_sqrt` and `complex_rsqrt` implementations were found to be significantly faster than `std::sqrt<std::complex<T>>` and `1/numext::sqrt<std::complex<T>>`. Benchmark file attached.
```
GCC 10, Intel Xeon, x86_64:
---------------------------------------------------------------------------
Benchmark                              Time             CPU   Iterations
---------------------------------------------------------------------------
BM_Sqrt<std::complex<float>>        9.21 ns         9.21 ns     73225448
BM_StdSqrt<std::complex<float>>     17.1 ns         17.1 ns     40966545
BM_Sqrt<std::complex<double>>       8.53 ns         8.53 ns     81111062
BM_StdSqrt<std::complex<double>>    21.5 ns         21.5 ns     32757248
BM_Rsqrt<std::complex<float>>       10.3 ns         10.3 ns     68047474
BM_DivSqrt<std::complex<float>>     16.3 ns         16.3 ns     42770127
BM_Rsqrt<std::complex<double>>      11.3 ns         11.3 ns     61322028
BM_DivSqrt<std::complex<double>>    16.5 ns         16.5 ns     42200711

Clang 11, Intel Xeon, x86_64:
---------------------------------------------------------------------------
Benchmark                              Time             CPU   Iterations
---------------------------------------------------------------------------
BM_Sqrt<std::complex<float>>        7.46 ns         7.45 ns     90742042
BM_StdSqrt<std::complex<float>>     16.6 ns         16.6 ns     42369878
BM_Sqrt<std::complex<double>>       8.49 ns         8.49 ns     81629030
BM_StdSqrt<std::complex<double>>    21.8 ns         21.7 ns     31809588
BM_Rsqrt<std::complex<float>>       8.39 ns         8.39 ns     82933666
BM_DivSqrt<std::complex<float>>     14.4 ns         14.4 ns     48638676
BM_Rsqrt<std::complex<double>>      9.83 ns         9.82 ns     70068956
BM_DivSqrt<std::complex<double>>    15.7 ns         15.7 ns     44487798

Clang 9, Pixel 2, aarch64:
---------------------------------------------------------------------------
Benchmark                              Time             CPU   Iterations
---------------------------------------------------------------------------
BM_Sqrt<std::complex<float>>        24.2 ns         24.1 ns     28616031
BM_StdSqrt<std::complex<float>>      104 ns          103 ns      6826926
BM_Sqrt<std::complex<double>>       31.8 ns         31.8 ns     22157591
BM_StdSqrt<std::complex<double>>     128 ns          128 ns      5437375
BM_Rsqrt<std::complex<float>>       31.9 ns         31.8 ns     22384383
BM_DivSqrt<std::complex<float>>     99.2 ns         98.9 ns      7250438
BM_Rsqrt<std::complex<double>>      46.0 ns         45.8 ns     15338689
BM_DivSqrt<std::complex<double>>     119 ns          119 ns      5898944
```
	2021-01-17 08:50:57 -08:00
Antonio Sanchez	c6efc4e0ba	Replace M_LOG2E and M_LN2 with custom macros. For these to exist we would need to define `_USE_MATH_DEFINES` before `cmath` or `math.h` is first included. However, we don't control the include order for projects outside Eigen, so even defining the macro in `Eigen/Core` does not fix the issue for projects that end up including `<cmath>` before Eigen does (explicitly or transitively). To fix this, we define `EIGEN_LOG2E` and `EIGEN_LN2` ourselves.	2020-12-11 14:34:31 -08:00


206 Commits