Charles Schlosser
122befe54c
Fix "unary minus operator applied to unsigned type, result still unsigned" on MSVC and other stupid warnings
2024-04-12 19:35:04 +00:00
Charles Schlosser
86aee3d9c5
Fix long double random
2024-04-02 12:05:40 +00:00
Charles Schlosser
e63d9f6ccb
Fix random again
2024-03-29 21:49:27 +00:00
Antonio Sánchez
38fcedaf8e
Fix pexp complex test edge-cases.
2024-03-04 17:44:38 +00:00
Antonio Sánchez
a962a27594
Fix MSVC GPU build.
2024-02-27 23:26:06 +00:00
Antonio Sánchez
7a88cdd6ad
Fix signed integer UB in random.
2024-02-24 13:16:23 +00:00
Antonio Sánchez
6b365e74d6
Fix GPU build for ptanh_float.
2024-02-20 16:08:50 +00:00
Rasmus Munk Larsen
4d419e2209
Rename generic_fast_tanh_float to ptanh_float and move it to...
2024-02-16 21:27:22 +00:00
Antonio Sánchez
2a9055b50e
Fix random for custom scalars that don't have constexpr digits().
2024-02-16 02:30:54 +00:00
Charles Schlosser
431e4a913b
Fix the fuzz
2024-02-07 04:52:19 +00:00
Charles Schlosser
d626762e3f
improve random
2024-01-31 08:16:29 +00:00
Charles Schlosser
2c4541f735
fix msvc clz
2023-12-13 03:33:49 +00:00
Antonio Sánchez
75e273afcc
Add internal ctz/clz implementation.
2023-12-11 21:03:09 +00:00
Tobias Wood
f38e16c193
Apply clang-format
2023-11-29 11:12:48 +00:00
Kyle Macfarlan
5de0f2f89e
Fixes #2735 : Component-wise cbrt
2023-10-25 03:06:13 +00:00
Antonio Sánchez
0c9526912c
Pass div_ceil arguments by value.
2023-10-17 18:46:19 +00:00
Rasmus Munk Larsen
a96545777b
Consolidate multiple implementations of divup/div_up/div_ceil.
2023-10-10 17:16:59 +00:00
Antonio Sánchez
6e4d5d4832
Add IWYU private pragmas to internal headers.
2023-08-21 16:25:22 +00:00
Antonio Sánchez
63dcb429cd
Fix use of arg function in CUDA.
2023-07-07 18:37:14 +00:00
Charles Schlosser
44c20bbbe3
rint round floor ceil
2023-06-23 16:29:16 +00:00
Charles Schlosser
59b3ef5409
Partially Vectorize Cast
2023-06-09 16:54:31 +00:00
Antonio Sánchez
2d0c6ad873
Revert "Vectorize cast"
This reverts commit eb5ff1861a4783876564a1a79573c3b9ff566863
2023-04-26 18:03:36 +00:00
Charles Schlosser
eb5ff1861a
Vectorize cast
2023-04-26 02:50:13 +00:00
Rasmus Munk Larsen
f02856c640
Use EIGEN_NOT_A_MACRO macro (oh the irony!) to avoid build issue in TensorFlow.
2023-03-15 11:42:57 -07:00
Rasmus Munk Larsen
690ae9502f
Use C++11 standard features for detecting presence of Inf and NaN
2023-03-15 16:52:44 +00:00
Antonio Sánchez
bc5cdc7a67
Guard use of long double on GPU device.
2023-02-24 21:49:59 +00:00
Charles Schlosser
6d9f662a70
Tweak atan2
2023-01-26 17:38:21 +00:00
Charles Schlosser
82b152dbe7
Add signbit function
2022-11-04 00:31:20 +00:00
Antonio Sánchez
80efbfdeda
Unconditionally enable CXX11 math.
2022-10-04 17:37:47 +00:00
Rasmus Munk Larsen
273e0c884e
Revert "Add constexpr, test for C++14 constexpr."
2022-09-16 21:14:29 +00:00
Tobias Schlüter
133498c329
Add constexpr, test for C++14 constexpr.
2022-09-07 03:42:34 +00:00
Rasmus Munk Larsen
7064ed1345
Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>.
2022-08-26 17:02:37 +00:00
Rasmus Munk Larsen
97e0784dc6
Vectorize the sign operator in Eigen.
2022-08-09 19:54:57 +00:00
Erik Schultheis
421cbf0866
Replace Eigen type metaprogramming with corresponding std types and make use of alias templates
2022-03-16 16:43:40 +00:00
Erik Schultheis
c20e908ebc
turn some macros into constexpr functions
2021-12-10 19:27:01 +00:00
Erik Schultheis
ec2fd0f7ed
Require recent GCC and MSVC and remove EIGEN_HAS_CXX14 and some other feature test macros
2021-12-01 00:48:34 +00:00
Rasmus Munk Larsen
6cadab6896
Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert.
2021-09-16 20:43:54 +00:00
Rasmus Munk Larsen
d7d0bf832d
Issue an error in case of direct inclusion of internal headers.
2021-09-10 19:12:26 +00:00
Antonio Sanchez
2b410ecbef
Workaround VS 2017 arg bug.
In VS 2017, `std::arg` for real inputs always returns 0, even for
negative inputs. It should return `PI` for negative real values.
This seems to be fixed in VS 2019 (MSVC 1920).
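The expected behaviour described above can be illustrated with a minimal sketch. The helper name `real_arg` is hypothetical and is not claimed to be the code this commit added; it only shows the result a correct `arg` should give for real inputs, using the complex overload of `std::arg` as the reference.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical helper (not Eigen's actual code): phase angle of a real number.
// Negative reals lie on the negative real axis, so their argument is pi;
// all other inputs are mapped to 0. This simple sketch does not treat
// -0.0 or NaN specially.
double real_arg(double x) {
  const double pi = std::acos(-1.0);
  return x < 0.0 ? pi : 0.0;
}

// Reference value obtained through the complex overload, which avoids the
// real-input overload of std::arg that the commit message reports as broken.
double reference_arg(double x) {
  return std::arg(std::complex<double>(x, 0.0));
}
```

Comparing `real_arg(-2.0)` with `reference_arg(-2.0)` should give the same value, approximately 3.14159, on a conforming standard library.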
2021-08-18 18:39:18 +00:00
Nathan Luehr
7e6a1c129c
Device implementation of log for std::complex types.
2021-05-11 22:02:21 +00:00
Rohit Santhanam
39ec31c0ad
Fix for issue where numext::imag and numext::real are used before they are defined.
2021-05-10 19:48:32 +00:00
Antonio Sanchez
c0eb5f89a4
Restore ABI compatibility for conj with 3.3, fix conflict with boost.
The boost library unfortunately specializes `conj` for various types and
assumes the original two-template-parameter version. This change
restores the second parameter. This also restores ABI compatibility.
The specialization for `std::complex` is because `std::conj` is not
a device function. For custom complex scalar types, users should provide
their own `conj` implementation.
We may consider removing the unnecessary second parameter in the future - but
this will require modifying boost as well.
Fixes #2112 .
2021-05-07 18:14:00 +00:00
Antonio Sanchez
90e9a33e1c
Fix numext::arg return type.
The cxx11 path for `numext::arg` incorrectly returned the complex type
instead of the real type, leading to compile errors. Fixed this and
added tests.
Related to !477 , which uncovered the issue.
2021-05-07 16:26:57 +00:00
Antonio Sanchez
87729ea39f
Eliminate round_impl double-promotion warnings for c++03.
2021-03-25 16:52:19 +00:00
Steve Bronder
e7b8643d70
Revert "Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()""
This reverts commit 5f0b4a4010af4cbf6161a0d1a03a747addc44a5d.
2021-03-24 18:14:56 +00:00
Antonio Sanchez
14b7ebea11
Fix numext::round pre c++11 for large inputs.
This is to resolve an issue for large inputs when +0.5 can
actually lead to +1 if the input doesn't have enough precision
to resolve the addition - leading to an off-by-one error.
See discussion on 9a663973.
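The off-by-one described above can be reproduced with a small sketch. The names `naive_round` and `guarded_round` are hypothetical, and the guard shown is only one plausible way to avoid the problem, not necessarily the change made in this commit. It assumes IEEE 754 `double` with round-to-nearest-even arithmetic.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical pre-C++11 style rounding: add or subtract 0.5, then truncate
// towards the input's sign with floor/ceil.
double naive_round(double x) {
  return x > 0.0 ? std::floor(x + 0.5) : std::ceil(x - 0.5);
}

// Hypothetical guarded variant: once |x| >= 2^(digits-1) the type has no
// fractional bits left, so x is already an integer and is returned unchanged.
double guarded_round(double x) {
  const double limit =
      std::ldexp(1.0, std::numeric_limits<double>::digits - 1);  // 2^52
  if (x >= limit || x <= -limit) return x;
  return naive_round(x);
}
```

For `x = 4503599627370497.0` (that is, 2^52 + 1), the sum `x + 0.5` is not representable and rounds up to the next even integer, so `naive_round` returns `x + 1`, while `guarded_round` returns `x` unchanged.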
2021-03-15 19:08:04 +00:00
David Tellenbach
5f0b4a4010
Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()"
This reverts commit 6cbb3038ac48cb5fe17eba4dfbf26e3e798041f1 because it
breaks clang-10 builds on x86 and aarch64 when C++11 is enabled.
2021-03-05 13:16:43 +01:00
Steve Bronder
6cbb3038ac
Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()
2021-03-04 18:58:08 +00:00
Antonio Sanchez
f0e46ed5d4
Fix pow and other cwise ops for half/bfloat16.
The new `generic_pow` implementation was failing for half/bfloat16 since
their construction from int/float is not `constexpr`. Modified
in `GenericPacketMathFunctions` to remove `constexpr`.
While adding tests for half/bfloat16, found other issues related to
implicit conversions.
Also needed to implement `numext::arg` for non-integer, non-complex,
non-float/double/long double types. These seem to be implicitly
converted to `std::complex<T>`, which then fails for half/bfloat16.
2021-01-22 11:10:54 -08:00
Antonio Sanchez
bde6741641
Improved std::complex sqrt and rsqrt.
Replaces `std::sqrt` with `complex_sqrt` for all platforms (previously
`complex_sqrt` was only used for CUDA and MSVC), and implements
custom `complex_rsqrt`.
Also introduces `numext::rsqrt` to simplify implementation, and modified
`numext::hypot` to adhere to IEC 60559 for special cases.
The `complex_sqrt` and `complex_rsqrt` implementations were found to be
significantly faster than `std::sqrt<std::complex<T>>` and
`1/numext::sqrt<std::complex<T>>`.
Benchmark file attached.
```
GCC 10, Intel Xeon, x86_64:
-------------------------------------------------------------------
Benchmark                              Time         CPU  Iterations
-------------------------------------------------------------------
BM_Sqrt<std::complex<float>>        9.21 ns     9.21 ns    73225448
BM_StdSqrt<std::complex<float>>     17.1 ns     17.1 ns    40966545
BM_Sqrt<std::complex<double>>       8.53 ns     8.53 ns    81111062
BM_StdSqrt<std::complex<double>>    21.5 ns     21.5 ns    32757248
BM_Rsqrt<std::complex<float>>       10.3 ns     10.3 ns    68047474
BM_DivSqrt<std::complex<float>>     16.3 ns     16.3 ns    42770127
BM_Rsqrt<std::complex<double>>      11.3 ns     11.3 ns    61322028
BM_DivSqrt<std::complex<double>>    16.5 ns     16.5 ns    42200711

Clang 11, Intel Xeon, x86_64:
-------------------------------------------------------------------
Benchmark                              Time         CPU  Iterations
-------------------------------------------------------------------
BM_Sqrt<std::complex<float>>        7.46 ns     7.45 ns    90742042
BM_StdSqrt<std::complex<float>>     16.6 ns     16.6 ns    42369878
BM_Sqrt<std::complex<double>>       8.49 ns     8.49 ns    81629030
BM_StdSqrt<std::complex<double>>    21.8 ns     21.7 ns    31809588
BM_Rsqrt<std::complex<float>>       8.39 ns     8.39 ns    82933666
BM_DivSqrt<std::complex<float>>     14.4 ns     14.4 ns    48638676
BM_Rsqrt<std::complex<double>>      9.83 ns     9.82 ns    70068956
BM_DivSqrt<std::complex<double>>    15.7 ns     15.7 ns    44487798

Clang 9, Pixel 2, aarch64:
-------------------------------------------------------------------
Benchmark                              Time         CPU  Iterations
-------------------------------------------------------------------
BM_Sqrt<std::complex<float>>        24.2 ns     24.1 ns    28616031
BM_StdSqrt<std::complex<float>>      104 ns      103 ns     6826926
BM_Sqrt<std::complex<double>>       31.8 ns     31.8 ns    22157591
BM_StdSqrt<std::complex<double>>     128 ns      128 ns     5437375
BM_Rsqrt<std::complex<float>>       31.9 ns     31.8 ns    22384383
BM_DivSqrt<std::complex<float>>     99.2 ns     98.9 ns     7250438
BM_Rsqrt<std::complex<double>>      46.0 ns     45.8 ns    15338689
BM_DivSqrt<std::complex<double>>     119 ns      119 ns     5898944
```
2021-01-17 08:50:57 -08:00