202 Commits

Author SHA1 Message Date
Deven Desai
876f392c39 Updates corresponding to the latest round of PR feedback
The major changes are

1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h
3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
    The above three changes effectively enable the Eigen "Packet" layer for the HIP platform

4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
5. Updating the "EIGEN_DEVICE_FUNC" marking in some places

The change has been tested on the HIP and CUDA platforms.
2018-07-11 10:39:54 -04:00
Deven Desai
38807a2575 merging updates from upstream 2018-07-11 09:17:33 -04:00
Deven Desai
b6cc0961b1 updates based on PR feedback
There are two major changes (and a few minor ones that are not listed here; see the PR discussion for details)

1. Eigen::half implementations for HIP and CUDA have been merged.
This means that
- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`

After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install.

2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added, and the code has been updated to use them where appropriate (see the sketch after this entry).
- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
- `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 || EIGEN_HAS_HIP_FP16)`
2018-06-14 10:21:54 -04:00
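A minimal sketch of how these umbrella macros could be wired up, assuming only the equivalences stated above (the exact guards in Eigen's headers may differ):

```cpp
// Hedged sketch: unified GPU macros layered over the CUDA/HIP ones.
// EIGEN_CUDACC, EIGEN_HIPCC, etc. are assumed to be set up elsewhere
// by the compiler-detection logic.
#if defined(EIGEN_CUDACC) || defined(EIGEN_HIPCC)
  #define EIGEN_GPUCC  // compiling with a GPU-capable compiler
#endif

#if defined(EIGEN_CUDA_ARCH) || defined(EIGEN_HIP_DEVICE_COMPILE)
  #define EIGEN_GPU_COMPILE_PHASE  // currently in the device compilation pass
#endif

#if defined(EIGEN_HAS_CUDA_FP16) || defined(EIGEN_HAS_HIP_FP16)
  #define EIGEN_HAS_GPU_FP16  // half-precision support is available
#endif
```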
Deven Desai
d1d22ef0f4 syncing this fork with upstream 2018-06-13 12:09:52 -04:00
Andrea Bocci
f7124b3e46 Extend CUDA support to matrix inversion and selfadjointeigensolver 2018-06-11 18:33:24 +02:00
Deven Desai
8fbd47052b Adding support for using Eigen in HIP kernels.
This commit enables the use of Eigen in HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVIDIA GPUs.

Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers (e.g. Eigen/src/Core/arch/CUDA/Half.h) get picked up by default when compiling Eigen, irrespective of whether or not the underlying compiler is CUDACC/NVCC. In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor). A usage sketch follows this entry.

Use the `-DEIGEN_TEST_HIP` CMake option to enable the HIP-specific unit tests.
2018-06-06 10:12:58 -04:00
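A minimal usage sketch of the opt-in described above (the kernel and its names are hypothetical, not taken from the commit):

```cpp
// EIGEN_USE_HIP must be defined before the first Eigen include so that
// Eigen/Core selects the HIP variants of the GPU headers.
#define EIGEN_USE_HIP
#include <hip/hip_runtime.h>
#include <Eigen/Core>

// Hypothetical HIP kernel: Eigen expressions compile in device code.
__global__ void add4(const float* a, const float* b, float* c) {
  Eigen::Map<const Eigen::Vector4f> va(a), vb(b);
  Eigen::Map<Eigen::Vector4f> vc(c);
  vc = va + vb;
}
```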
Christoph Hertzberg
e5f9f4768f Avoid unnecessary C++11 dependency 2018-06-07 15:03:50 +02:00
nicolov
39c2cba810 Add a specialization of Eigen::numext::conj for std::complex<T> to be used when compiling a CUDA kernel. This fixes the compilation of TensorFlow 1.4 with clang 6.0 used as the CUDA compiler with libc++.
This follows the previous change in 2a69290ddb, which mentions OSX (presumably because it uses libc++ too).
2018-04-13 22:29:10 +00:00
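A simplified sketch of such a specialization (renamed here so as not to pass it off as the verbatim patch):

```cpp
// Computing the conjugate from the real/imaginary parts keeps the code
// device-callable and avoids the host-only std::conj overload in libc++.
#include <Eigen/Core>
#include <complex>

template <typename T>
EIGEN_DEVICE_FUNC std::complex<T> conj_sketch(const std::complex<T>& z) {
  return std::complex<T>(z.real(), -z.imag());
}
```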
Gael Guennebaud
e43ca0320d bug #1520: workaround some -Wfloat-equal warnings by calling std::equal_to 2018-04-11 15:24:13 +02:00
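The workaround pattern, sketched with a hypothetical helper name:

```cpp
// Routing an intentional exact comparison through std::equal_to keeps
// the semantics of == while moving the comparison into a system header,
// which -Wfloat-equal does not warn about.
#include <functional>

template <typename Scalar>
bool exact_eq(const Scalar& a, const Scalar& b) {
  return std::equal_to<Scalar>()(a, b);
}
```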
Gael Guennebaud
e116f6847e bug #1521: avoid signalling NaN in hypot and make it std::complex<> friendly. 2018-04-04 13:47:23 +02:00
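The usual overflow-safe scheme behind such a fix, sketched for illustration (not the exact Eigen kernel):

```cpp
// Scale by the larger magnitude so neither square can overflow or
// produce a spurious NaN; the ratio r is always in [0, 1].
#include <algorithm>
#include <cmath>

double hypot_sketch(double x, double y) {
  double ax = std::abs(x), ay = std::abs(y);
  double big = std::max(ax, ay);
  double sml = std::min(ax, ay);
  if (big == 0.0) return 0.0;  // both inputs are zero
  double r = sml / big;
  return big * std::sqrt(1.0 + r * r);
}
```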
luz.paz
e3912f5e63 MIsc. source and comment typos
Found using `codespell` and `grep` from downstream FreeCAD
2018-03-11 10:01:44 -04:00
Yan Facai (颜发才)
42a8334668 ENH: exp supports complex type for CUDA 2018-01-04 16:01:01 +08:00
Gael Guennebaud
cda47c42c2 Fix compilation in c++98 mode. 2017-07-17 21:08:20 +02:00
Gael Guennebaud
bbd97b4095 Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases 2017-07-17 01:02:51 +02:00
Benoit Steiner
c5a241ab9b Merged in benoitsteiner/opencl (pull request PR-323)
Improved support for OpenCL
2017-07-07 16:27:33 +00:00
Benoit Steiner
c92faf9d84 Merged in mehdi_goli/upstr_benoit/HiperbolicOP (pull request PR-13)
Adding hyperbolic operations for SYCL.

* Adding hyperbolic operations.

* Adding the hyperbolic operations for CPU as well.
2017-07-06 05:05:57 +00:00
Gael Guennebaud
561f777075 Fix a gcc7 warning about bool * bool in abs2 default implementation. 2017-06-27 12:05:17 +02:00
Gael Guennebaud
498aa95a8b bug #1424: add numext::abs specialization for unsigned integer types. 2017-06-09 11:53:49 +02:00
Ilya Biryukov
1c03d43a5c Fixed compilation with cuda-clang 2017-03-06 12:01:12 +01:00
Srinivas Vasudevan
e6c8b5500c Change comparisons to use Scalar instead of RealScalar. 2016-12-05 14:01:45 -08:00
Srinivas Vasudevan
218764ee1f Added support for expm1 in Eigen. 2016-12-02 14:13:01 -08:00
Mehdi Goli
79aa2b784e Adding SYCL backend for TensorPadding.h; disabling __uint128 for SYCL in TensorIntDiv.h; disabling cache size for SYCL in TensorDeviceDefault.h; adding SYCL backend for StrideSliceOP; removing SYCL compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the SYCL backend code. 2016-12-01 13:02:27 +00:00
Luke Iwanski
5159675c33 Added isnan, isfinite and isinf for the SYCL device, plus a test for them. 2016-11-18 16:01:48 +00:00
Luke Iwanski
c5130dedbe Specialised basic math functions for SYCL device. 2016-11-17 11:47:13 +00:00
Benoit Steiner
2a69290ddb Added a specialization of Eigen::numext::real and Eigen::numext::imag for std::complex<T> to be used when compiling a cuda kernel. This is unfortunately necessary to be able to process complex numbers from a CUDA kernel on MacOS. 2016-09-22 15:52:23 -07:00
Benoit Steiner
50e3bbfc90 Calls x.imag() instead of imag(x) when x is a complex number, since the former is constexpr while the latter isn't. This fixes compilation errors triggered by nvcc on Mac.
2016-09-22 13:17:25 -07:00
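The distinction, illustrated with a hypothetical snippet:

```cpp
// Member accessors real()/imag() are constexpr in libc++, while the
// free functions std::real/std::imag are not; nvcc on Mac rejects the
// latter in contexts that require constant expressions.
#include <complex>

float imag_member(const std::complex<float>& x) {
  return x.imag();      // constexpr member: safe under nvcc
}

float imag_free(const std::complex<float>& x) {
  return std::imag(x);  // non-constexpr free function: may fail under nvcc on Mac
}
```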
Benoit Steiner
c0d56a543e Added several missing EIGEN_DEVICE_FUNC qualifiers 2016-09-14 14:06:21 -07:00
Benoit Steiner
5f50f12d2c Added the ability to compute the absolute value of a complex number on GPU, as well as a test to catch the problem. 2016-09-12 13:46:13 -07:00
Gael Guennebaud
68d1897e8a Make sure that our log1p implementation is called as a last resort only. 2016-08-26 15:30:55 +02:00
Gael Guennebaud
fe60856fed Add overload of numext::log1p for float/double in CUDA 2016-08-26 15:28:59 +02:00
Gael Guennebaud
a4c266f827 Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only. 2016-08-23 14:23:08 +02:00
Gael Guennebaud
82147cefff Fix possible overflow and bias in integer random generator 2016-08-23 13:25:31 +02:00
Gael Guennebaud
d476cadbb8 bug #1247: fix regression in compilation of pow(integer,integer), and add respective unit tests. 2016-06-25 10:12:06 +02:00
Gael Guennebaud
7c6561485a merge PR 194 2016-06-23 15:29:57 +02:00
Benoit Steiner
b055590e91 Made log1p_impl usable inside a GPU kernel 2016-06-16 11:37:40 -07:00
Gael Guennebaud
396d9cfb6e Generalize expr.pow(scalar), pow(expr,scalar) and pow(scalar,expr).
Internal: scalar_pow_op (unary) is removed, and scalar_binary_pow_op is renamed scalar_pow_op.
2016-06-14 14:10:07 +02:00
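A usage sketch of the three generalized call forms, assuming the post-change API (values are illustrative):

```cpp
#include <Eigen/Core>
#include <iostream>

int main() {
  Eigen::ArrayXf a = Eigen::ArrayXf::LinSpaced(4, 1.f, 4.f);
  std::cout << a.pow(2.f) << "\n";          // expr.pow(scalar)
  std::cout << Eigen::pow(a, 2.f) << "\n";  // pow(expr, scalar)
  std::cout << Eigen::pow(2.f, a) << "\n";  // pow(scalar, expr)
}
```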
Gael Guennebaud
5fdd703629 Enable mixing types in numext::pow 2016-06-10 15:58:04 +02:00
Benoit Steiner
ff4a289572 Cleaned up the fp16 code 2016-05-24 18:50:09 -07:00
Benoit Steiner
a5a3ba2b80 Avoid unnecessary float to double conversions 2016-05-23 17:16:09 -07:00
Christoph Hertzberg
718521d5cf Silenced several double-promotion warnings 2016-05-22 18:17:04 +02:00
Christoph Hertzberg
dacb469bc9 Enable and fix -Wdouble-conversion warnings 2016-05-05 13:35:45 +02:00
Benoit Steiner
c61170e87d fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supports: remove it from Eigen. 2016-04-27 14:22:20 -07:00
Benoit Steiner
6744d776ba Added support for fpclassify in Eigen::numext 2016-04-27 12:10:25 -07:00
Benoit Steiner
10b69810d1 Improved support for trigonometric functions on GPU 2016-04-13 16:00:51 -07:00
Benoit Steiner
bf3f6688f0 Added support for computing cos, sin, tan, and tanh on GPU. 2016-04-13 11:55:08 -07:00
Benoit Steiner
5da90fc8dd Use numext::abs instead of std::abs in scalar_fuzzy_default_impl to make it usable inside GPU kernels. 2016-04-08 19:40:48 -07:00
Benoit Steiner
01bd577288 Fixed the implementation of Eigen::numext::isfinite, Eigen::numext::isnan, and Eigen::numext::isinf on CUDA devices 2016-04-08 16:40:10 -07:00
Benoit Steiner
89a3dc35a3 Fixed isfinite_impl: NumTraits<T>::highest() and NumTraits<T>::lowest() are finite numbers. 2016-04-08 15:56:16 -07:00
Benoit Steiner
b89d3f78b2 Updated the isnan, isinf and isfinite functions to make them compatible with CUDA devices. 2016-04-07 10:08:49 -07:00
Benoit Steiner
1108b4f218 Fixed the signature of numext::abs to make it compatible with complex numbers 2016-04-04 11:09:25 -07:00