Deven Desai
b6cc0961b1
updates based on PR feedback
...
There are two major changes (and a few minor ones which are not listed here...see PR discussion for details)
1. Eigen::half implementations for HIP and CUDA have been merged.
This means that
- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`
After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install.
2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
- `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`
2018-06-14 10:21:54 -04:00
Andrea Bocci
f7124b3e46
Extend CUDA support to matrix inversion and selfadjointeigensolver
2018-06-11 18:33:24 +02:00
Deven Desai
8fbd47052b
Adding support for using Eigen in HIP kernels.
...
This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.
Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, for e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor)
Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
2018-06-06 10:12:58 -04:00
Gael Guennebaud
647b724a36
Define pcast<> for SSE types even when AVX is enabled. (otherwise float are silently reinterpreted as int instead of being converted)
2018-05-29 20:46:46 +02:00
Gael Guennebaud
40b4bf3d32
AVX512: _mm512_rsqrt28_ps is available for AVX512ER only
2018-04-03 14:36:27 +02:00
luz.paz
e3912f5e63
MIsc. source and comment typos
...
Found using `codespell` and `grep` from downstream FreeCAD
2018-03-11 10:01:44 -04:00
nluehr
f9bdcea022
For cuda 9.1 replace math_functions.hpp with cuda_runtime.h
2017-12-18 16:51:15 -08:00
Benoit Steiner
a4089991eb
Added support for CUDA 9.0.
2017-08-31 02:49:39 +00:00
Gael Guennebaud
21633e585b
bug #1462 : remove all occurences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER
2017-08-24 11:06:47 +02:00
Gael Guennebaud
b0f55ef85a
merge
2017-02-21 17:04:10 +01:00
Gael Guennebaud
3d200257d7
Add support for automatic-size deduction in reshaped, e.g.:
...
mat.reshaped(4,AutoSize); <-> mat.reshaped(4,mat.size()/4);
2017-02-21 15:57:25 +01:00
Gael Guennebaud
e7ebe52bfb
bug #1391 : include IO.h before DenseBase to enable its usage in DenseBase plugins.
2017-02-13 09:46:20 +01:00
Gael Guennebaud
24409f3acd
Use fix<> API to specify compile-time reshaped sizes.
2017-01-29 15:20:35 +01:00
Gael Guennebaud
9036cda364
Cleanup intitial reshape implementation:
...
- reshape -> reshaped
- make it compatible with evaluators.
2017-01-29 14:57:45 +01:00
Gael Guennebaud
0e89baa5d8
import yoco xiao's work on reshape
2017-01-29 14:29:31 +01:00
Gael Guennebaud
25a1703579
Merged in ggael/eigen-flexidexing (pull request PR-294)
...
generalized operator() for indexed access and slicing
2017-01-26 08:04:23 +00:00
Gael Guennebaud
b0db4eff36
bug #1382 : move using std::size_t/ptrdiff_t to Eigen's namespace (still better than the global namespace!)
2017-01-23 22:03:57 +01:00
Gael Guennebaud
7691723e34
Add support for fixed-value in symbolic expression, c++11 only for now.
2017-01-19 19:25:29 +01:00
Benoit Steiner
924600a0e8
Made sure that enabling avx2 instructions enables avx and sse instructions as well.
2017-01-19 09:54:48 -08:00
Gael Guennebaud
4989922be2
Add support for symbolic expressions as arguments of operator()
2017-01-16 22:21:23 +01:00
Gael Guennebaud
752bd92ba5
Large code refactoring:
...
- generalize some utilities and move them to Meta (size(), array_size())
- move handling of all and single indices to IndexedViewHelper.h
- several cleanup changes
2017-01-11 17:24:02 +01:00
Gael Guennebaud
b1dc0fa813
Move fix and symbolic to their own file, and improve doxygen compatibility
2017-01-11 14:28:28 +01:00
Gael Guennebaud
ac7e4ac9c0
Initial commit to add a generic indexed-based view of matrices.
...
This version already works as a read-only expression.
Numerous refactoring, renaming, extension, tuning passes are expected...
2017-01-06 00:01:44 +01:00
Gael Guennebaud
3182bdbae6
Disable vectorization when compiled by nvcc, even is EIGEN_NO_CUDA is defined
2017-07-17 11:01:28 +02:00
Gael Guennebaud
bbd97b4095
Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases
2017-07-17 01:02:51 +02:00
Gael Guennebaud
b240080e64
bug #1436 : fix compilation of Jacobi rotations with ARM NEON, some specializations of internal::conj_helper were missing.
2017-06-15 10:16:30 +02:00
Benoit Steiner
fb1d0138ec
Include SSE packet instructions when compiling with avx512 enabled.
2016-12-19 07:32:48 -08:00
Benoit Steiner
81151bd474
Fixed merge conflicts
2016-11-19 19:12:59 -08:00
Benoit Steiner
7c30078b9f
Merged eigen/eigen into default
2016-11-17 22:53:37 -08:00
Luke Iwanski
c5130dedbe
Specialised basic math functions for SYCL device.
2016-11-17 11:47:13 +00:00
Benoit Steiner
f2e8b73256
Enable the use of AVX512 instruction by default
2016-11-16 21:28:04 -08:00
Benoit Steiner
d46a36cc84
Merged eigen/eigen into default
2016-11-04 18:22:55 -07:00
Mehdi Goli
0ebe3808ca
Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size;
2016-11-04 18:18:19 +00:00
Benoit Steiner
5c3995769c
Improved AVX512 configuration
2016-11-03 04:50:28 -07:00
Benoit Steiner
ca0ba0d9a4
Improved AVX512 support
2016-11-03 04:00:49 -07:00
Benoit Steiner
c80587c92b
Merged eigen/eigen into default
2016-11-03 03:55:11 -07:00
Benoit Steiner
0585b2965d
Disable vectorization on device only when compiling for sycl
2016-11-02 11:44:27 -07:00
Benoit Steiner
cf20b30d65
Merge latest updates from trunk
2016-10-20 09:42:05 -07:00
Benoit Steiner
d3943cd50c
Fixed a few typos in the ternary tensor expressions types
2016-10-19 12:56:12 -07:00
Mehdi Goli
8fb162fc85
Fixing the typo regarding missing #if needed for proper handling of exceptions in Eigen/Core.
2016-10-16 12:52:34 +01:00
Mehdi Goli
15380f9a87
Applyiing Benoit's comment to return the missing line back in Eigen/Core
2016-10-14 16:39:41 +01:00
Mehdi Goli
524fa4c46f
Reducing the code by generalising sycl backend functions/structs.
2016-10-14 12:09:55 +01:00
Benoit Steiner
9f3276981c
Enabling AVX512 should also enable AVX2.
2016-10-06 10:29:48 -07:00
Benoit Steiner
78b569f685
Merged latest updates from trunk
2016-10-05 18:48:55 -07:00
Benoit Steiner
ae1385c7e4
Pull the latest updates from trunk
2016-10-05 14:54:36 -07:00
Benoit Steiner
409e887d78
Added support for constand std::complex numbers on GPU
2016-10-03 11:06:24 -07:00
RJ Ryan
b2c6dc48d9
Add CUDA-specific std::complex<T> specializations for scalar_sum_op, scalar_difference_op, scalar_product_op, and scalar_quotient_op.
2016-09-20 07:18:20 -07:00
Luke Iwanski
b91e021172
Merged with default.
2016-09-19 14:03:54 +01:00
Luke Iwanski
cb81975714
Partial OpenCL support via SYCL compatible with ComputeCpp CE.
2016-09-19 12:44:13 +01:00
Gael Guennebaud
a4c266f827
Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only.
2016-08-23 14:23:08 +02:00