3692 Commits

Author SHA1 Message Date
Gael Guennebaud
2ca2592009 Fix determination of EIGEN_HAS_TYPE_TRAITS 2018-07-19 18:47:18 +02:00
Alexey Frunze
1f523e7304 Add MIPS changes missing from previous merge. 2018-07-18 12:27:50 -07:00
Eugene Zhulenev
086ded5c85 Disable type traits for GCC < 5.1.0 2018-07-18 16:32:55 -07:00
Gael Guennebaud
863580fe88 bug #1432: fix conservativeResize for non-relocatable scalar types. For those we need to by-pass realloc routines and fall-back to allocate as new - copy - delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so currently let's be super conservative using either RequireInitialization or std::is_trivially_copyable 2018-07-18 23:33:07 +02:00
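The fallback described in this entry can be sketched in standalone C++. This is a minimal illustration, not Eigen's implementation: the names `allocate_array`, `free_array`, `resize_impl`, `conservative_resize` and the `SelfRef` test type are all hypothetical, and the sketch uses only `std::is_trivially_copyable` as the relocatability test (the commit also mentions RequireInitialization). It assumes a C++11 standard library that provides that trait, and non-zero sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>

// Hypothetical helpers for illustration only; these are not Eigen's internals.

// Allocate n (> 0) default-constructed elements with malloc.
template <typename T>
T* allocate_array(std::size_t n) {
  assert(n > 0);
  T* p = static_cast<T*>(std::malloc(n * sizeof(T)));
  if (!p) throw std::bad_alloc();
  for (std::size_t i = 0; i < n; ++i) new (p + i) T();
  return p;
}

// Destroy n elements and release the storage.
template <typename T>
void free_array(T* p, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) p[i].~T();
  std::free(p);
}

// Fast path: the type is trivially copyable, so realloc may move raw bytes.
template <typename T>
T* resize_impl(T* old_ptr, std::size_t old_n, std::size_t new_n, std::true_type) {
  T* p = static_cast<T*>(std::realloc(old_ptr, new_n * sizeof(T)));
  if (!p) throw std::bad_alloc();
  for (std::size_t i = old_n; i < new_n; ++i) new (p + i) T();
  return p;
}

// Conservative path: allocate as new, copy-construct, then delete the old buffer.
template <typename T>
T* resize_impl(T* old_ptr, std::size_t old_n, std::size_t new_n, std::false_type) {
  T* p = static_cast<T*>(std::malloc(new_n * sizeof(T)));
  if (!p) throw std::bad_alloc();
  const std::size_t keep = old_n < new_n ? old_n : new_n;
  for (std::size_t i = 0; i < keep; ++i) new (p + i) T(old_ptr[i]);
  for (std::size_t i = keep; i < new_n; ++i) new (p + i) T();
  free_array(old_ptr, old_n);
  return p;
}

// Resize while keeping the first min(old_n, new_n) elements.
template <typename T>
T* conservative_resize(T* old_ptr, std::size_t old_n, std::size_t new_n) {
  assert(new_n > 0);
  return resize_impl(old_ptr, old_n, new_n, std::is_trivially_copyable<T>());
}

// A non-relocatable type: it stores a pointer to its own member,
// so moving its bytes with realloc would leave `self` dangling.
struct SelfRef {
  int value;
  int* self;
  SelfRef() : value(0), self(&value) {}
  SelfRef(const SelfRef& o) : value(o.value), self(&value) {}
  SelfRef& operator=(const SelfRef& o) { value = o.value; return *this; }
};
```

With `SelfRef`, the copy constructor runs for every kept element, so each `self` pointer is rebuilt for the new storage. For `int`, the realloc path is taken.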
Deven Desai
f124f07965 applying EIGEN_DECLARE_TEST to *gpu* tests
Also, a few minor fixes for GPU tests running in HIP mode.

1. Adding an include for hip/hip_runtime.h in the Macros.h file
   For HIP __host__ and __device__ are macros which are defined in hip headers.
   Their definitions need to be included before their use in the file.

2. Fixing the compile failure in TensorContractionGpu introduced by the commit to
   "Fuse computations into the Tensor contractions using output kernel"

3. Fixing a HIP/clang specific compile error by making the struct-member assignment explicit
2018-07-17 14:16:48 -04:00
Gael Guennebaud
40797dbea3 bug #1572: use c++11 atomic instead of volatile if c++11 is available, and disable multi-threaded GEMM on non-x86 without c++11. 2018-07-17 00:11:20 +02:00
Rasmus Munk Larsen
4a3952fd55 Relax the condition to not only work on Android. 2018-07-13 11:24:07 -07:00
Rasmus Munk Larsen
02a9443db9 Clang produces incorrect Thumb2 assembler when using alloca.
Don't define EIGEN_ALLOCA when generating Thumb with clang.
2018-07-13 11:03:04 -07:00
Gael Guennebaud
20991c3203 bug #1571: fix is_convertible<from,to> with "from" a reference. 2018-07-13 17:47:28 +02:00
Gael Guennebaud
86d9c0255c Forward declaring std::array does not work with all std libs, so let's just include <array> 2018-07-13 13:06:44 +02:00
Alexey Frunze
3875fb05aa Add support for MIPS SIMD (MSA) 2018-07-06 16:04:30 -07:00
Gael Guennebaud
98728312c8 Fix compilation regarding std::array 2018-07-12 17:00:37 +02:00
Gael Guennebaud
eb3d8f68bb fix unused warning 2018-07-12 16:59:47 +02:00
Gael Guennebaud
006e18e52b Clean up the mess in Eigen/Core by moving CUDA/HIP stuff to more appropriate places (Macros.h),
and alignment/vectorization logic is now in util/ConfigureVectorization.h
2018-07-12 16:57:41 +02:00
Gael Guennebaud
8bdb214fd0 remove double ;; 2018-07-12 11:17:53 +02:00
Gael Guennebaud
a9060378d3 bug #1570: fix warning 2018-07-12 11:07:09 +02:00
Gael Guennebaud
da0c604078 Merged in deven-amd/eigen (pull request PR-402)
Adding support for using Eigen in HIP kernels.
2018-07-12 08:07:16 +00:00
Gael Guennebaud
a4ea611ca7 Remove useless specialization thanks to is_convertible being more robust. 2018-07-12 09:59:44 +02:00
Gael Guennebaud
8ef267ccbd spellcheck 2018-07-12 09:58:29 +02:00
Gael Guennebaud
21cf4a1a8b Make is_convertible more robust and conformant to std::is_convertible 2018-07-12 09:57:19 +02:00
Gael Guennebaud
d193cc87f4 Fix regression in 9357838f94d2907996adadc7e5200376f3561ed4 2018-07-11 17:09:23 +02:00
Gael Guennebaud
fb33687736 Fix double ;; 2018-07-11 17:08:30 +02:00
Deven Desai
876f392c39 Updates corresponding to the latest round of PR feedback
The major changes are

1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h
3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
    The above three changes effectively enable the Eigen "Packet" layer for the HIP platform

4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
5. Updating the "EIGEN_DEVICE_FUNC" marking in some places

The change has been tested on the HIP and CUDA platforms.
2018-07-11 10:39:54 -04:00
Deven Desai
471cfe5ff7 renaming CUDA* to GPU* for some header files 2018-07-11 09:22:04 -04:00
Deven Desai
38807a2575 merging updates from upstream 2018-07-11 09:17:33 -04:00
Gael Guennebaud
1625476091 Add internal::is_identity compile-time helper 2018-07-11 14:00:24 +02:00
Gael Guennebaud
fe723d6129 Fix conversion warning 2018-07-10 09:10:32 +02:00
Gael Guennebaud
9357838f94 bug #1543: improve linear indexing for general block expressions 2018-07-10 09:10:15 +02:00
Gael Guennebaud
de9e31a06d Introduce the macro ei_declare_local_nested_eval to help allocating on the stack local temporaries via alloca, and let outer-products make good use of it.
If successful, we should use it everywhere nested_eval is used to declare local dense temporaries.
2018-07-09 15:41:14 +02:00
Gael Guennebaud
ec323b7e66 Skip null numerators in triangular-vector-solve (as in BLAS TRSV). 2018-07-09 11:13:19 +02:00
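A column-oriented forward substitution shows what skipping a null numerator means. This is a sketch under stated assumptions, not Eigen's kernel: the function name `lower_solve_in_place` and the dense row-major `std::vector<double>` storage are hypothetical. The zero test mirrors the one in reference BLAS TRSV: when the current right-hand-side entry is exactly zero, the solution entry is zero, so both the division and the column update are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: solve L * x = b in place, where x holds b on entry.
// L is a dense n x n lower-triangular matrix stored row-major.
void lower_solve_in_place(const std::vector<double>& L, std::vector<double>& x) {
  const std::size_t n = x.size();
  assert(L.size() == n * n);
  for (std::size_t j = 0; j < n; ++j) {
    // Null numerator: x[j] stays 0 and contributes nothing to later rows,
    // so skip the division (even if L(j,j) is 0) and the column update.
    if (x[j] == 0.0) continue;
    x[j] /= L[j * n + j];
    for (std::size_t i = j + 1; i < n; ++i) {
      x[i] -= L[i * n + j] * x[j];
    }
  }
}
```

Besides saving work on sparse right-hand sides, the skip avoids a 0/0 division when a zero numerator meets a zero diagonal entry.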
Gael Guennebaud
359dd77ec3 Fix legitimate "declaration shadows a typedef" warning 2018-07-09 11:03:39 +02:00
Mark D Ryan
90a53ca6fd Fix the Packet16h version of ptranspose
The AVX512 version of ptranspose for PacketBlock<Packet16h,16> was
reordering the PacketBlock argument incorrectly.  This led to errors in
the multiplication of matrices composed of 16 bit floats on AVX512
machines, if at least one of the matrices was using RowMajor order.  This
error is responsible for one tensorflow unit test failure on AVX512
machines:

//tensorflow/python/kernel_tests:batch_matmul_op_test
2018-06-16 15:13:06 -07:00
Gael Guennebaud
1f54164eca Fix a few issues with Packet16h 2018-07-07 00:15:07 +02:00
Gael Guennebaud
f2dc048df9 complete implementation of Packet16h (AVX512) 2018-07-06 17:43:11 +02:00
Gael Guennebaud
f4d623ffa7 Complete Packet8h implementation and test it in packetmath unit test 2018-07-06 17:13:36 +02:00
Deven Desai
b6cc0961b1 updates based on PR feedback
There are two major changes (and a few minor ones which are not listed here...see PR discussion for details)

1. Eigen::half implementations for HIP and CUDA have been merged.
This means that
- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`

After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install.

2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
- `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 || EIGEN_HAS_HIP_FP16)`
2018-06-14 10:21:54 -04:00
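The three macro equivalences listed in the entry above could be expressed with the preprocessor roughly as follows. This is a sketch that assumes each macro is used as a defined/undefined flag. The actual definitions in Eigen's headers may differ in detail, and `built_with_gpu_compiler` is a hypothetical helper added only to make the effect observable.

```cpp
// Sketch only: mirrors the equivalences stated in the commit message.
#if defined(EIGEN_CUDACC) || defined(EIGEN_HIPCC)
#define EIGEN_GPUCC
#endif

#if defined(EIGEN_CUDA_ARCH) || defined(EIGEN_HIP_DEVICE_COMPILE)
#define EIGEN_GPU_COMPILE_PHASE
#endif

#if defined(EIGEN_HAS_CUDA_FP16) || defined(EIGEN_HAS_HIP_FP16)
#define EIGEN_HAS_GPU_FP16
#endif

// Hypothetical helper (not part of Eigen): reports whether the translation
// unit was built with either GPU compiler.
inline bool built_with_gpu_compiler() {
#if defined(EIGEN_GPUCC)
  return true;
#else
  return false;
#endif
}
```

Code that previously tested the CUDA and HIP macros separately can then test the single unified macro.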
Deven Desai
ba972fb6b4 moving Half headers from CUDA dir to GPU dir, removing the HIP versions 2018-06-13 12:26:18 -04:00
Deven Desai
d1d22ef0f4 syncing this fork with upstream 2018-06-13 12:09:52 -04:00
Benoit Steiner
d3a380af4d Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403)
Derivative of the incomplete Gamma function and the sample of a Gamma random variable

Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-11 17:57:47 +00:00
Andrea Bocci
f7124b3e46 Extend CUDA support to matrix inversion and selfadjointeigensolver 2018-06-11 18:33:24 +02:00
Gael Guennebaud
0537123953 bug #1565: help MSVC to generate not too bad ASM in reductions. 2018-07-05 09:21:26 +02:00
Gael Guennebaud
3ae2083e23 Make is_same_dense compatible with different scalar types. 2018-07-03 13:21:43 +02:00
Gael Guennebaud
d625564936 Simplify redux_evaluator using inheritance, and properly rename parameters in reducers. 2018-07-02 11:50:41 +02:00
Gael Guennebaud
d428a199ab bug #1562: optimize evaluation of small products of the form s*A*B by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x... 2018-07-02 11:41:09 +02:00
Gael Guennebaud
9a81de1d35 Fix order of EIGEN_DEVICE_FUNC and returned type 2018-06-28 00:20:59 +02:00
Gael Guennebaud
ee5864f72e bug #1560 fix product with a 1x1 diagonal matrix 2018-06-25 10:30:12 +02:00
Rasmus Munk Larsen
bda71ad394 Fix typo in pbend for AltiVec. 2018-06-22 15:04:35 -07:00
Gael Guennebaud
d6813fb1c5 bug #1531: expose NumDimensions for solve and sparse expressions. 2018-06-08 16:55:10 +02:00
Gael Guennebaud
89d65bb9d6 bug #1531: expose NumDimensions for compatibility with Tensor 2018-06-08 16:50:17 +02:00
Benoit Steiner
522d3ca54d Don't use std::equal_to inside cuda kernels since it's not supported. 2018-06-07 13:02:07 -07:00