Gael Guennebaud
|
112c899304
|
comment unreachable code
|
2018-04-03 23:16:43 +02:00 |
|
Gael Guennebaud
|
40b4bf3d32
|
AVX512: _mm512_rsqrt28_ps is available for AVX512ER only
|
2018-04-03 14:36:27 +02:00 |
|
Gael Guennebaud
|
584951ca4d
|
Rename predux_downto4 to reflect its semantics more accurately.
|
2018-04-03 14:28:38 +02:00 |
|
Gael Guennebaud
|
7b0630315f
|
AVX512: fix psqrt and prsqrt
|
2018-04-03 14:12:50 +02:00 |
|
Gael Guennebaud
|
6719409cd9
|
AVX512: add missing pinsertfirst and pinsertlast, implement pblend for Packet8d, fix compilation without AVX512DQ
|
2018-04-03 14:11:56 +02:00 |
|
luz.paz
|
e3912f5e63
|
Misc. source and comment typos
Found using `codespell` and `grep` from downstream FreeCAD
|
2018-03-11 10:01:44 -04:00 |
|
Daniel Trebbien
|
0c57be407d
|
Move up the specialization of std::numeric_limits
This fixes a compilation error seen when building TensorFlow on macOS:
https://github.com/tensorflow/tensorflow/issues/17067
|
2018-02-18 15:35:45 -08:00 |
|
nluehr
|
aefd5fd5c4
|
Replace __float2half_rn with __float2half
The latter provides a consistent definition for CUDA 8.0 and 9.0.
|
2017-11-28 10:15:46 -08:00 |
|
nluehr
|
dd6de618c3
|
Fix incorrect integer cast in predux<half2>().
Bug corrupts results on Maxwell and earlier GPU architectures.
|
2017-11-21 10:47:00 -08:00 |
|
Christoph Hertzberg
|
11ddac57e5
|
Merged in guillaume_michel/eigen (pull request PR-334)
- Add support for NEON plog PacketMath function
|
2017-10-23 13:22:22 +00:00 |
|
Henry Schreiner
|
9bb26eb8f1
|
Restore __device__
|
2017-10-21 00:50:38 +00:00 |
|
Henry Schreiner
|
4245475d22
|
Fixing missing inlines on device functions for newer CUDA cards
|
2017-10-20 03:20:13 +00:00 |
|
Konstantinos Margaritis
|
6c3475f110
|
remove debugging
|
2017-10-12 15:34:55 -04:00 |
|
Konstantinos Margaritis
|
df7644aec3
|
Merged eigen/eigen into default
|
2017-10-12 22:23:13 +03:00 |
|
Konstantinos Margaritis
|
c4ad358565
|
explicitly set conjugate mask
|
2017-10-11 11:05:29 -04:00 |
|
Konstantinos Margaritis
|
380d41fd76
|
added some extra debugging
|
2017-10-11 10:40:12 -04:00 |
|
Konstantinos Margaritis
|
d0b7b9d0d3
|
some Packet2cf pmul fixes
|
2017-10-11 10:17:22 -04:00 |
|
Konstantinos Margaritis
|
df173f5620
|
initial pexp() for 32-bit floats, commented out due to vec_cts()
|
2017-10-11 09:40:49 -04:00 |
|
Konstantinos Margaritis
|
3dcae2a27f
|
initial pexp() for 32-bit floats, commented out due to vec_cts()
|
2017-10-11 09:40:45 -04:00 |
|
Konstantinos Margaritis
|
c2a2246489
|
fix predux_mul for z14/float
|
2017-10-10 13:38:32 -04:00 |
|
Konstantinos Margaritis
|
bc30305d29
|
complete z14 port
|
2017-10-09 16:55:10 -04:00 |
|
Gael Guennebaud
|
9c353dd145
|
Add C++11 max_digits10 for half.
|
2017-09-06 10:22:47 +02:00 |
|
Benoit Steiner
|
a4089991eb
|
Added support for CUDA 9.0.
|
2017-08-31 02:49:39 +00:00 |
|
Konstantinos Margaritis
|
1affe3d8df
|
Merged eigen/eigen into default
|
2017-08-24 12:24:01 +03:00 |
|
Gael Guennebaud
|
21633e585b
|
bug #1462: remove all occurrences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER
|
2017-08-24 11:06:47 +02:00 |
|
Konstantinos Margaritis
|
e1e71ca4e4
|
initial support for z14
|
2017-08-06 19:53:18 -04:00 |
|
Gael Guennebaud
|
24fe1de9b4
|
merge
|
2017-06-15 10:17:39 +02:00 |
|
Gael Guennebaud
|
b240080e64
|
bug #1436: fix compilation of Jacobi rotations with ARM NEON, some specializations of internal::conj_helper were missing.
|
2017-06-15 10:16:30 +02:00 |
|
Benoit Steiner
|
3baef62b9a
|
Added missing __device__ qualifier
|
2017-06-13 12:56:55 -07:00 |
|
Benoit Steiner
|
449936828c
|
Added missing __device__ qualifier
|
2017-06-13 12:54:57 -07:00 |
|
Gael Guennebaud
|
26f552c18d
|
fix compilation of Half in C++98 (issue introduced in previous commit)
|
2017-06-09 13:36:58 +02:00 |
|
Gael Guennebaud
|
1d59ca2458
|
Fix compilation with gcc 4.3 and ARM NEON
|
2017-06-09 13:20:52 +02:00 |
|
Gael Guennebaud
|
d588822779
|
Add missing std::numeric_limits specialization for half, and complete NumTraits<half>
|
2017-06-09 11:51:53 +02:00 |
|
Abhijit Kundu
|
9bc0a35731
|
Fixed nested angle brackets >> issue when compiling with CUDA 8
|
2017-04-27 03:09:03 -04:00 |
|
Benoit Jacob
|
61160a21d2
|
ARM prefetch fixes: Implement prefetch on ARM64. Do not clobber cc on ARM32.
|
2017-03-15 06:57:25 -04:00 |
|
Gael Guennebaud
|
e958c2baac
|
remove UTF8 symbols
|
2017-03-07 10:47:40 +01:00 |
|
Benoit Steiner
|
7b61944669
|
Made most of the packet math primitives usable within CUDA kernel when compiling with clang
|
2017-02-28 17:05:28 -08:00 |
|
Benoit Steiner
|
34d9fce93b
|
Avoid unnecessary float to double conversions.
|
2017-02-27 16:33:33 -08:00 |
|
Gael Guennebaud
|
cbbf88c4d7
|
Use int32_t instead of int in NEON code. Some platforms with 16-bit int support ARM NEON.
|
2017-02-17 14:39:02 +01:00 |
|
Rasmus Munk Larsen
|
5c9ed4ba0d
|
Reverse arguments for pmin in AVX.
|
2017-01-25 09:21:57 -08:00 |
|
Rasmus Munk Larsen
|
7b6aaa3440
|
Fix NaN propagation for AVX512.
|
2017-01-24 13:37:08 -08:00 |
|
Rasmus Munk Larsen
|
5e144bbaa4
|
Make NaN propagation consistent between pmax/pmin and std::max/std::min. This makes the NaN propagation consistent between the scalar and vectorized code paths of Eigen's scalar_max_op and scalar_min_op.
See #1373 for details.
|
2017-01-24 13:32:50 -08:00 |
|
Gael Guennebaud
|
ca79c1545a
|
Add std:: namespace prefix to all (hopefully) instances of size_t/ptrdiff_t
|
2017-01-23 22:02:53 +01:00 |
|
Gael Guennebaud
|
bbd97b4095
|
Add an EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases
|
2017-07-17 01:02:51 +02:00 |
|
Benoit Steiner
|
354baa0fb1
|
Avoid using horizontal adds since they're not very efficient.
|
2016-12-21 20:55:07 -08:00 |
|
Benoit Steiner
|
d7825b6707
|
Use native AVX512 types instead of Eigen Packets whenever possible.
|
2016-12-21 20:06:18 -08:00 |
|
Benoit Steiner
|
923acadfac
|
Fixed compilation errors with gcc6 when compiling the AVX512 intrinsics
|
2016-12-19 13:02:27 -08:00 |
|
Benoit Jacob
|
751e097c57
|
Use 32 registers on ARM64
|
2016-12-19 13:44:46 -05:00 |
|
Gael Guennebaud
|
8c0e701504
|
bug #1360: fix sign issue with pmull on altivec
|
2016-12-18 22:13:19 +00:00 |
|
Gael Guennebaud
|
fc94258e77
|
Fix unused warning
|
2016-12-18 22:11:48 +00:00 |
|