Benoit Steiner
|
b6a517c47d
|
Added the ability to load fp16 using the texture path.
Improved the performance of some reductions on fp16
|
2016-05-11 21:26:48 -07:00 |
|
Benoit Steiner
|
518149e868
|
Misc fixes for fp16
|
2016-05-11 20:11:14 -07:00 |
|
Benoit Steiner
|
56a1757d74
|
Made predux_min and predux_max on fp16 less noisy
|
2016-05-11 17:37:34 -07:00 |
|
Benoit Steiner
|
9091351dbe
|
__ldg is only available with cuda architectures >= 3.5
|
2016-05-11 15:22:13 -07:00 |
|
Benoit Steiner
|
02f76dae2d
|
Fixed a typo
|
2016-05-11 15:08:38 -07:00 |
|
Benoit Steiner
|
70195a5ff7
|
Added missing EIGEN_DEVICE_FUNC
|
2016-05-11 14:10:09 -07:00 |
|
Benoit Steiner
|
09a19c33a8
|
Added missing EIGEN_DEVICE_FUNC qualifiers
|
2016-05-11 14:07:43 -07:00 |
|
Benoit Steiner
|
0b9e3dcd06
|
Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This improves the performance by 10 to 30%.
|
2016-05-10 11:05:33 -07:00 |
|
Benoit Steiner
|
8adf5cc70f
|
Added support for packet processing of fp16 on kepler and maxwell gpus
|
2016-05-06 19:16:43 -07:00 |
|
Benoit Steiner
|
0451940fa4
|
Relaxed the dummy precision for fp16
|
2016-05-05 15:40:01 -07:00 |
|
Benoit Steiner
|
4c05fb03a3
|
Merged eigen/eigen into default
|
2016-05-03 13:15:00 -07:00 |
|
Benoit Steiner
|
6c3e5b85bc
|
Fixed compilation error with cuda >= 7.5
|
2016-05-03 09:38:42 -07:00 |
|
Benoit Steiner
|
da50419df8
|
Made a cast explicit
|
2016-05-02 19:50:22 -07:00 |
|
Benoit Steiner
|
2b890ae618
|
Fixed compilation errors generated by clang
|
2016-04-29 18:30:40 -07:00 |
|
Benoit Steiner
|
07a247dcf4
|
Pulled latest updates from upstream
|
2016-04-29 13:41:26 -07:00 |
|
Benoit Steiner
|
fa5a8f055a
|
Implemented palign_impl for AVX512
|
2016-04-29 13:30:13 -07:00 |
|
Benoit Steiner
|
ef3ac9d05a
|
Fixed the AVX512 packet traits
|
2016-04-29 13:28:36 -07:00 |
|
Benoit Steiner
|
d7b75e8d86
|
Added pdiv packet primitives for avx512
|
2016-04-29 13:26:47 -07:00 |
|
Benoit Steiner
|
5e89ded685
|
Implemented preduxp for AVX512
|
2016-04-29 13:00:33 -07:00 |
|
Benoit Steiner
|
5f85662ad8
|
Implemented the pabs and preverse primitives for avx512.
|
2016-04-29 12:53:34 -07:00 |
|
Benoit Steiner
|
d37ee89ca8
|
Disabled some of the AVX512 primitives on compilers that don't support them
|
2016-04-29 12:50:29 -07:00 |
|
Konstantinos Margaritis
|
87294c84a6
|
define Packet2d constants with VSX only
|
2016-04-28 14:39:56 -03:00 |
|
Konstantinos Margaritis
|
6ed7a7281c
|
remove accidentally pasted code
|
2016-04-28 14:35:55 -03:00 |
|
Konstantinos Margaritis
|
62f9093b31
|
improve state of MathFunctions as well
|
2016-04-28 14:33:09 -03:00 |
|
Konstantinos Margaritis
|
8ed26120c8
|
bring Altivec/VSX to a better state, implement some of the missing functions
|
2016-04-28 14:32:42 -03:00 |
|
Konstantinos Margaritis
|
950158f6d1
|
add name to copyrights
|
2016-04-28 14:32:11 -03:00 |
|
Konstantinos Margaritis
|
ee0459300b
|
minor fix, add to copyright
|
2016-04-28 14:31:21 -03:00 |
|
Benoit Steiner
|
c61170e87d
|
fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supports: remove it from Eigen.
|
2016-04-27 14:22:20 -07:00 |
|
Benoit Steiner
|
25141b69d4
|
Improved support for min and max on 16 bit floats when running on recent cuda gpus
|
2016-04-27 12:57:21 -07:00 |
|
Benoit Steiner
|
6744d776ba
|
Added support for fpclassify in Eigen::Numext
|
2016-04-27 12:10:25 -07:00 |
|
Konstantinos Margaritis
|
0e8fc31087
|
remove pgather/pscatter for std::complex<double> for s390x
|
2016-04-15 07:08:57 -04:00 |
|
Benoit Steiner
|
7718749fee
|
Force the inlining of the << operator on half floats
|
2016-04-14 11:51:54 -07:00 |
|
Benoit Steiner
|
5379d2b594
|
Inline the << operator on half floats
|
2016-04-14 11:40:48 -07:00 |
|
Benoit Steiner
|
5c13765ee3
|
Added ability to printf fp16
|
2016-04-14 10:24:52 -07:00 |
|
Benoit Steiner
|
36f5a10198
|
Properly gate the definition of the error and gamma functions for fp16
|
2016-04-13 18:44:48 -07:00 |
|
Benoit Steiner
|
d6105b53b8
|
Added basic implementation of the lgamma, digamma, igamma, igammac, polygamma, and zeta function for fp16
|
2016-04-13 15:26:02 -07:00 |
|
Benoit Steiner
|
87ca15c4e8
|
Added support for sin, cos, tan, and tanh on fp16
|
2016-04-13 14:12:38 -07:00 |
|
Benoit Steiner
|
473c8380ea
|
Added constructors to convert unsigned integers into fp16
|
2016-04-13 11:03:37 -07:00 |
|
Benoit Steiner
|
8bfe739cd2
|
Updated the AVX512 PacketMath to properly leverage the AVX512DQ instructions
|
2016-04-11 18:40:16 -07:00 |
|
Benoit Steiner
|
d6e596174d
|
Pull latest updates from upstream
|
2016-04-11 17:20:17 -07:00 |
|
Benoit Steiner
|
833efb39bf
|
Added epsilon, dummy_precision, infinity and quiet_NaN NumTraits for fp16
|
2016-04-11 11:03:56 -07:00 |
|
Benoit Steiner
|
995f202cea
|
Disabled the use of half2 on cuda devices of compute capability < 5.3
|
2016-04-08 14:43:36 -07:00 |
|
Benoit Steiner
|
8d22967bd9
|
Initial support for taking the power of fp16
|
2016-04-08 14:22:39 -07:00 |
|
Benoit Steiner
|
3394379319
|
Fixed the packet_traits for half floats.
|
2016-04-08 13:33:59 -07:00 |
|
Benoit Jacob
|
cd2b667ac8
|
Add references to filed LLVM bugs
|
2016-04-08 08:12:47 -04:00 |
|
Benoit Steiner
|
737644366f
|
Move the functions operating on fp16 out of the std namespace and into the Eigen::numext namespace
|
2016-04-07 11:40:15 -07:00 |
|
Benoit Steiner
|
df838736e2
|
Fixed compilation warning triggered by msvc
|
2016-04-06 20:48:55 -07:00 |
|
Benoit Steiner
|
14ea7c7ec7
|
Fixed packet_traits<half>
|
2016-04-06 19:30:21 -07:00 |
|
Benoit Steiner
|
532fdf24cb
|
Added support for hardware conversion between fp16 and full floats whenever possible.
|
2016-04-06 17:11:31 -07:00 |
|
Benoit Steiner
|
58c1dbff19
|
Made the fp16 code more portable.
|
2016-04-06 13:44:08 -07:00 |
|