Benoit Jacob
|
40a16282c7
|
Remove now-unused protate PacketMath func
|
2016-05-24 11:01:18 -04:00 |
|
Benoit Steiner
|
e617711306
|
Don't attempt to use MMX instructions with Visual Studio since they're only partially supported.
|
2016-05-24 06:43:58 -07:00 |
|
Benoit Steiner
|
334e76537f
|
Worked around missing clang intrinsic
|
2016-05-24 00:29:28 -07:00 |
|
Benoit Steiner
|
b517ab349b
|
Use the generic ploadquad intrinsic since it does the job
|
2016-05-24 00:11:17 -07:00 |
|
Benoit Steiner
|
646872cb3b
|
Worked around missing clang intrinsics
|
2016-05-24 00:07:08 -07:00 |
|
Benoit Steiner
|
3dfc391a61
|
Added missing EIGEN_DEVICE_FUNC qualifier
|
2016-05-23 20:56:59 -07:00 |
|
Benoit Steiner
|
33a94f5dc7
|
Use the Index type instead of integers to specify the strides in pgather/pscatter
|
2016-05-23 20:37:30 -07:00 |
|
Benoit Steiner
|
6bc684ab6a
|
Added missing alignment in the fp16 packet traits
|
2016-05-23 20:32:30 -07:00 |
|
Benoit Steiner
|
283e33dea4
|
ptranspose is not a template.
|
2016-05-23 19:55:55 -07:00 |
|
Benoit Steiner
|
5ba0ebe7c9
|
Avoid unnecessary float to double conversion.
|
2016-05-23 17:14:31 -07:00 |
|
Benoit Steiner
|
7d980d74e5
|
Started to vectorize the processing of 16bit floats on CPU.
|
2016-05-23 15:21:40 -07:00 |
|
Christoph Hertzberg
|
88654762da
|
Replace multiple constructors of half-type by a generic/templated constructor. This fixes an incompatibility with long double, exposed by the previous commit.
|
2016-05-23 10:03:03 +02:00 |
|
Gael Guennebaud
|
1395056fc0
|
Make EIGEN_HAS_C99_MATH user configurable
|
2016-05-20 14:58:19 +02:00 |
|
Benoit Steiner
|
fae0493f98
|
Fixed a couple of bugs related to the Pascal family of GPUs
|
2016-05-11 23:02:26 -07:00 |
|
Benoit Steiner
|
b6a517c47d
|
Added the ability to load fp16 using the texture path.
Improved the performance of some reductions on fp16
|
2016-05-11 21:26:48 -07:00 |
|
Benoit Steiner
|
518149e868
|
Misc fixes for fp16
|
2016-05-11 20:11:14 -07:00 |
|
Benoit Steiner
|
56a1757d74
|
Made predux_min and predux_max on fp16 less noisy
|
2016-05-11 17:37:34 -07:00 |
|
Benoit Steiner
|
9091351dbe
|
__ldg is only available with cuda architectures >= 3.5
|
2016-05-11 15:22:13 -07:00 |
|
Benoit Steiner
|
02f76dae2d
|
Fixed a typo
|
2016-05-11 15:08:38 -07:00 |
|
Benoit Steiner
|
70195a5ff7
|
Added missing EIGEN_DEVICE_FUNC
|
2016-05-11 14:10:09 -07:00 |
|
Benoit Steiner
|
09a19c33a8
|
Added missing EIGEN_DEVICE_FUNC qualifiers
|
2016-05-11 14:07:43 -07:00 |
|
Benoit Steiner
|
0b9e3dcd06
|
Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This improves the performance by 10 to 30%.
|
2016-05-10 11:05:33 -07:00 |
|
Benoit Steiner
|
8adf5cc70f
|
Added support for packet processing of fp16 on kepler and maxwell gpus
|
2016-05-06 19:16:43 -07:00 |
|
Benoit Steiner
|
0451940fa4
|
Relaxed the dummy precision for fp16
|
2016-05-05 15:40:01 -07:00 |
|
Benoit Steiner
|
6c3e5b85bc
|
Fixed compilation error with cuda >= 7.5
|
2016-05-03 09:38:42 -07:00 |
|
Benoit Steiner
|
da50419df8
|
Made a cast explicit
|
2016-05-02 19:50:22 -07:00 |
|
Benoit Steiner
|
2b890ae618
|
Fixed compilation errors generated by clang
|
2016-04-29 18:30:40 -07:00 |
|
Benoit Steiner
|
c61170e87d
|
fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supports: remove it from Eigen.
|
2016-04-27 14:22:20 -07:00 |
|
Benoit Steiner
|
25141b69d4
|
Improved support for min and max on 16 bit floats when running on recent cuda gpus
|
2016-04-27 12:57:21 -07:00 |
|
Benoit Steiner
|
6744d776ba
|
Added support for fpclassify in Eigen::Numext
|
2016-04-27 12:10:25 -07:00 |
|
Benoit Steiner
|
7718749fee
|
Force the inlining of the << operator on half floats
|
2016-04-14 11:51:54 -07:00 |
|
Benoit Steiner
|
5379d2b594
|
Inline the << operator on half floats
|
2016-04-14 11:40:48 -07:00 |
|
Benoit Steiner
|
5c13765ee3
|
Added ability to printf fp16
|
2016-04-14 10:24:52 -07:00 |
|
Benoit Steiner
|
36f5a10198
|
Properly gate the definition of the error and gamma functions for fp16
|
2016-04-13 18:44:48 -07:00 |
|
Benoit Steiner
|
d6105b53b8
|
Added basic implementation of the lgamma, digamma, igamma, igammac, polygamma, and zeta function for fp16
|
2016-04-13 15:26:02 -07:00 |
|
Benoit Steiner
|
87ca15c4e8
|
Added support for sin, cos, tan, and tanh on fp16
|
2016-04-13 14:12:38 -07:00 |
|
Benoit Steiner
|
473c8380ea
|
Added constructors to convert unsigned integers into fp16
|
2016-04-13 11:03:37 -07:00 |
|
Benoit Steiner
|
833efb39bf
|
Added epsilon, dummy_precision, infinity and quiet_NaN NumTraits for fp16
|
2016-04-11 11:03:56 -07:00 |
|
Benoit Steiner
|
995f202cea
|
Disabled the use of half2 on cuda devices of compute capability < 5.3
|
2016-04-08 14:43:36 -07:00 |
|
Benoit Steiner
|
8d22967bd9
|
Initial support for taking the power of fp16
|
2016-04-08 14:22:39 -07:00 |
|
Benoit Steiner
|
3394379319
|
Fixed the packet_traits for half floats.
|
2016-04-08 13:33:59 -07:00 |
|
Benoit Steiner
|
737644366f
|
Move the functions operating on fp16 out of the std namespace and into the Eigen::numext namespace
|
2016-04-07 11:40:15 -07:00 |
|
Benoit Steiner
|
df838736e2
|
Fixed compilation warning triggered by msvc
|
2016-04-06 20:48:55 -07:00 |
|
Benoit Steiner
|
14ea7c7ec7
|
Fixed packet_traits<half>
|
2016-04-06 19:30:21 -07:00 |
|
Benoit Steiner
|
532fdf24cb
|
Added support for hardware conversion between fp16 and full floats whenever possible.
|
2016-04-06 17:11:31 -07:00 |
|
Benoit Steiner
|
58c1dbff19
|
Made the fp16 code more portable.
|
2016-04-06 13:44:08 -07:00 |
|
Benoit Steiner
|
cf7e73addd
|
Added some missing conversions to the Half class, and fixed the implementation of the < operator on cuda devices.
|
2016-04-06 09:59:51 -07:00 |
|
Benoit Steiner
|
10bdd8e378
|
Merged in tillahoffmann/eigen (pull request PR-173)
Added zeta function of two arguments and polygamma function
|
2016-04-06 09:40:17 -07:00 |
|
Benoit Steiner
|
72abfa11dd
|
Added support for isfinite on fp16
|
2016-04-06 09:07:30 -07:00 |
|
tillahoffmann
|
49960adbdd
|
Merged eigen/eigen into default
|
2016-04-01 14:36:15 +01:00 |
|