Srinivas Vasudevan
|
218764ee1f
|
Added support for expm1 in Eigen.
|
2016-12-02 14:13:01 -08:00 |
|
Benoit Steiner
|
4387433acf
|
Increased the robustness of the reduction tests on fp16
|
2016-10-05 10:42:41 -07:00 |
|
Benoit Steiner
|
aad20d700d
|
Increase the tolerance to numerical noise.
|
2016-10-05 10:39:24 -07:00 |
|
Gael Guennebaud
|
dabc81751f
|
Fix compilation when cuda_fp16.h does not exist.
|
2016-09-05 17:14:20 +02:00 |
|
Gael Guennebaud
|
6cd7b9ea6b
|
Fix compilation with cuda 8
|
2016-08-29 11:06:08 +02:00 |
|
Gael Guennebaud
|
0f56b5a6de
|
enable vectorization path when testing half on cuda, and add test for log1p
|
2016-08-26 14:55:51 +02:00 |
|
Benoit Steiner
|
5eea1c7f97
|
Fixed cut and paste bug in debug message
|
2016-08-04 17:34:13 -07:00 |
|
Benoit Steiner
|
b50d8f8c4a
|
Extended a regression test to validate that basic fp16 support works with CUDA 7.0
|
2016-08-03 16:50:13 -07:00 |
|
Gael Guennebaud
|
d7a0e52478
|
Fix testing of log near 1
|
2016-07-22 15:44:26 +02:00 |
|
Gael Guennebaud
|
7acf23c14c
|
Truly split unit test.
|
2016-07-22 15:41:23 +02:00 |
|
Benoit Steiner
|
36369ab63c
|
Resolved merge conflicts
|
2016-05-26 13:39:39 -07:00 |
|
Benoit Steiner
|
28fcb5ca2a
|
Merged latest reduction improvements
|
2016-05-26 12:19:33 -07:00 |
|
Benoit Steiner
|
c1c7f06c35
|
Improved the performance of inner reductions.
|
2016-05-26 11:53:59 -07:00 |
|
Benoit Steiner
|
22d02c9855
|
Improved the coverage of the fp16 reduction tests
|
2016-05-26 11:12:16 -07:00 |
|
Benoit Steiner
|
8d06c02ffd
|
Allow vectorized padding on GPU. This helps speed things up a little.
Before:
BM_padding/10 5000000 460 217.03 MFlops/s
BM_padding/80 5000000 460 13899.40 MFlops/s
BM_padding/640 5000000 461 888421.17 MFlops/s
BM_padding/4K 5000000 460 54316322.55 MFlops/s
After:
BM_padding/10 5000000 454 220.20 MFlops/s
BM_padding/80 5000000 455 14039.86 MFlops/s
BM_padding/640 5000000 452 904968.83 MFlops/s
BM_padding/4K 5000000 411 60750049.21 MFlops/s
|
2016-05-17 09:13:27 -07:00 |
|
Benoit Steiner
|
fae0493f98
|
Fixed a couple of bugs related to the Pascal family of GPUs
|
2016-05-11 23:02:26 -07:00 |
|
Benoit Steiner
|
595e890391
|
Added more tests for half floats
|
2016-05-11 21:27:15 -07:00 |
|
Benoit Steiner
|
217d984abc
|
Fixed a typo in my previous commit
|
2016-05-11 10:22:15 -07:00 |
|
Benoit Steiner
|
691614bd2c
|
Worked around a bug in nvcc on tegra x1
|
2016-05-07 13:28:53 -07:00 |
|
Benoit Steiner
|
69a8a4e1f3
|
Added a test to validate full reduction on tensor of half floats
|
2016-05-05 16:52:50 -07:00 |
|
Benoit Steiner
|
678a17ba79
|
Made the testing of contractions on fp16 more robust
|
2016-05-05 16:36:39 -07:00 |
|
Benoit Steiner
|
e3d053e14e
|
Refined the testing of log and exp on fp16
|
2016-05-05 16:24:15 -07:00 |
|
Benoit Steiner
|
9a48688d37
|
Further improved the testing of fp16
|
2016-05-05 15:58:05 -07:00 |
|
Benoit Steiner
|
2c5568a757
|
Added a test to validate the computation of exp and log on 16bit floats
|
2016-05-03 12:06:07 -07:00 |
|
Benoit Steiner
|
4a164d2c46
|
Fixed the partial evaluation of non vectorizable tensor subexpressions
|
2016-04-25 10:43:03 -07:00 |
|
Benoit Steiner
|
7a570e50ef
|
Fixed contractions of fp16
|
2016-03-23 16:00:06 -07:00 |
|
Benoit Steiner
|
ab9b749b45
|
Improved a test
|
2016-03-14 20:03:13 -07:00 |
|
Benoit Steiner
|
e644f60907
|
Pulled latest updates from trunk
|
2016-02-21 20:24:59 +00:00 |
|
Benoit Steiner
|
95fceb6452
|
Added the ability to compute the absolute value of a half float
|
2016-02-21 20:24:11 +00:00 |
|
Benoit Steiner
|
ed69cbeef0
|
Added some debugging information to the test to figure out why it fails sometimes
|
2016-02-21 11:20:20 -08:00 |
|
Benoit Steiner
|
1e6fe6f046
|
Fixed the float16 tensor test.
|
2016-02-20 07:44:17 +00:00 |
|
Benoit Steiner
|
180156ba1a
|
Added support for tensor reductions on half floats
|
2016-02-19 10:05:59 -08:00 |
|
Benoit Steiner
|
a08d2ff0c9
|
Started to work on contractions and reductions using half floats
|
2016-02-19 15:59:59 +00:00 |
|
Benoit Steiner
|
ac5d706a94
|
Added support for simple coefficient-wise tensor expressions using half floats on CUDA devices
|
2016-02-19 08:19:12 +00:00 |
|
Benoit Steiner
|
0606a0a39b
|
FP16 support on CUDA is only available starting with CUDA 7.5. Disable it when using an older version of CUDA
|
2016-02-18 23:15:23 -08:00 |
|
Benoit Steiner
|
f36c0c2c65
|
Added regression test for float16
|
2016-02-19 06:23:28 +00:00 |
|