Gael Guennebaud
|
7acf23c14c
|
Truly split unit test.
|
2016-07-22 15:41:23 +02:00 |
|
Benoit Steiner
|
36369ab63c
|
Resolved merge conflicts
|
2016-05-26 13:39:39 -07:00 |
|
Benoit Steiner
|
28fcb5ca2a
|
Merged latest reduction improvements
|
2016-05-26 12:19:33 -07:00 |
|
Benoit Steiner
|
c1c7f06c35
|
Improved the performance of inner reductions.
|
2016-05-26 11:53:59 -07:00 |
|
Benoit Steiner
|
22d02c9855
|
Improved the coverage of the fp16 reduction tests
|
2016-05-26 11:12:16 -07:00 |
|
Benoit Steiner
|
8d06c02ffd
|
Allow vectorized padding on GPU. This helps speed things up a little.
Before:
BM_padding/10    5000000   460        217.03 MFlops/s
BM_padding/80    5000000   460      13899.40 MFlops/s
BM_padding/640   5000000   461     888421.17 MFlops/s
BM_padding/4K    5000000   460   54316322.55 MFlops/s
After:
BM_padding/10    5000000   454        220.20 MFlops/s
BM_padding/80    5000000   455      14039.86 MFlops/s
BM_padding/640   5000000   452     904968.83 MFlops/s
BM_padding/4K    5000000   411   60750049.21 MFlops/s
|
2016-05-17 09:13:27 -07:00 |
|
Benoit Steiner
|
fae0493f98
|
Fixed a couple of bugs related to the Pascal family of GPUs
|
2016-05-11 23:02:26 -07:00 |
|
Benoit Steiner
|
595e890391
|
Added more tests for half floats
|
2016-05-11 21:27:15 -07:00 |
|
Benoit Steiner
|
217d984abc
|
Fixed a typo in my previous commit
|
2016-05-11 10:22:15 -07:00 |
|
Benoit Steiner
|
691614bd2c
|
Worked around a bug in nvcc on Tegra X1
|
2016-05-07 13:28:53 -07:00 |
|
Benoit Steiner
|
69a8a4e1f3
|
Added a test to validate full reduction on tensor of half floats
|
2016-05-05 16:52:50 -07:00 |
|
Benoit Steiner
|
678a17ba79
|
Made the testing of contractions on fp16 more robust
|
2016-05-05 16:36:39 -07:00 |
|
Benoit Steiner
|
e3d053e14e
|
Refined the testing of log and exp on fp16
|
2016-05-05 16:24:15 -07:00 |
|
Benoit Steiner
|
9a48688d37
|
Further improved the testing of fp16
|
2016-05-05 15:58:05 -07:00 |
|
Benoit Steiner
|
2c5568a757
|
Added a test to validate the computation of exp and log on 16-bit floats
|
2016-05-03 12:06:07 -07:00 |
|
Benoit Steiner
|
4a164d2c46
|
Fixed the partial evaluation of non-vectorizable tensor subexpressions
|
2016-04-25 10:43:03 -07:00 |
|
Benoit Steiner
|
7a570e50ef
|
Fixed contractions of fp16
|
2016-03-23 16:00:06 -07:00 |
|
Benoit Steiner
|
ab9b749b45
|
Improved a test
|
2016-03-14 20:03:13 -07:00 |
|
Benoit Steiner
|
e644f60907
|
Pulled latest updates from trunk
|
2016-02-21 20:24:59 +00:00 |
|
Benoit Steiner
|
95fceb6452
|
Added the ability to compute the absolute value of a half float
|
2016-02-21 20:24:11 +00:00 |
|
Benoit Steiner
|
ed69cbeef0
|
Added some debugging information to the test to figure out why it fails sometimes
|
2016-02-21 11:20:20 -08:00 |
|
Benoit Steiner
|
1e6fe6f046
|
Fixed the float16 tensor test.
|
2016-02-20 07:44:17 +00:00 |
|
Benoit Steiner
|
180156ba1a
|
Added support for tensor reductions on half floats
|
2016-02-19 10:05:59 -08:00 |
|
Benoit Steiner
|
a08d2ff0c9
|
Started to work on contractions and reductions using half floats
|
2016-02-19 15:59:59 +00:00 |
|
Benoit Steiner
|
ac5d706a94
|
Added support for simple coefficient-wise tensor expressions using half floats on CUDA devices
|
2016-02-19 08:19:12 +00:00 |
|
Benoit Steiner
|
0606a0a39b
|
FP16 on CUDA is only available starting with CUDA 7.5. Disable it when using an older version of CUDA
|
2016-02-18 23:15:23 -08:00 |
|
Benoit Steiner
|
f36c0c2c65
|
Added regression test for float16
|
2016-02-19 06:23:28 +00:00 |
|