Rasmus Munk Larsen
|
ab773c7e91
|
Extend support for Packet16b:
* Add ptranspose<*,4> to support matmul and add unit test for Matrix<bool> * Matrix<bool>
* Work around a bug in slicing of Tensor<bool>
* Add tensor tests
This speeds up matmul for boolean matrices by roughly 7x at sizes 32 and above; the 8x8 case is slower:
name                 old time/op  new time/op  delta
BM_MatMul<bool>/8     267ns ± 0%   479ns ± 0%  +79.25%  (p=0.008 n=5+5)
BM_MatMul<bool>/32   6.42µs ± 0%  0.87µs ± 0%  -86.50%  (p=0.008 n=5+5)
BM_MatMul<bool>/64   43.3µs ± 0%   5.9µs ± 0%  -86.42%  (p=0.008 n=5+5)
BM_MatMul<bool>/128   315µs ± 0%    44µs ± 0%  -85.98%  (p=0.008 n=5+5)
BM_MatMul<bool>/256  2.41ms ± 0%  0.34ms ± 0%  -85.68%  (p=0.008 n=5+5)
BM_MatMul<bool>/512  18.8ms ± 0%   2.7ms ± 0%  -85.53%  (p=0.008 n=5+5)
BM_MatMul<bool>/1k    149ms ± 0%    22ms ± 0%  -85.40%  (p=0.008 n=5+5)
|
2020-04-28 16:12:47 +00:00 |
|
Christoph Hertzberg
|
c21771ac04
|
Use double-brace initialization (as everywhere else in the test suite).
|
2019-12-19 19:20:48 +01:00 |
|
Eugene Zhulenev
|
ae07801dd8
|
Tensor block evaluation cost model
|
2019-12-18 20:07:00 +00:00 |
|
Eugene Zhulenev
|
1c879eb010
|
Remove V2 suffix from TensorBlock
|
2019-12-10 15:40:23 -08:00 |
|
Eugene Zhulenev
|
dbca11e880
|
Remove TensorBlock.h and old TensorBlock/BlockMapper
|
2019-12-10 14:31:44 -08:00 |
|
Eugene Zhulenev
|
0d2a14ce11
|
Cleanup Tensor block destination and materialized block storage allocation
|
2019-10-16 17:14:37 -07:00 |
|
Eugene Zhulenev
|
02431cbe71
|
TensorBroadcasting support for random/uniform blocks
|
2019-10-16 13:26:28 -07:00 |
|
Eugene Zhulenev
|
d380c23b2c
|
Block evaluation for TensorGenerator/TensorReverse/TensorShuffling
|
2019-10-14 14:31:59 -07:00 |
|
Eugene Zhulenev
|
a411e9f344
|
Block evaluation for TensorGenerator + TensorReverse + fixed bug in tensor reverse op
|
2019-10-10 10:56:58 -07:00 |
|
Eugene Zhulenev
|
33e1746139
|
Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing
|
2019-10-09 12:45:31 -07:00 |
|
Eugene Zhulenev
|
f74ab8cb8d
|
Add block evaluation to TensorEvalTo and fix a few small bugs
|
2019-10-07 15:34:26 -07:00 |
|
Eugene Zhulenev
|
98bdd7252e
|
Fix compilation warnings and errors with clang in TensorBlockV2 code and tests
|
2019-10-04 10:15:33 -07:00 |
|
Eugene Zhulenev
|
60ae24ee1a
|
Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect
|
2019-10-02 12:44:06 -07:00 |
|
Eugene Zhulenev
|
c97b208468
|
Add new TensorBlock api implementation + tests
|
2019-09-24 15:17:35 -07:00 |
|