Benoit Steiner
|
a4d6e8fef0
|
Strongly hint but don't force the compiler to unroll some loops in the tensor executor. This results in up to 27% faster code.
|
2016-05-05 09:25:55 -07:00 |
|
Benoit Steiner
|
f363e533aa
|
Added tests for full contractions using thread pools and GPU devices.
Fixed a couple of issues in the corresponding code.
|
2016-05-05 09:05:45 -07:00 |
|
Benoit Steiner
|
06d774bf58
|
Updated the contraction code to ensure that full contractions return a tensor of rank 0
|
2016-05-05 08:37:47 -07:00 |
|
Christoph Hertzberg
|
dacb469bc9
|
Enable and fix -Wdouble-conversion warnings
|
2016-05-05 13:35:45 +02:00 |
|
Benoit Steiner
|
dd2b45feed
|
Removed extraneous 'explicit' keywords
|
2016-05-04 16:57:52 -07:00 |
|
Benoit Steiner
|
968ec1c2ae
|
Use numext::isfinite instead of std::isfinite
|
2016-05-03 19:56:40 -07:00 |
|
Benoit Steiner
|
aad9a04da4
|
Deleted superfluous explicit keyword.
|
2016-05-03 09:37:19 -07:00 |
|
Benoit Steiner
|
8a9228ed9b
|
Fixed compilation error
|
2016-05-01 14:48:01 -07:00 |
|
Benoit Steiner
|
d6c9596fd8
|
Added missing accessors to fixed sized tensors
|
2016-04-29 18:51:33 -07:00 |
|
Benoit Steiner
|
17fe7f354e
|
Deleted trailing commas
|
2016-04-29 18:39:01 -07:00 |
|
Benoit Steiner
|
e5f71aa6b2
|
Deleted useless trailing commas
|
2016-04-29 18:36:10 -07:00 |
|
Benoit Steiner
|
44f592dceb
|
Deleted unnecessary trailing commas.
|
2016-04-29 18:33:46 -07:00 |
|
Benoit Steiner
|
f100d1494c
|
Return the proper size (i.e. 1) for tensors of rank 0
|
2016-04-29 18:14:33 -07:00 |
|
Benoit Steiner
|
a8c0405cf5
|
Deleted unused default values for template parameters
|
2016-04-29 16:34:43 -07:00 |
|
Benoit Steiner
|
c07404f6a1
|
Restore Tensor support for non-C++11 compilers
|
2016-04-29 15:19:19 -07:00 |
|
Benoit Steiner
|
ba32ded021
|
Fixed include path
|
2016-04-29 15:11:09 -07:00 |
|
Gael Guennebaud
|
318e65e0ae
|
Fix missing inclusion of Eigen/Core
|
2016-04-27 23:05:40 +02:00 |
|
Rasmus Munk Larsen
|
463738ccbe
|
Use computeProductBlockingSizes to compute blocking for both ShardByCol and ShardByRow cases.
|
2016-04-27 12:26:18 -07:00 |
|
Gael Guennebaud
|
3dddd34133
|
Refactor the unsupported CXX11/Core module to internal headers only.
|
2016-04-26 11:20:25 +02:00 |
|
Benoit Steiner
|
4a164d2c46
|
Fixed the partial evaluation of non-vectorizable tensor subexpressions
|
2016-04-25 10:43:03 -07:00 |
|
Benoit Steiner
|
fd9401f260
|
Refined the cost of the striding operation.
|
2016-04-25 09:16:08 -07:00 |
|
Benoit Steiner
|
4bbc97be5e
|
Provide access to the base threadpool classes
|
2016-04-21 17:59:33 -07:00 |
|
Benoit Steiner
|
33adce5c3a
|
Added the ability to switch to the new thread pool with a #define
|
2016-04-21 11:59:58 -07:00 |
|
Benoit Steiner
|
f670613e4b
|
Fixed several compilation warnings
|
2016-04-21 11:03:02 -07:00 |
|
Benoit Steiner
|
2dde1b1028
|
Don't crash when attempting to reduce empty tensors.
|
2016-04-20 18:08:20 -07:00 |
|
Benoit Steiner
|
c7c2054bb5
|
Started to implement a portable way to yield.
|
2016-04-19 17:59:58 -07:00 |
|
Benoit Steiner
|
2b72163028
|
Implemented a more portable version of thread local variables
|
2016-04-19 15:56:02 -07:00 |
|
Benoit Steiner
|
5b1106c56b
|
Fixed a compilation error with nvcc 7.
|
2016-04-19 14:57:57 -07:00 |
|
Benoit Steiner
|
7129d998db
|
Simplified the code that launches CUDA kernels.
|
2016-04-19 14:55:21 -07:00 |
|
Benoit Steiner
|
b9ea40c30d
|
Don't take the address of a kernel on CUDA devices that don't support this feature.
|
2016-04-19 14:35:11 -07:00 |
|
Benoit Steiner
|
884c075058
|
Use numext::ceil instead of std::ceil
|
2016-04-19 14:33:30 -07:00 |
|
Benoit Steiner
|
a278414d1b
|
Avoid an unnecessary copy of the evaluator.
|
2016-04-19 13:54:28 -07:00 |
|
Benoit Steiner
|
50968a0a3e
|
Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors.
|
2016-04-19 11:53:58 -07:00 |
|
Benoit Steiner
|
c8e8f93d6c
|
Move the evalGemm method into the TensorContractionEvaluatorBase class to make it accessible from both the single and multithreaded contraction evaluators.
|
2016-04-15 16:48:10 -07:00 |
|
Benoit Steiner
|
7cff898e0a
|
Deleted unnecessary variable
|
2016-04-15 15:46:14 -07:00 |
|
Benoit Steiner
|
6c43c49e4a
|
Fixed a few compilation warnings
|
2016-04-15 15:34:34 -07:00 |
|
Benoit Steiner
|
eb669f989f
|
Merged in rmlarsen/eigen (pull request PR-178)
Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions.
|
2016-04-15 14:53:15 -07:00 |
|
Rasmus Munk Larsen
|
3718bf654b
|
Get rid of void* casting when calling EvalRange::run.
|
2016-04-15 12:51:33 -07:00 |
|
Benoit Steiner
|
a62e924656
|
Added ability to access the cache sizes from the tensor devices
|
2016-04-14 21:25:06 -07:00 |
|
Benoit Steiner
|
18e6f67426
|
Added support for exclusive or
|
2016-04-14 20:37:46 -07:00 |
|
Rasmus Munk Larsen
|
07ac4f7e02
|
Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions. The cost model is turned off by default.
|
2016-04-14 18:28:23 -07:00 |
|
Benoit Steiner
|
9624a1ea3d
|
Added missing definition of PacketSize in the GPU evaluator of convolution
|
2016-04-14 17:16:58 -07:00 |
|
Benoit Steiner
|
6fbedf5a4e
|
Merged in rmlarsen/eigen (pull request PR-177)
Eigen Tensor cost model part 1.
|
2016-04-14 17:13:19 -07:00 |
|
Benoit Steiner
|
9c064b5a97
|
Cleanup
|
2016-04-14 16:41:31 -07:00 |
|
Benoit Steiner
|
1372156c41
|
Prepared the migration to the new non blocking thread pool
|
2016-04-14 16:16:42 -07:00 |
|
Rasmus Munk Larsen
|
aeb5494a0b
|
Improvements to cost model.
|
2016-04-14 15:52:58 -07:00 |
|
Benoit Steiner
|
78a51abc12
|
Added a more scalable non blocking thread pool
|
2016-04-14 15:23:10 -07:00 |
|
Rasmus Munk Larsen
|
d2e95492e7
|
Merge upstream updates.
|
2016-04-14 13:59:50 -07:00 |
|
Rasmus Munk Larsen
|
235e83aba6
|
Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions.
|
2016-04-14 13:57:35 -07:00 |
|
Benoit Steiner
|
5912ad877c
|
Silenced a compilation warning
|
2016-04-14 11:40:14 -07:00 |
|