Gael Guennebaud
|
d161b8f03a
|
Merged in carpent/eigen (pull request PR-204)
Use the complete nested namespace Eigen::internal, making the custom static assertion macros available outside the Eigen namespace.
|
2016-07-01 09:56:44 +02:00 |
|
Benoit Steiner
|
cb2d8b8fa6
|
Made it possible to compile reductions for an old CUDA architecture and run them on a recent GPU.
|
2016-06-29 15:42:01 -07:00 |
|
Benoit Steiner
|
b2a47641ce
|
Made the code compile when using CUDA architecture < 300
|
2016-06-29 15:32:47 -07:00 |
|
Benoit Steiner
|
b047ca765f
|
Merged in ibab/eigen/fix-tensor-scan-gpu (pull request PR-205)
Add missing CUDA kernel to tensor scan op
|
2016-06-29 14:52:19 -07:00 |
|
Igor Babuschkin
|
85699850d9
|
Add missing CUDA kernel to tensor scan op
The TensorScanOp implementation was missing a CUDA kernel launch.
This adds a simple placeholder implementation.
|
2016-06-29 11:54:35 +01:00 |
|
Justin Carpentier
|
6126886a67
|
Use complete nested namespace Eigen::internal
|
2016-06-28 20:09:25 +02:00 |
|
Benoit Jacob
|
328c5d876a
|
Undo changes in AltiVec --- I don't have any way to test there.
|
2016-06-28 11:15:25 -04:00 |
|
Benoit Jacob
|
38fb606052
|
Avoid global variables with static constructors in NEON/Complex.h
|
2016-06-28 11:12:49 -04:00 |
|
Benoit Steiner
|
1a9f92e781
|
Added a test to validate the tensor scan evaluation on GPU. The test is currently disabled since the code segfaults.
|
2016-06-27 16:02:52 -07:00 |
|
Benoit Steiner
|
75c333f94c
|
Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.
Also avoid taking references to values that may become stale after a copy construction.
|
2016-06-27 10:32:38 -07:00 |
|
xantares
|
c52c8d76da
|
Disable pkgconfig only for native windows builds
i.e., enable it for MinGW
|
2016-06-27 16:43:08 +00:00 |
|
Gael Guennebaud
|
d937a420a2
|
Fix compilation with MSVC by using our portable numext::log1p implementation.
|
2016-08-22 15:44:21 +02:00 |
|
Gael Guennebaud
|
2d5731e40a
|
bug #1270: bypass custom asm for pmadd and recent clang version
|
2016-08-22 15:38:03 +02:00 |
|
Gael Guennebaud
|
49b005181a
|
Define EIGEN_COMP_CLANG to clang version as major*100+minor (e.g., 307 corresponds to clang 3.7)
|
2016-08-22 15:37:05 +02:00 |
|
Gael Guennebaud
|
130f891bb0
|
bug #1278: ease parsing
|
2016-08-22 15:00:29 +02:00 |
|
Benoit Steiner
|
7944d4431f
|
Made the cost model cwiseMax and cwiseMin methods const to help the PowerPC CUDA compiler compile this code.
|
2016-08-18 13:46:36 -07:00 |
|
Benoit Steiner
|
647a51b426
|
Force the inlining of a simple accessor.
|
2016-08-18 12:31:02 -07:00 |
|
Benoit Steiner
|
a452dedb4f
|
Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
Enable efficient Tensor reduction for doubles on the GPU (continued)
|
2016-08-18 12:29:54 -07:00 |
|
Igor Babuschkin
|
18c67df31c
|
Fix remaining CUDA >= 300 checks
|
2016-08-18 17:18:30 +01:00 |
|
Igor Babuschkin
|
1569a7d7ab
|
Add the necessary CUDA >= 300 checks back
|
2016-08-18 17:15:12 +01:00 |
|
Benoit Steiner
|
2b17f34574
|
Properly detect the type of the result of a contraction.
|
2016-08-16 16:00:30 -07:00 |
|
Igor Babuschkin
|
59bacfe520
|
Fix compilation on CUDA 8 by removing call to h2log1p
|
2016-08-15 23:38:05 +01:00 |
|
Benoit Steiner
|
34ae80179a
|
Use array_prod instead of calling TotalSize since TotalSize is only available on DSize.
|
2016-08-15 10:29:14 -07:00 |
|
Benoit Steiner
|
2556565b4b
|
Merged in ibab/eigen/extend-log1p (pull request PR-218)
Fix compilation on CUDA 8 due to missing h2log1p function
|
2016-08-15 08:31:03 -07:00 |
|
Benoit Steiner
|
30dd6f5e34
|
Close branch extend-log1p
|
2016-08-15 08:31:03 -07:00 |
|
Benoit Steiner
|
fe73648c98
|
Fixed a bug in the documentation.
|
2016-08-12 10:00:43 -07:00 |
|
Christoph Hertzberg
|
9636a8ed43
|
bug #1273: Add parentheses when redefining eigen_assert
|
2016-08-12 15:34:21 +02:00 |
|
Christoph Hertzberg
|
c83b754ee0
|
bug #1272: Disable assertion when total number of columns is zero.
Also moved assertion to finished() method and adapted unit-test
|
2016-08-12 15:15:34 +02:00 |
|
Benoit Steiner
|
e3a8dfb02f
|
std::erfcf doesn't exist: use numext::erfc instead
|
2016-08-11 15:24:06 -07:00 |
|
Benoit Steiner
|
64e68cbe87
|
Don't attempt to optimize partial reductions when the optimized implementation doesn't buy anything.
|
2016-08-08 19:29:59 -07:00 |
|
Benoit Steiner
|
5157ce8cbf
|
Merged in ibab/eigen/extend-log1p (pull request PR-217)
Add log1p support for CUDA and half floats
|
2016-08-08 14:50:00 -07:00 |
|
Igor Babuschkin
|
aee693ac52
|
Add log1p support for CUDA and half floats
|
2016-08-08 20:24:59 +01:00 |
|
Benoit Steiner
|
72096f3bd4
|
Merged in suiyuan2009/eigen/fix_tanh_inconsistent_for_tensorflow (pull request PR-215)
Fix_tanh_inconsistent_for_tensorflow
|
2016-08-08 09:06:45 -07:00 |
|
Christoph Hertzberg
|
3e4a33d4ba
|
bug #1272: Let CommaInitializer work for more border cases (enhances fix of bug #1242).
The unit test tests all combinations of 2x2 block-sizes from 0 to 3.
|
2016-08-08 17:26:48 +02:00 |
|
Igor Babuschkin
|
841e075154
|
Remove CUDA >= 300 checks and enable outer reduction for doubles
|
2016-08-06 18:07:50 +01:00 |
|
Ziming Dong
|
1031223c09
|
fix tanh inconsistent
|
2016-08-06 19:48:50 +08:00 |
|
Ziming Dong
|
5cf1e4c79b
|
create fix_tanh_inconsistent branch
|
2016-08-06 15:54:33 +08:00 |
|
Igor Babuschkin
|
0425118e2a
|
Merge upstream changes
|
2016-08-05 14:34:57 +01:00 |
|
Igor Babuschkin
|
9537e8b118
|
Make use of atomicExch for atomicExchCustom
|
2016-08-05 14:29:58 +01:00 |
|
Christoph Hertzberg
|
fe4b927e9c
|
Add aliases Eigen_*_DIR to Eigen3_*_DIR
This is to make configuring work again after project was renamed from Eigen to Eigen3
|
2016-08-05 15:21:14 +02:00 |
|
Benoit Steiner
|
fe778427f2
|
Fixed the constructors of the new half_base class.
|
2016-08-04 18:32:26 -07:00 |
|
Benoit Steiner
|
5eea1c7f97
|
Fixed cut-and-paste bug in debug message
|
2016-08-04 17:34:13 -07:00 |
|
Benoit Steiner
|
9506343349
|
Fixed the isnan, isfinite and isinf operations on GPU
|
2016-08-04 17:25:53 -07:00 |
|
Benoit Steiner
|
b50d8f8c4a
|
Extended a regression test to validate that basic fp16 support works with CUDA 7.0
|
2016-08-03 16:50:13 -07:00 |
|
Benoit Steiner
|
fad9828769
|
Deleted redundant regression test.
|
2016-08-03 16:08:37 -07:00 |
|
Benoit Steiner
|
373bb12dc6
|
Check that it's possible to forward declare the half type.
|
2016-08-03 16:07:31 -07:00 |
|
Gael Guennebaud
|
17b9a55d98
|
Move Eigen::half_impl::half to Eigen::half while preserving the free functions to the Eigen::half_impl namespace together with ADL
|
2016-08-04 00:00:43 +02:00 |
|
Benoit Steiner
|
ca2cee2739
|
Merged in ibab/eigen (pull request PR-206)
Expose real and imag methods on Tensors
|
2016-08-03 11:53:04 -07:00 |
|
Benoit Steiner
|
d92df04ce8
|
Cleaned up the new float16 test a bit
|
2016-08-03 11:50:07 -07:00 |
|
Benoit Steiner
|
81099ef482
|
Added a test for fp16
|
2016-08-03 11:41:17 -07:00 |
|