Erik Schultheis
e939c06b0e
Small speed-up in row-major sparse dense product
2021-12-15 18:46:25 +00:00
Erik Schultheis
c20e908ebc
turn some macros into constexpr functions
2021-12-10 19:27:01 +00:00
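A minimal sketch of the kind of change this commit subject describes, assuming a typical function-like macro. `SKETCH_SIZE_MIN`, `sketch_size_min` and `check_sketch_size_min` are hypothetical names for illustration and are not taken from the Eigen sources.

```cpp
#include <cassert>

// Hypothetical function-like macro: no type checking, and the selected
// argument is evaluated a second time.
#define SKETCH_SIZE_MIN(a, b) (((a) < (b)) ? (a) : (b))

// constexpr replacement: each argument is evaluated once, types are checked,
// and the result can still be used in constant expressions.
constexpr int sketch_size_min(int a, int b) { return a < b ? a : b; }

static_assert(sketch_size_min(3, 4) == 3, "usable at compile time");

// Runtime check contrasting the two forms when an argument has a side effect.
inline void check_sketch_size_min() {
  int i = 0;
  int m = SKETCH_SIZE_MIN(i++, 10);  // i++ runs twice: in the test and in the result
  assert(m == 1 && i == 2);

  int j = 0;
  int f = sketch_size_min(j++, 10);  // j++ runs exactly once
  assert(f == 0 && j == 1);
}
```

The function form keeps compile-time evaluation while avoiding the double-evaluation hazard, which is one common motivation for this sort of cleanup.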
Erik Schultheis
0f36e42169
Fix
2021-12-10 16:59:48 +00:00
Rasmus Munk Larsen
f04fd8b168
Make sure exp(-Inf) is zero for vectorized expressions. This fixes #2385 .
2021-12-08 17:57:23 +00:00
Erik Schultheis
cc11e240ac
Some further cleanup
2021-12-06 18:01:15 +00:00
Rasmus Munk Larsen
3ffefcb95c
Only include <atomic> if needed.
2021-12-02 23:55:25 +00:00
Erik Schultheis
d60f7fa518
Improved lapacke binding code for HouseholderQR and PartialPivLU
2021-12-02 00:10:58 +00:00
Erik Schultheis
ec2fd0f7ed
Require recent GCC and MSVC and removed EIGEN_HAS_CXX14
and some other feature test macros
2021-12-01 00:48:34 +00:00
Rasmus Munk Larsen
085c2fc5d5
Revert "Update SVD Module to allow specifying computation options with a...
2021-11-30 18:45:54 +00:00
Erik Schultheis
4dd126c630
fixed cholesky with 0 sized matrix (cf. #785 )
2021-11-30 17:17:41 +00:00
Rohit Santhanam
4d3e50036f
Fix for HIP compilation breakage in selfAdjoint and triangular view classes.
2021-11-30 14:00:59 +00:00
Erik Schultheis
63abb35dfd
SFINAE'ing away non-const overloads if selfAdjoint/triangular view is not referring to an lvalue
2021-11-29 22:51:26 +00:00
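The idea in this commit subject can be sketched with `std::enable_if`, assuming a simplified view class. `ViewSketch`, `has_ref` and `check_view_sketch` are hypothetical and much simpler than Eigen's selfadjoint and triangular views; the sketch only shows how a mutable accessor can be removed from overload resolution when the viewed object is const.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical view over some storage type (possibly const-qualified).
template <typename Storage>
class ViewSketch {
 public:
  explicit ViewSketch(Storage& s) : s_(s) {}

  // Read access is always available.
  int get(std::size_t i) const { return s_[i]; }

  // Write access participates in overload resolution only when the
  // underlying storage is non-const (i.e. the view refers to an lvalue
  // that may be modified).
  template <typename S = Storage>
  typename std::enable_if<!std::is_const<S>::value, int&>::type
  ref(std::size_t i) {
    return s_[i];
  }

 private:
  Storage& s_;
};

// Detection helper: true_type if V has a callable ref(), false_type otherwise.
template <typename V>
auto has_ref(int) -> decltype(std::declval<V&>().ref(0), std::true_type{});
template <typename V>
std::false_type has_ref(...);

static_assert(decltype(has_ref<ViewSketch<std::vector<int>>>(0))::value,
              "mutable view exposes ref()");
static_assert(!decltype(has_ref<ViewSketch<const std::vector<int>>>(0))::value,
              "const view has no ref()");

inline void check_view_sketch() {
  std::vector<int> data{1, 2, 3};
  ViewSketch<std::vector<int>> rw(data);
  rw.ref(0) = 42;  // allowed: storage is non-const
  ViewSketch<const std::vector<int>> ro(data);
  assert(ro.get(0) == 42);  // read-only view sees the write
}
```

With this pattern, calling `ref()` on a read-only view fails at overload resolution rather than deep inside the accessor's body.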
Jakub Gałecki
1b8dce564a
bugfix: issue #2375
2021-11-29 22:26:15 +00:00
Francesco Mazzoli
eb85b97339
Select AVX2 even if the data size is not a multiple of 8
2021-11-29 21:13:24 +00:00
Arthur
eef33946b7
Update SVD Module to allow specifying computation options with a template parameter. Resolves #2051
2021-11-29 20:50:46 +00:00
Erik Schultheis
f33a31b823
removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks
2021-11-29 19:18:57 +00:00
Rohit Santhanam
9d3ffb3fbf
Fix for HIP compilation failure in DenseBase.
2021-11-28 15:59:30 +00:00
David Tellenbach
08da52eb85
Remove DenseBase::nonZeros() which just calls DenseBase::size()
...
Fixes #2382 .
2021-11-27 14:31:00 +00:00
Ali Can Demiralp
96e537d6fd
Add EIGEN_DEVICE_FUNC to DenseBase::hasNaN() and DenseBase::allFinite().
2021-11-27 11:27:52 +00:00
Erik Schultheis
b8b6566f0f
Currently, the binding of LLT to Lapacke is done using a large macro. This factors out a large part of the functionality of the macro and implements it explicitly.
2021-11-25 16:11:25 +00:00
Erik Schultheis
ec4efbd696
remove EIGEN_HAS_CXX11
2021-11-24 20:08:49 +00:00
Rasmus Munk Larsen
5137a5157a
Make numeric_limits members constexpr as per the newer C++ standards.
...
Author: majnemer@google.com
2021-11-19 15:58:36 +00:00
Erik Schultheis
7e586635ba
don't use deprecated MappedSparseMatrix
2021-11-19 15:58:04 +00:00
Erik Schultheis
b0fb5417d3
Fixed Sparse-Sparse Product in case of mixed StorageIndex types
2021-11-18 18:33:31 +00:00
Pablo Speciale
d04edff570
Update Umeyama.h: src_var is only used when with_scaling == true. Therefore, the actual computation can be avoided when with_scaling == false.
2021-11-16 17:58:22 +00:00
Rasmus Munk Larsen
2b9297196c
Update Transform.h to make transform_construct_from_matrix and transform_take_affine_part callable from device code. Fixes #2377 .
2021-11-16 00:58:30 +00:00
Erik Schultheis
ca9c848679
use consistent StorageIndex types in SparseMatrix::Map
...
and `SparseMatrix::TransposedSparseMatrix`
2021-11-15 22:18:26 +00:00
Erik Schultheis
13954c4440
moved pruning code to SparseVector.h
2021-11-15 22:16:01 +00:00
Nathan Luehr
da79095923
Convert diag pragmas to nv_diag.
2021-11-15 03:42:42 +00:00
Erik Schultheis
532cc73f39
fix a typo
2021-11-13 13:11:06 +02:00
Gengxin Xie
5c642950a5
Bug Fix: correct the bug that won't define EIGEN_HAS_FP16_C
...
if the compiler isn't clang
2021-11-04 22:13:01 +00:00
Gilad
0d73440fb2
Documentation of Quaternion constructor from MatrixBase
2021-11-04 16:21:26 +00:00
Xinle Liu
478a1bdda6
Fix total deflation issue in BDCSVD, if and only if M is already diagonal.
2021-11-02 16:53:55 +00:00
Chip Kerchner
9cf34ee0ae
Invert rows and depth in non-vectorized portion of packing (PowerPC).
2021-10-28 21:59:41 +00:00
Ilya Tokar
e1cb6369b0
Add AVX vector path to float2half/half2float
...
Makes e.g. matrix multiplication 2x faster:
name        old cpu/op  new cpu/op  delta
BM_convers  181ms ± 1%  62ms ± 9%   -65.82%  (p=0.016 n=4+5)
Tested on all possible input values (not adding tests, since they
take a long time).
2021-10-28 13:59:01 -04:00
Antonio Sanchez
03d4cbb307
Fix min/max nan-propagation for scalar "other".
...
Copied input type from `EIGEN_MAKE_CWISE_BINARY_OP`.
Fixes #2362 .
2021-10-28 09:28:29 -07:00
Antonio Sanchez
e559701981
Fix compile issue for gcc 4.8
2021-10-28 08:23:19 -07:00
Rohit Santhanam
48e40b22bf
Preliminary HIP bfloat16 GPU support.
2021-10-27 18:36:45 +00:00
Antonio Sanchez
40bbe8a4d0
Fix ZVector build.
...
Cross-compiled via `s390x-linux-gnu-g++`, run via qemu. This allows the
packetmath tests to pass.
2021-10-27 16:30:15 +00:00
Alex Druinsky
6bb6a6bf53
Vectorize fp16 tanh and logistic functions on Neon
...
Activates vectorization of the Eigen::half versions of the tanh and
logistic functions when they run on Neon. Both functions convert their
inputs to float before computing the output, and as a result of this
commit, the conversions and the computation in float are vectorized.
2021-10-27 16:09:16 +00:00
Andreas Krebbel
8faafc3aaa
ZVector: Move alignas qualifier to come first
...
We currently have plenty of type definitions with the alignment
qualifier coming after the type. The compiler warns about ignoring
them:
int EIGEN_ALIGN16 ai[4];
Turn this into:
EIGEN_ALIGN16 int ai[4];
2021-10-26 15:33:47 +02:00
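A small sketch of the ordering described in this commit body. It assumes the alignment macro expands to a standard `alignas(16)` specifier; `SKETCH_ALIGN16`, `is_aligned16` and `check_alignment_sketch` are hypothetical stand-ins, not the Eigen definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an alignment macro such as EIGEN_ALIGN16,
// assumed here to expand to a standard alignas specifier.
#define SKETCH_ALIGN16 alignas(16)

// Returns true if the pointer value is a multiple of 16.
inline bool is_aligned16(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

inline void check_alignment_sketch() {
  // Qualifier first: it applies to the declared array `ai`.
  // The form `int SKETCH_ALIGN16 ai[4];` is the one the commit message
  // says the compiler warns about ignoring.
  SKETCH_ALIGN16 int ai[4] = {0, 1, 2, 3};
  assert(is_aligned16(ai));
  assert(ai[3] == 3);
}
```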
Alex Druinsky
d0e3791b1a
Fix vectorized reductions for Eigen::half
...
Fixes compiler errors in expressions that look like
Eigen::Matrix<Eigen::half, 3, 1>::Random().maxCoeff()
The error comes from the code that creates the initial value for
vectorized reductions. The fix is to specify the scalar type of the
reduction's initial value.
The change is necessary for Eigen::half because unlike other types,
Eigen::half scalars cannot be implicitly created from integers.
2021-10-25 14:44:33 -07:00
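The problem described in this commit body can be illustrated without Eigen. The sketch assumes a scalar type that is only explicitly constructible from `float`; `HalfLike`, `max_coeff_sketch` and `check_max_coeff_sketch` are hypothetical names and do not mirror Eigen's reduction code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical scalar that, like the commit says of Eigen::half, cannot be
// created implicitly from an integer.
struct HalfLike {
  float value;
  explicit HalfLike(float v) : value(v) {}
};

inline bool operator<(HalfLike a, HalfLike b) { return a.value < b.value; }

// Generic max reduction. Returns Scalar(0) for an empty input.
template <typename Scalar>
Scalar max_coeff_sketch(const std::vector<Scalar>& xs) {
  // `Scalar best = 0;` would not compile for HalfLike, because that form
  // needs an implicit conversion. Naming the scalar type makes it explicit.
  Scalar best = Scalar(0);
  bool first = true;
  for (const Scalar& x : xs) {
    if (first || best < x) {
      best = x;
      first = false;
    }
  }
  return best;
}

inline void check_max_coeff_sketch() {
  std::vector<HalfLike> xs{HalfLike(1.0f), HalfLike(3.0f), HalfLike(2.0f)};
  assert(max_coeff_sketch(xs).value == 3.0f);
}
```

The same template still works for built-in types such as `int` or `float`, since `Scalar(0)` is valid for them as well.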
Yann Billeter
6c3206152a
fix(CommaInitializer): pass dims at compile-time
2021-10-25 19:53:38 +00:00
Antonio Sanchez
0578feaabc
Remove const from visitor return type.
...
This seems to interfere with `pload`/`ploadu`, since `pload<const Packet**>` are not defined.
This should unbreak the arm/ppc builds.
2021-10-25 19:09:50 +00:00
benardp
b63c096fbb
Extend EIGEN_QT_SUPPORT to Qt6
2021-10-23 23:43:06 +00:00
Lennart Steffen
163f11e24a
Included note on inner stride for compile-time vectors. See https://gitlab.com/libeigen/eigen/-/issues/2355#note_711078126
2021-10-22 09:46:43 +00:00
Rasmus Munk Larsen
2d3fec8ff6
Add nan-propagation options to matrix and array plugins.
2021-10-21 19:40:11 +00:00
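For readers unfamiliar with the term, NaN propagation for `min` can follow at least two policies. The sketch below is a plain scalar illustration with hypothetical function names; it is not the plugin code this commit adds.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Policy 1: if either input is NaN, the result is NaN.
inline float min_propagate_nan(float a, float b) {
  if (std::isnan(a)) return a;
  if (std::isnan(b)) return b;
  return a < b ? a : b;
}

// Policy 2: NaN is treated as missing; the other input is returned.
// The result is NaN only if both inputs are NaN.
inline float min_propagate_numbers(float a, float b) {
  if (std::isnan(a)) return b;
  if (std::isnan(b)) return a;
  return a < b ? a : b;
}

inline void check_nan_policies() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  assert(std::isnan(min_propagate_nan(nan, 1.0f)));
  assert(min_propagate_numbers(nan, 1.0f) == 1.0f);
}
```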
Antonio Sanchez
b86e013321
Revert bit_cast to use memcpy for CUDA.
...
To elide the memcpy, we need to first load the `src` value into
registers by making a local copy. This avoids the need to resort
to potential UB by using `reinterpret_cast`.
This change doesn't seem to affect CPU (at least not with gcc/clang).
With optimizations on, the copy is also elided.
2021-10-21 08:14:11 -07:00
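The approach described in this commit body (copy the source into a local first, then `memcpy`) might look roughly like the following. `bit_cast_sketch` and `check_bit_cast_sketch` are hypothetical names, the sketch omits any device annotations, and the check assumes IEEE-754 single-precision `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reinterprets the bytes of `src` as a value of type Tgt without
// reinterpret_cast, so no aliasing rules are violated.
template <typename Tgt, typename Src>
Tgt bit_cast_sketch(const Src& src) {
  static_assert(sizeof(Tgt) == sizeof(Src), "sizes must match");
  static_assert(std::is_trivially_copyable<Src>::value, "Src must be trivially copyable");
  static_assert(std::is_trivially_copyable<Tgt>::value, "Tgt must be trivially copyable");

  Tgt tgt;
  // Local copy of the source, as the commit message describes: the value is
  // staged in a local so the compiler has a chance to keep it in registers
  // and elide the memcpy.
  const Src staged = src;
  std::memcpy(&tgt, &staged, sizeof(Tgt));
  return tgt;
}

inline void check_bit_cast_sketch() {
  // Assumes IEEE-754 binary32: 1.0f has the bit pattern 0x3f800000.
  assert(bit_cast_sketch<std::uint32_t>(1.0f) == 0x3f800000u);
}
```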
Antonio Sanchez
45e67a6fda
Use reinterpret_cast on GPU for bit_cast.
...
This seems to be the recommended approach for doing type punning in
CUDA. See for example
- https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union
- https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/
(the latter puns a double to an `int2`).
The issue is that for CUDA, the `memcpy` is not elided, and ends up
being an expensive operation. We already have similar `reinterpret_cast`s across
the Eigen codebase for GPU (as does TensorFlow).
2021-10-20 21:34:40 +00:00
Antonio Sanchez
95bb645e92
Fix MSVC+NVCC EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR compilation.
...
Looks like we need to update the
`EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` for newer versions of MSVC as
well when compiling with NVCC. Fixes build issues for VS 2017.
2021-10-20 19:38:14 +00:00