11476 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
576e451b10 Add CompleteOrthogonalDecomposition to the table of linear algebra decompositions.
(cherry picked from commit 96e3b4fc957834ad6736f7455c263d3a4158dc37)
2021-08-12 16:49:40 +00:00
Antonio Sanchez
0d89012708 Update code snippet for tridiagonalize_inplace.
(cherry picked from commit fb1718ad14485ccf733d90807253e47c1f72e275)
2021-08-12 15:37:32 +00:00
Rasmus Munk Larsen
6d2506040c * Revise the meta_least_common_multiple function template: add a bool variable that checks whether A is larger than B.
* This reduces compile time when A is smaller than B, and avoids compilation failures when A is small and B is large.
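
A minimal sketch of the idea described above, assuming a recursive search over multiples of the first argument. The name `lcm_sketch` and its parameters are hypothetical and are not taken from the Eigen sources. The `Big` flag swaps the arguments when A < B, so the recursion steps through multiples of the larger number.

```cpp
#include <cassert>

// Hypothetical illustration, not Eigen's actual template.
// Searches K = 1, 2, ... until A*K is divisible by B.
// `Done` ends the recursion; `Big` records whether A >= B.
template <int A, int B, int K = 1,
          bool Done = ((A * K) % B) == 0,
          bool Big = (A >= B)>
struct lcm_sketch {
  // A >= B and A*K is not yet a multiple of B: try the next multiple of A.
  enum { value = lcm_sketch<A, B, K + 1>::value };
};

// A < B: swap the arguments so the search walks multiples of the larger
// number, which needs at most min(A, B) steps.
template <int A, int B, int K, bool Done>
struct lcm_sketch<A, B, K, Done, false> {
  enum { value = lcm_sketch<B, A, K>::value };
};

// A >= B and A*K is a multiple of B: A*K is the least common multiple.
template <int A, int B, int K>
struct lcm_sketch<A, B, K, true, true> {
  enum { value = A * K };
};

static_assert(lcm_sketch<4, 6>::value == 12, "lcm(4, 6)");
static_assert(lcm_sketch<3, 1000>::value == 3000, "lcm(3, 1000)");
```

Without the swap, `lcm_sketch<3, 1000>` would need on the order of a thousand nested instantiations, which may exceed a compiler's template depth limit. With the swap it needs three.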

Authored by @awoniu.

(cherry picked from commit 8ce341caf2947e4b5ac4580c20254ae7d828b009)
2021-08-11 18:11:26 +00:00
Nikolay Tverdokhleb
cb44a003de Do not set AnnoyingScalar::dont_throw if not defined EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW.
- Because that member is not declared if the macro is defined.


(cherry picked from commit f1b899eef7461e1475469b733346c6ebbfae8818)
2021-08-11 16:39:44 +00:00
ChipKerchner
13d7658c5d Fix errors on older compilers (gcc 7.5 - lack of vec_neg, clang10 - can not use const pointers with vec_xl).
(cherry picked from commit 413bc491f1721afdb9802553b13a5b7aba67ed3b)
2021-08-10 20:40:54 +00:00
jenswehner
338924602d added includes for unordered_map
(cherry picked from commit e3e74001f7c4bf95f0dde572e8a08c5b2918a3ab)
2021-08-10 16:10:03 +00:00
Gauri Deshpande
93bff85a42 remove denormal flushing in fp32tobf16 for avx & avx512
(cherry picked from commit e6a5a594a7f3cbe2f9843d4ef57a10d478cbb818)
2021-08-09 22:15:42 +00:00
Rasmus Munk Larsen
4e0357c6dd Avoid memory allocation in tridiagonalization_inplace_selector::run.
(cherry picked from commit a5a7faeb455efd7f6edb1138eda2e37546039b7d)
2021-08-06 21:48:00 +00:00
Daniel N. Miller (APD)
1e9f623f3e Do not build shared libs if not supported
(cherry picked from commit 09d7122468fb9b9adf813cf32167ab212511c4d8)
2021-08-06 21:47:37 +00:00
Jens Wehner
4240b480e0 updated documentation for middleCol and middleRow
(cherry picked from commit 4d870c49b7f1b49e34e8044dc6c1131d43e91a44)
2021-08-05 17:53:36 +00:00
Antonio Sanchez
5b83d3c4bc Make inverse 3x3 faster and avoid gcc bug.
There seems to be a gcc 4.7 bug that incorrectly flags the current
3x3 inverse as using uninitialized memory.  I'm *pretty* sure it's
a false positive, but it's hard to trigger.  The same warning
does not trigger with clang or later compiler versions.

In trying to find a work-around, this implementation turns out to be
faster anyways for static-sized matrices.

```
name                                            old cpu/op  new cpu/op  delta
BM_Inverse3x3<DynamicMatrix3T<float>>            423ns ± 2%   433ns ± 3%   +2.32%    (p=0.000 n=98+96)
BM_Inverse3x3<DynamicMatrix3T<double>>           425ns ± 2%   427ns ± 3%   +0.48%    (p=0.003 n=99+96)
BM_Inverse3x3<StaticMatrix3T<float>>            7.10ns ± 2%  0.80ns ± 1%  -88.67%  (p=0.000 n=114+112)
BM_Inverse3x3<StaticMatrix3T<double>>           7.45ns ± 2%  1.34ns ± 1%  -82.01%  (p=0.000 n=105+111)
BM_AliasedInverse3x3<DynamicMatrix3T<float>>     409ns ± 3%   419ns ± 3%   +2.40%   (p=0.000 n=100+98)
BM_AliasedInverse3x3<DynamicMatrix3T<double>>    414ns ± 3%   413ns ± 2%     ~       (p=0.322 n=98+98)
BM_AliasedInverse3x3<StaticMatrix3T<float>>     7.57ns ± 1%  0.80ns ± 1%  -89.37%  (p=0.000 n=111+114)
BM_AliasedInverse3x3<StaticMatrix3T<double>>    9.09ns ± 1%  2.58ns ±41%  -71.60%  (p=0.000 n=113+116)
```
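
For readers unfamiliar with closed-form 3x3 inversion, here is a generic sketch based on cross products of the columns. It is not the implementation from this commit. The types `Vec3` and `Mat3` and the function `inverse3x3_sketch` are hypothetical, and the caller is assumed to pass an invertible matrix.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical fixed-size types for illustration only (not Eigen types).
using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major: m[row][col]

inline Vec3 cross(const Vec3& a, const Vec3& b) {
  Vec3 r;
  r[0] = a[1] * b[2] - a[2] * b[1];
  r[1] = a[2] * b[0] - a[0] * b[2];
  r[2] = a[0] * b[1] - a[1] * b[0];
  return r;
}

inline double dot(const Vec3& a, const Vec3& b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Returns the inverse of m. The caller must ensure det(m) != 0.
inline Mat3 inverse3x3_sketch(const Mat3& m) {
  // Extract the columns of m.
  Vec3 c0, c1, c2;
  for (int i = 0; i < 3; ++i) {
    c0[i] = m[i][0];
    c1[i] = m[i][1];
    c2[i] = m[i][2];
  }
  // Rows of the adjugate are cross products of the columns of m.
  const Vec3 r0 = cross(c1, c2);
  const Vec3 r1 = cross(c2, c0);
  const Vec3 r2 = cross(c0, c1);
  // det(m) is the scalar triple product c0 . (c1 x c2).
  const double inv_det = 1.0 / dot(c0, r0);
  Mat3 out;
  for (int j = 0; j < 3; ++j) {
    out[0][j] = r0[j] * inv_det;
    out[1][j] = r1[j] * inv_det;
    out[2][j] = r2[j] * inv_det;
  }
  return out;
}
```

Because the result is built in a separate object and returned by value, `m = inverse3x3_sketch(m)` is safe in this sketch.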


(cherry picked from commit 5ad8b9bfe2bf75620bc89467c5cc051fc2a597df)
2021-08-04 22:06:52 +00:00
Antonio Sanchez
46ecdcd745 Fix MPReal detection and support.
The latest version of `mpreal` has a bug that breaks `min`/`max`.
It also breaks with the latest dev version of `mpfr`. Here we
add `FindMPREAL.cmake` which searches for the library and tests if
compilation works.

Removed our internal copy of `mpreal.h` under `unsupported/test`, as
it is out-of-sync with the latest, and similarly breaks with
the latest `mpfr`.  It would be best to use the installed version
of `mpreal` anyways, since that's what we actually want to test.

Fixes #2282.


(cherry picked from commit 31f796ebef35eeadd0e26878aab3fe99ca412a45)
2021-08-03 18:13:12 +00:00
Antonio Sanchez
9a1691a14e Fix cmake warnings, FindPASTIX/FindPTSCOTCH.
We were getting a lot of warnings due to nested `find_package` calls
within `Find***.cmake` files.  The recommended approach is to use
[`find_dependency`](https://cmake.org/cmake/help/latest/module/CMakeFindDependencyMacro.html)
in package configuration files. I made this change for all instances.

Case mismatches between `Find<Package>.cmake` and calling
`find_package(<PACKAGE>)` also lead to warnings. Fixed for
`FindPASTIX.cmake` and `FindSCOTCH.cmake`.

`FindBLASEXT.cmake` was broken due to calling `find_package_handle_standard_args(BLAS ...)`.
The package name must match, otherwise the `find_package(BLASEXT)` falsely thinks
the package wasn't found.  I changed to `BLASEXT`, but then also copied that value
to `BLAS_FOUND` for compatibility.

`FindPastix.cmake` had a typo that incorrectly added `PTSCOTCH` when looking for
the `SCOTCH` component.

`FindPTSCOTCH` incorrectly added `***-NOTFOUND` to include/library lists,
corrupting them.  This led to cmake errors down-the-line.

Fixes #2288.


(cherry picked from commit 1cdec386530c6b844389b96c199e723a1e4e71c7)
2021-08-03 17:48:20 +00:00
Antonio Sanchez
bb33880e57 Fix TriSycl CMake files.
This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was
broken by commit 00f32752, which modified `add_sycl_to_target` for ComputeCPP.
This makes the corresponding modifications for trisycl to make them consistent.

Also, trisycl now requires c++17.


(cherry picked from commit 8cf6cb27baa9607cc00e5dbb42a1c31efda41b74)
2021-08-03 17:25:17 +00:00
Antonio Sanchez
237c59a2aa Modify scalar pzero, ptrue, pselect, and p<binary> operations to avoid memset.
The `memset` function and bitwise manipulation only apply to POD types
that do not require initialization, otherwise resulting in UB. We currently
violate this in `ptrue` and `pzero`, we assume bitmasks for `pselect`, and
bitwise operations are applied byte-by-byte in the generic implementations.

This is causing issues for scalar types that do require initialization
or that contain non-POD info such as pointers (#2201). We either break
them, or force specializations of these functions for custom scalars,
even if they are not vectorized.

Here we modify these functions for scalars only - instead using only
scalar operations:
- `pzero`: `Scalar(0)` for all scalars.
- `ptrue`: `Scalar(1)` for non-trivial scalars, bitset to one bits for trivial scalars.
- `pselect`: ternary select comparing mask to `Scalar(0)` for all scalars
- `pand`, `por`, `pxor`, `pnot`: use operators `&`, `|`, `^`, `~` for all integer or non-trivial scalars, otherwise apply bytewise.

For non-scalar types, the original implementations are used to maintain
compatibility and minimize the number of changes.
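
The scalar rules listed above can be sketched roughly as follows. All names ending in `_sketch`, the helper `ptrue_impl`, and the use of `std::is_trivial` as the "trivial scalar" test are assumptions for illustration. The actual Eigen code may use different traits and signatures.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical illustration; not Eigen's generic packet-math code.

// pzero: construct zero through the scalar's own constructor.
template <typename Scalar>
Scalar pzero_sketch() {
  return Scalar(0);
}

// ptrue for trivial scalars: set every bit to one. No constructor or
// invariant is bypassed, so writing raw bytes is acceptable here.
template <typename Scalar>
Scalar ptrue_impl(std::true_type) {
  Scalar s;
  std::memset(&s, 0xff, sizeof(Scalar));
  return s;
}

// ptrue for non-trivial scalars: go through the constructor instead.
template <typename Scalar>
Scalar ptrue_impl(std::false_type) {
  return Scalar(1);
}

template <typename Scalar>
Scalar ptrue_sketch() {
  return ptrue_impl<Scalar>(typename std::is_trivial<Scalar>::type());
}

// pselect: returns a where the mask is non-zero, b otherwise.
template <typename Scalar>
Scalar pselect_sketch(const Scalar& mask, const Scalar& a, const Scalar& b) {
  return mask == Scalar(0) ? b : a;
}

// A hypothetical custom scalar that requires initialization.
struct BoxedScalar_sketch {
  double value;
  BoxedScalar_sketch(double v = 0.0) : value(v) {}
  bool operator==(const BoxedScalar_sketch& o) const { return value == o.value; }
};
```

With these definitions, `ptrue_sketch<int>()` yields an all-ones bit pattern, while `ptrue_sketch<BoxedScalar_sketch>()` is constructed from `1` and never touched by `memset`.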

Fixes #2201.


(cherry picked from commit 3d98a6ef5ce0ba85acaee4ffffc53f0f21bd8fd2)
2021-08-03 16:32:59 +00:00
Antonio Sanchez
3dc42eeaec Enable equality comparisons on GPU.
Since `std::equal_to::operator()` is not a device function, it
fails on GPU.  On my device, I seem to get a silent crash in the
kernel (no reported error, but the kernel does not complete).

Replacing this with a portable version enables comparisons on device.
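
A portable replacement could look roughly like the functor below. The macro `SKETCH_DEVICE_FUNC` and the struct `equal_to_sketch` are hypothetical names standing in for whatever the commit actually uses. The macro plays the role of `EIGEN_DEVICE_FUNC` and expands to nothing in a plain host build.

```cpp
#include <cassert>

// Hypothetical stand-in for a macro such as EIGEN_DEVICE_FUNC: under a CUDA
// or HIP compiler it marks a function as callable from host and device code.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

// Same contract as std::equal_to<T>, but the call operator carries the
// device annotation so it can be invoked inside a kernel.
template <typename T>
struct equal_to_sketch {
  SKETCH_DEVICE_FUNC bool operator()(const T& a, const T& b) const {
    return a == b;
  }
};
```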

Addresses #2292 - would need to be cherry-picked.  The 3.3 branch
also requires adding `EIGEN_DEVICE_FUNC` in `BooleanRedux.h` to get
fully working.


(cherry picked from commit 7880f10526a11dc5544426c54c5763de576bf285)
2021-08-03 16:15:44 +00:00
hyunggi-sv
7adc1545b4 fix:typo in dox (has->have)
(cherry picked from commit 02a0e79c701da7aa8dfad79b13cd1e7fae46d634)
2021-08-03 00:54:41 +00:00
Antonio Sanchez
c0c7b695cd Fix assignment operator issue for latest MSVC+NVCC.
Details are scattered across #920, #1000, #1324, #2291.

Summary: some MSVC versions have a bug that requires omitting explicit
`operator=` definitions (leads to duplicate definition errors), and
some MSVC versions require adding explicit `operator=` definitions
(otherwise implicitly deleted errors).  This mess tries to cover
all the cases encountered.

Fixes #2291.


(cherry picked from commit 9816fe59b47dc4c07967b5ee93a8e8aaa6e9c308)
2021-08-03 00:52:21 +00:00
Alexander Karatarakis
c334eece44 _DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier
(cherry picked from commit f357283d3128a6253af09705155ce4f9f113e3c8)
2021-07-29 18:18:47 +00:00
Jonas Harsch
5ccb72b2e4 Fixed typo in TutorialSparse.dox
(cherry picked from commit 5b81764c0f4e06ff12a0c769b1bd876b10ad7502)
2021-07-26 14:33:10 +00:00
arthurfeeney
9c90d5d832 Fixes #1387 for compilation error in JacobiSVD with HouseholderQRPreconditioner that occurs when input is a compile-time row vector.
(cherry picked from commit a77638387dd1aa2d07d2dae240cc30b303b4ef38)
2021-07-22 18:01:55 +00:00
Antonio Sanchez
5d37114fc0 Fix explicit default cache size typo.
(cherry picked from commit 297f0f563d916260665d7fadc017f94f1a5e7a03)
2021-07-20 18:42:25 +00:00
Rohit Santhanam
930696fc53 Enable extract et al. for HIP GPU.
(cherry picked from commit beea14a18f76817439b4d8901d29db2e9c4a24c8)
2021-07-09 16:14:19 +00:00
Rasmus Munk Larsen
56966fd2e6 Defer to std::fill_n when filling a dense object with a constant value.
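
As an illustration only, deferring to the standard library for a constant fill of a contiguous buffer can be as small as the helper below. The name `set_constant_sketch` and its signature are hypothetical and do not reflect how Eigen's assignment machinery is structured.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: writes `value` into `size` contiguous coefficients.
// std::fill_n lets the standard library choose the fill strategy.
template <typename Scalar>
void set_constant_sketch(Scalar* data, std::size_t size, const Scalar& value) {
  std::fill_n(data, size, value);
}
```

A call such as `set_constant_sketch(v.data(), v.size(), 2.5f)` on a `std::vector<float>` sets every element to `2.5f`.
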
(cherry picked from commit 0c361c4899c9042d2b25cd60d7826ab464caacb7)
2021-07-09 03:59:56 +00:00
Jonas Harsch
5a3c9eddb4 Removed superfluous boolean degenerate in TensorMorphing.h.
(cherry picked from commit e9c9a3130b7307a240335aa527a6d4c5fb2ee471)
2021-07-08 18:34:10 +00:00
Guoqiang QI
69ec4907da Make a copy of the input matrix when computing the inverse in place. This fixes #2285.
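
To show why the copy matters, here is a small 2x2 sketch in which the source and destination may be the same object. `Mat2_sketch` and `inverse2x2_sketch` are hypothetical names, and the matrix is assumed to be invertible. This is not the code path changed by the commit.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical row-major 2x2 matrix: {a, b, c, d}.
using Mat2_sketch = std::array<double, 4>;

// Writes the inverse of src into dst. src and dst may alias.
void inverse2x2_sketch(const Mat2_sketch& src, Mat2_sketch& dst) {
  // Copy first: once dst is written, src may no longer hold the input.
  const Mat2_sketch m = src;
  const double inv_det = 1.0 / (m[0] * m[3] - m[1] * m[2]);
  dst[0] = m[3] * inv_det;
  dst[1] = -m[1] * inv_det;
  dst[2] = -m[2] * inv_det;
  dst[3] = m[0] * inv_det;
}
```

Without the local copy, the write to `dst[0]` would overwrite `src[0]` in the aliased case, and the later read of `src[0]` for `dst[3]` would see the wrong value.
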
(cherry picked from commit 4bcd42c271761dc5341f8e08ca7d357c3614cb01)
2021-07-08 17:07:54 +00:00
Antonio Sanchez
7571704a43 Fix CMake directory issues.
Allows absolute and relative paths for
- `INCLUDE_INSTALL_DIR`
- `CMAKEPACKAGE_INSTALL_DIR`
- `PKGCONFIG_INSTALL_DIR`

Type should be `PATH` not `STRING`.  Contrary to !211, these don't
seem to be made absolute if user-defined - according to the doc any
directories should use `PATH` type, which allows a file dialog
to be used via the GUI.  It also better handles file separators.

If the user provides an absolute path, it will be made relative to
`CMAKE_INSTALL_PREFIX` so that `configure_package_config_file` will
work.

Fixes #2155 and #2269.


(cherry picked from commit f44f05532decf830fcdb07e2a67a2fa4ccbc3870)
2021-07-07 17:44:00 +00:00
Antonio Sanchez
84955d109f Fix Tensor documentation page.
The extra [TOC] tag is generating a huge floating duplicated
table-of-contents, which obscures the majority of the page
(see bottom of https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html).
Remove it.

Also, headers do not support markup (see
[doxygen bug](https://github.com/doxygen/doxygen/issues/7467)), so
backticks like
```
```
end up generating titles that look like
```
Constructor <tt>Tensor<double,2></tt>
```
Removing backticks for now.  To generate proper formatted headers, we
must directly use html instead of markdown, i.e.
```
<h2>Constructor <code>Tensor&lt;double,2&gt;</code></h2>
```
which is ugly.

Fixes #2254.


(cherry picked from commit f5a9873bbb5488bcba3e37f92b4ec09a8db76081)
2021-07-07 17:18:20 +00:00
Jonas Harsch
601814b575 Don't crash when attempting to shuffle an empty tensor.
(cherry picked from commit aab747021be5ed1a1e9667243d884eb72003599d)
2021-07-02 21:08:38 +00:00
Rasmus Munk Larsen
05bab8139a Fix breakage of conj_helper in conjunction with custom types introduced in !537.
(cherry picked from commit 7b35638ddb99a0298c5d3450de506a8e8e0203d3)
2021-07-02 20:59:50 +00:00
Chip Kerchner
eebde572d9 Create the ability to disable the specialized gemm_pack_rhs in Eigen (only PPC) for TensorFlow
(cherry picked from commit 91e99ec1e02100d07e35a7abb1b5c76707237219)
2021-07-01 23:32:38 +00:00
Antonio Sanchez
8190739f12 Fix compile issues for gcc 4.8.
- Move constructors can only be defaulted as NOEXCEPT if all members
have NOEXCEPT move constructors.
- gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter.


(cherry picked from commit 6035da5283f12f7e6a49cda0c21696c8e5a115b7)
2021-07-01 23:18:10 +00:00
Antonio Sanchez
b6db013435 Fix inverse nullptr/asan errors for LU.
For empty or single-column matrices, `PartialPivLU`
currently dereferences a `nullptr` or accesses memory out-of-bounds.
Here we adjust the checks to avoid this.


(cherry picked from commit 154f00e9eacaec5667215784c7601b55024e2f61)
2021-07-01 22:57:25 +00:00
Dan Miller
1f6b1c1a1f Fix duplicate definitions on Mac
(cherry picked from commit eb047759030558acf0764d5d2f913f4f84cf85a8)
2021-07-01 20:49:05 +00:00
Alexander Karatarakis
517294d6e1 Make DenseStorage<> trivially_copyable
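
For context, the sketch below shows what separates a trivially copyable type from one that is not. Both storage structs are hypothetical and much simpler than Eigen's `DenseStorage`. How the commit achieves the property is not shown here.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical fixed-size storage with compiler-generated copy operations.
template <typename T, int Size>
struct PlainStorage_sketch {
  T data[Size];
};

// Same layout, but with a user-provided copy constructor. Even though it
// only copies the elements, it makes the type non-trivially-copyable.
template <typename T, int Size>
struct HandWrittenCopy_sketch {
  T data[Size];
  HandWrittenCopy_sketch() = default;
  HandWrittenCopy_sketch(const HandWrittenCopy_sketch& other) {
    for (int i = 0; i < Size; ++i) data[i] = other.data[i];
  }
};

static_assert(std::is_trivially_copyable<PlainStorage_sketch<double, 4>>::value,
              "defaulted copy operations keep the type trivially copyable");
static_assert(!std::is_trivially_copyable<HandWrittenCopy_sketch<double, 4>>::value,
              "a user-provided copy constructor removes the property");
```

Trivially copyable objects may be duplicated with `std::memcpy`, which is what some containers and serialization layers rely on.
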
(cherry picked from commit 60400334a92268272c6bf525da89eec5e99c3e5a)
2021-07-01 20:48:47 +00:00
大河メタル
94e2250b36 Correct declarations for aarch64-pc-windows-msvc
(cherry picked from commit c81da59a252b3479753b2eada26ee0cf46280bd0)
2021-06-30 04:10:04 +00:00
Antonio Sanchez
d82d915047 Modify tensor argmin/argmax to always return first occurrence.
As written, depending on multithreading/gpu, the returned index from
`argmin`/`argmax` is not currently stable.  Here we modify the functors
to always keep the first occurrence (i.e. if the value is equal to the
current min/max, then keep the one with the smallest index).
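
The tie-breaking rule can be sketched as a combine step over (index, value) pairs. The names `IndexedValue_sketch`, `argmax_combine_sketch`, and `argmax_range_sketch` are hypothetical, and this is not the Eigen functor itself. NaN handling is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical (index, value) pair.
using IndexedValue_sketch = std::pair<std::size_t, double>;

// Combines two partial argmax results. On equal values the smaller index
// wins, so the outcome does not depend on the order of the merges.
IndexedValue_sketch argmax_combine_sketch(const IndexedValue_sketch& a,
                                          const IndexedValue_sketch& b) {
  if (a.second > b.second) return a;
  if (b.second > a.second) return b;
  return a.first < b.first ? a : b;
}

// Reduces values[begin, end). Requires begin < end.
IndexedValue_sketch argmax_range_sketch(const std::vector<double>& values,
                                        std::size_t begin, std::size_t end) {
  IndexedValue_sketch best(begin, values[begin]);
  for (std::size_t i = begin + 1; i < end; ++i) {
    best = argmax_combine_sketch(best, IndexedValue_sketch(i, values[i]));
  }
  return best;
}
```

Splitting the input into chunks, as a thread pool or GPU reduction might, and merging the partial results in either order then yields the same index as a single sequential pass.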

This is otherwise causing unpredictable results in some TF tests.


(cherry picked from commit 3a087ccb99b454dc34484333e608e836e7032213)
2021-06-29 23:28:37 +00:00
Rasmus Munk Larsen
380d0e4916 Get rid of redundant pabs instruction in complex square root.
(cherry picked from commit 5aebbe9098f53f01c99eed67b52725397e955280)
2021-06-29 23:27:09 +00:00
Rohit Santhanam
e83af2cc24 Commit 52a5f982 broke conjhelper functionality for HIP GPUs.
This commit addresses this.


(cherry picked from commit 2d132d17365ffc84c0cc7a7da9b8f7090e94b476)
2021-06-25 19:56:18 +00:00
Rasmus Munk Larsen
413ff2b531 Small cleanup: Get rid of the macros EIGEN_HAS_SINGLE_INSTRUCTION_CJMADD and CJMADD, which were effectively unused, apart from on x86, where the change results in identically performing code.
(cherry picked from commit bffd267d176410a517a0fe9afa6dde99c213c08a)
2021-06-25 17:13:12 +00:00
Rasmus Munk Larsen
a235ddef39 Get rid of code duplication for conj_helper. For packets where LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations.
(cherry picked from commit 52a5f9821235e5a9f7e9b3e0198d45d42a1cb267)
2021-06-24 23:30:42 +00:00
Rasmus Munk Larsen
4780d8dfb2 Fix typo in SelfAdjointEigenSolver_eigenvectors.cpp
(cherry picked from commit c8a2b4d20a162dc2527425f40cf7df95db5ba428)
2021-06-21 19:07:17 +00:00
Rasmus Munk Larsen
fd5d23fdf3 Update ComplexEigenSolver_eigenvectors.cpp
(cherry picked from commit ea62c937edcc2c5efdaccfb6813ca39f48564ece)
2021-06-21 19:06:54 +00:00
Antonio Sanchez
a2040ef796 Rewrite balancer to avoid overflows.
The previous balancer overflowed for large row/column norms.
Modified to prevent that.

Fixes #2273.


(cherry picked from commit e9ab4278b7aba6f279c964d99ae5a312d12ab04b)
2021-06-21 18:14:53 +00:00
Antonio Sanchez
c2c0f6f64b Fix fix<> for gcc-4.9.3.
There's a missing `EIGEN_HAS_CXX14` -> `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`
replacement.

Fixes #2267.


(cherry picked from commit 35a367d557078462a0793c88c44dcad64fc63698)
2021-06-21 17:26:07 +00:00
Antonio Sanchez
ee4e099aa2 Remove pset, replace with ploadu.
We can't make guarantees on alignment for existing calls to `pset`,
so we should default to loading unaligned.  But in that case, we should
just use `ploadu` directly. For loading constants, this load should hopefully
get optimized away.

This is causing segfaults in Google Maps.


(cherry picked from commit 12e8d57108c50d8a63605c6eb0144c838c128337)
2021-06-17 17:11:08 +00:00
Chip-Kerchner
9fc93ce31a EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X slowdown) when used with TensorFlow. Changing to EIGEN_ALWAYS_INLINE where appropriate.
(cherry picked from commit ef1fd341a895fda883f655102f371fa8b41f2088)
2021-06-16 22:14:17 +00:00
Antonio Sanchez
1374f49f28 Add missing ppc pcmp_lt_or_nan<Packet8bf>
(cherry picked from commit 9e94c5957000c38a6553552c96a7a27b1fc2860d)
2021-06-15 22:12:22 +00:00
Antonio Sanchez
2d6eaaf687 Fix placement of permanent GPU defines.
(cherry picked from commit 954879183b1e008d7f0fefb97e48a925c4e3fb16)
2021-06-15 19:18:20 +00:00
Rasmus Munk Larsen
47722a66f2 Fix more enum arithmetic.
(cherry picked from commit 13fb5ab92c3226f7b9be20882b0418d53516d35a)
2021-06-15 16:40:35 +00:00