11489 Commits

Author SHA1 Message Date
Adam Kallai
277d369060 win: include intrin header in Windows on ARM
intrin header is needed for _BitScanReverse and
_BitScanReverse64


(cherry picked from commit 1415817d8daa7fa72ec9b26a6b9d166a1d54626a)
2021-08-31 21:22:37 +00:00
Antonio Sanchez
7aee90b8d3 Fix fix<N> when variable templates are not supported.
There were some typos that checked `EIGEN_HAS_CXX14` that should have
checked `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`, causing a mismatch
in some of the `Eigen::fix<N>` assumptions.

Also fixed the `symbolic_index` test when
`EIGEN_HAS_CXX14_VARIABLE_TEMPLATES` is 0.

Fixes #2308


(cherry picked from commit 5db9e5c77958997856ddbccfa4a52ff22e83bef9)
2021-08-30 16:23:35 +00:00
Rasmus Munk Larsen
3147391d94 Change version to 3.4.0. 3.4.0
2021-08-18 13:41:58 -07:00
Antonio Sanchez
115591b9e3 Workaround VS 2017 arg bug.
In VS 2017, `std::arg` for real inputs always returns 0, even for
negative inputs.  It should return `PI` for negative real values.
This seems to be fixed in VS 2019 (MSVC 1920).


(cherry picked from commit 2b410ecbefea1bf4b9d50decb946a4ebe4a73f98)
2021-08-18 19:04:50 +00:00
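The expected result described in 115591b9e3 can be illustrated with a small standalone sketch. The helper name `arg_of_real` is hypothetical and this is not Eigen's actual code; it only shows what the argument of a real input should be (`0` for non-negative values, `PI` for negative ones) without calling `std::arg` on a real. Signed zeros and NaN are deliberately left out of the sketch.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper (not part of Eigen): argument of a real number,
// computed without calling std::arg on a real input.
template <typename Real>
Real arg_of_real(Real x) {
  // acos(-1) evaluates to pi in the precision of Real.
  const Real pi = std::acos(Real(-1));
  return x < Real(0) ? pi : Real(0);
}

// Reference value obtained through the complex overload of std::arg,
// which computes atan2(imag, real).
inline double arg_via_complex(double x) {
  return std::arg(std::complex<double>(x, 0.0));
}
```

Comparing `arg_of_real(x)` against `arg_via_complex(x)` for a few positive and negative values is one way to check that a toolchain behaves as expected.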
Antonio Sanchez
fd100138dd Remove unaligned assert tests.
Manually constructing an unaligned object declared as aligned
invokes UB, so we cannot technically check for alignment from
within the constructor.  Newer versions of clang optimize away
this check.

Removing the affected tests.


(cherry picked from commit 0c4ae56e3797cc6719a8d08a0dafad0a5139a5f9)
2021-08-18 18:39:04 +00:00
Jakob Struye
1ec173b54e Clearer doc for squaredNorm
(cherry picked from commit 53a29c7e351646efe31ee85666c8f268f8e0d462)
2021-08-18 15:12:36 +00:00
Antonio Sanchez
aef926abf6 Renamed shift_left/shift_right to shiftLeft/shiftRight.
For naming consistency.  Also moved to ArrayCwiseUnaryOps, and added
test.


(cherry picked from commit fc9d352432b81210f73d71caecbd7dc5505d6ab8)
2021-08-18 14:44:31 +00:00
Antonio Sanchez
f1032255d3 Add missing PPC packet comparisons.
This is to fix the packetmath tests on the ppc pipeline.


(cherry picked from commit 2cc6ee0d2e76e88fe1476f6b0eae12edb68b1c8a)
2021-08-17 15:33:55 +00:00
Chip-Kerchner
f57dec64ef Fix unaligned loads in ploadLhs & ploadRhs for P8.
(cherry picked from commit 8dcf3e38ba9913021ce6a831836a59217e21baf2)
2021-08-17 12:48:36 +00:00
Rasmus Munk Larsen
926e1a8226 Update documentation for matrix decompositions and least squares solvers.
(cherry picked from commit 7e6f94961cb4444d3c20660d8cc492d28ada1415)
2021-08-16 22:11:38 +00:00
andiwand
cd474d4cd0 minor doc fix in Map.h
(cherry picked from commit 5c6b3efead69636dec1599aa54dab4617755013c)
2021-08-16 14:26:39 +00:00
Chip-Kerchner
0b56b62f30 Reverse compare logic in F32ToBf16 since vec_cmpne is not available in Power8 - now compiles for clang10 default (P8).
(cherry picked from commit e07227c411cb5ed5c6252b594fe841867bd19f6a)
2021-08-13 18:01:15 +00:00
Chip Kerchner
44cc96e1a1 Get rid of used uninitialized warnings for EIGEN_UNUSED_VARIABLE in gcc11+
(cherry picked from commit 66499f0f172d0758360043e9c578761c0f7d50cd)
2021-08-12 21:39:17 +00:00
Rasmus Munk Larsen
576e451b10 Add CompleteOrthogonalDecomposition to the table of linear algebra decompositions.
(cherry picked from commit 96e3b4fc957834ad6736f7455c263d3a4158dc37)
2021-08-12 16:49:40 +00:00
Antonio Sanchez
0d89012708 Update code snippet for tridiagonalize_inplace.
(cherry picked from commit fb1718ad14485ccf733d90807253e47c1f72e275)
2021-08-12 15:37:32 +00:00
Rasmus Munk Larsen
6d2506040c Revise the meta_least_common_multiple function template: add a bool parameter that checks whether A is larger than B.
This reduces compile time when A is smaller than B, and avoids a compilation failure when A is small and B is large.

Authored by @awoniu.

(cherry picked from commit 8ce341caf2947e4b5ac4580c20254ae7d828b009)
2021-08-11 18:11:26 +00:00
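The idea behind 6d2506040c can be sketched as follows. This is an illustrative reconstruction under assumptions, not the code from the commit: the name `lcm_sketch` and its parameter names are hypothetical. The template tries successive multiples of the larger argument, and an extra bool parameter swaps the arguments first when `A` is smaller than `B`, which keeps the recursion depth small.

```cpp
// Primary template: A >= B and B does not yet divide A*K, so try the next multiple.
template <int A, int B, int K = 1,
          bool Done = ((A * K) % B == 0),
          bool ALarger = (A >= B)>
struct lcm_sketch {
  static constexpr int value = lcm_sketch<A, B, K + 1>::value;
};

// A < B: swap the arguments so that the recursion steps through
// multiples of the larger number.
template <int A, int B, int K, bool Done>
struct lcm_sketch<A, B, K, Done, false> {
  static constexpr int value = lcm_sketch<B, A, K>::value;
};

// A >= B and B divides A*K: A*K is the least common multiple.
template <int A, int B, int K>
struct lcm_sketch<A, B, K, true, true> {
  static constexpr int value = A * K;
};

static_assert(lcm_sketch<4, 6>::value == 12, "lcm(4, 6) should be 12");
static_assert(lcm_sketch<2, 1000>::value == 1000, "resolved after a single swap");
```

Without the swap, `lcm_sketch<2, 1000>` would need about 500 nested instantiations; with it, the answer is found immediately.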
Nikolay Tverdokhleb
cb44a003de Do not set AnnoyingScalar::dont_throw if not defined EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW.
- Because that member is not declared if the macro is defined.


(cherry picked from commit f1b899eef7461e1475469b733346c6ebbfae8818)
2021-08-11 16:39:44 +00:00
ChipKerchner
13d7658c5d Fix errors on older compilers (gcc 7.5 - lack of vec_neg, clang10 - can not use const pointers with vec_xl).
(cherry picked from commit 413bc491f1721afdb9802553b13a5b7aba67ed3b)
2021-08-10 20:40:54 +00:00
jenswehner
338924602d added includes for unordered_map
(cherry picked from commit e3e74001f7c4bf95f0dde572e8a08c5b2918a3ab)
2021-08-10 16:10:03 +00:00
Gauri Deshpande
93bff85a42 remove denormal flushing in fp32tobf16 for avx & avx512
(cherry picked from commit e6a5a594a7f3cbe2f9843d4ef57a10d478cbb818)
2021-08-09 22:15:42 +00:00
Rasmus Munk Larsen
4e0357c6dd Avoid memory allocation in tridiagonalization_inplace_selector::run.
(cherry picked from commit a5a7faeb455efd7f6edb1138eda2e37546039b7d)
2021-08-06 21:48:00 +00:00
Daniel N. Miller (APD)
1e9f623f3e Do not build shared libs if not supported
(cherry picked from commit 09d7122468fb9b9adf813cf32167ab212511c4d8)
2021-08-06 21:47:37 +00:00
Jens Wehner
4240b480e0 updated documentation for middleCol and middleRow
(cherry picked from commit 4d870c49b7f1b49e34e8044dc6c1131d43e91a44)
2021-08-05 17:53:36 +00:00
Antonio Sanchez
5b83d3c4bc Make inverse 3x3 faster and avoid gcc bug.
There seems to be a gcc 4.7 bug that incorrectly flags the current
3x3 inverse as using uninitialized memory.  I'm *pretty* sure it's
a false positive, but it's hard to trigger.  The same warning
does not trigger with clang or later compiler versions.

In trying to find a work-around, this implementation turns out to be
faster anyways for static-sized matrices.

```
name                                            old cpu/op  new cpu/op  delta
BM_Inverse3x3<DynamicMatrix3T<float>>            423ns ± 2%   433ns ± 3%   +2.32%    (p=0.000 n=98+96)
BM_Inverse3x3<DynamicMatrix3T<double>>           425ns ± 2%   427ns ± 3%   +0.48%    (p=0.003 n=99+96)
BM_Inverse3x3<StaticMatrix3T<float>>            7.10ns ± 2%  0.80ns ± 1%  -88.67%  (p=0.000 n=114+112)
BM_Inverse3x3<StaticMatrix3T<double>>           7.45ns ± 2%  1.34ns ± 1%  -82.01%  (p=0.000 n=105+111)
BM_AliasedInverse3x3<DynamicMatrix3T<float>>     409ns ± 3%   419ns ± 3%   +2.40%   (p=0.000 n=100+98)
BM_AliasedInverse3x3<DynamicMatrix3T<double>>    414ns ± 3%   413ns ± 2%     ~       (p=0.322 n=98+98)
BM_AliasedInverse3x3<StaticMatrix3T<float>>     7.57ns ± 1%  0.80ns ± 1%  -89.37%  (p=0.000 n=111+114)
BM_AliasedInverse3x3<StaticMatrix3T<double>>    9.09ns ± 1%  2.58ns ±41%  -71.60%  (p=0.000 n=113+116)
```


(cherry picked from commit 5ad8b9bfe2bf75620bc89467c5cc051fc2a597df)
2021-08-04 22:06:52 +00:00
Antonio Sanchez
46ecdcd745 Fix MPReal detection and support.
The latest version of `mpreal` has a bug that breaks `min`/`max`.
It also breaks with the latest dev version of `mpfr`. Here we
add `FindMPREAL.cmake` which searches for the library and tests if
compilation works.

Removed our internal copy of `mpreal.h` under `unsupported/test`, as
it is out-of-sync with the latest, and similarly breaks with
the latest `mpfr`.  It would be best to use the installed version
of `mpreal` anyways, since that's what we actually want to test.

Fixes #2282.


(cherry picked from commit 31f796ebef35eeadd0e26878aab3fe99ca412a45)
2021-08-03 18:13:12 +00:00
Antonio Sanchez
9a1691a14e Fix cmake warnings, FindPASTIX/FindPTSCOTCH.
We were getting a lot of warnings due to nested `find_package` calls
within `Find***.cmake` files.  The recommended approach is to use
[`find_dependency`](https://cmake.org/cmake/help/latest/module/CMakeFindDependencyMacro.html)
in package configuration files. I made this change for all instances.

Case mismatches between `Find<Package>.cmake` and calling
`find_package(<PACKAGE>)` also lead to warnings. Fixed for
`FindPASTIX.cmake` and `FindSCOTCH.cmake`.

`FindBLASEXT.cmake` was broken due to calling `find_package_handle_standard_args(BLAS ...)`.
The package name must match, otherwise the `find_package(BLASEXT)` falsely thinks
the package wasn't found.  I changed to `BLASEXT`, but then also copied that value
to `BLAS_FOUND` for compatibility.

`FindPastix.cmake` had a typo that incorrectly added `PTSCOTCH` when looking for
the `SCOTCH` component.

`FindPTSCOTCH` incorrectly added `***-NOTFOUND` to include/library lists,
corrupting them.  This led to cmake errors down-the-line.

Fixes #2288.


(cherry picked from commit 1cdec386530c6b844389b96c199e723a1e4e71c7)
2021-08-03 17:48:20 +00:00
Antonio Sanchez
bb33880e57 Fix TriSycl CMake files.
This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was
broken by commit 00f32752, which modified `add_sycl_to_target` for ComputeCPP.
This makes the corresponding modifications for trisycl to make them consistent.

Also, trisycl now requires c++17.


(cherry picked from commit 8cf6cb27baa9607cc00e5dbb42a1c31efda41b74)
2021-08-03 17:25:17 +00:00
Antonio Sanchez
237c59a2aa Modify scalar pzero, ptrue, pselect, and p<binary> operations to avoid memset.
The `memset` function and bitwise manipulation only apply to POD types
that do not require initialization, otherwise resulting in UB. We currently
violate this in `ptrue` and `pzero`, we assume bitmasks for `pselect`, and
bitwise operations are applied byte-by-byte in the generic implementations.

This is causing issues for scalar types that do require initialization
or that contain non-POD info such as pointers (#2201). We either break
them, or force specializations of these functions for custom scalars,
even if they are not vectorized.

Here we modify these functions for scalars only - instead using only
scalar operations:
- `pzero`: `Scalar(0)` for all scalars.
- `ptrue`: `Scalar(1)` for non-trivial scalars, bitset to one bits for trivial scalars.
- `pselect`: ternary select comparing mask to `Scalar(0)` for all scalars
- `pand`, `por`, `pxor`, `pnot`: use operators `&`, `|`, `^`, `~` for all integer or non-trivial scalars, otherwise apply bytewise.

For non-scalar types, the original implementations are used to maintain
compatibility and minimize the number of changes.

Fixes #2201.


(cherry picked from commit 3d98a6ef5ce0ba85acaee4ffffc53f0f21bd8fd2)
2021-08-03 16:32:59 +00:00
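The scalar rules listed in 237c59a2aa can be sketched in plain C++. This is a simplified illustration under assumptions, not Eigen's implementation: the `*_sketch` names and the `BoxedDouble` type are hypothetical, "trivial" is approximated here with standard type traits, and the bitwise operations (`pand`, `por`, `pxor`, `pnot`) are left out.

```cpp
#include <cstring>
#include <string>
#include <type_traits>

// pzero: Scalar(0) for all scalars, no memset involved.
template <typename Scalar>
Scalar pzero_sketch() { return Scalar(0); }

// ptrue for trivial scalars: set every bit to one.
template <typename Scalar>
Scalar ptrue_sketch_impl(std::true_type) {
  Scalar s;
  std::memset(&s, 0xff, sizeof(Scalar));
  return s;
}

// ptrue for non-trivial scalars: fall back to Scalar(1).
template <typename Scalar>
Scalar ptrue_sketch_impl(std::false_type) { return Scalar(1); }

template <typename Scalar>
Scalar ptrue_sketch() {
  // Assumption: "trivial" means trivially copyable and trivially default constructible.
  typedef std::integral_constant<
      bool, std::is_trivially_copyable<Scalar>::value &&
                std::is_trivially_default_constructible<Scalar>::value>
      is_trivial_scalar;
  return ptrue_sketch_impl<Scalar>(is_trivial_scalar());
}

// pselect: ternary select that compares the mask to Scalar(0).
template <typename Scalar>
Scalar pselect_sketch(const Scalar& mask, const Scalar& a, const Scalar& b) {
  return mask == Scalar(0) ? b : a;
}

// Hypothetical custom scalar that requires initialization (holds a std::string),
// so applying memset to it would be undefined behavior.
struct BoxedDouble {
  std::string label;
  double v;
  explicit BoxedDouble(double x) : label("boxed"), v(x) {}
  bool operator==(const BoxedDouble& other) const { return v == other.v; }
};
```

With these definitions, `ptrue_sketch<BoxedDouble>()` goes through the constructor rather than `memset`, which is the property the commit is after for custom scalar types.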
Antonio Sanchez
3dc42eeaec Enable equality comparisons on GPU.
Since `std::equal_to::operator()` is not a device function, it
fails on GPU.  On my device, I seem to get a silent crash in the
kernel (no reported error, but the kernel does not complete).

Replacing this with a portable version enables comparisons on device.

Addresses #2292 - would need to be cherry-picked.  The 3.3 branch
also requires adding `EIGEN_DEVICE_FUNC` in `BooleanRedux.h` to get
fully working.


(cherry picked from commit 7880f10526a11dc5544426c54c5763de576bf285)
2021-08-03 16:15:44 +00:00
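A portable replacement for `std::equal_to` along the lines of 3dc42eeaec might look like the sketch below. The macro `SKETCH_DEVICE_FUNC` and the functor `equal_to_sketch` are hypothetical names chosen for illustration (Eigen's own macro is `EIGEN_DEVICE_FUNC`); the point is only that the call operator carries the device annotation when compiled by a GPU compiler.

```cpp
// Expands to the host/device qualifiers under CUDA or HIP compilers,
// and to nothing in a plain host build.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

// Hypothetical device-callable equality functor.
template <typename T>
struct equal_to_sketch {
  SKETCH_DEVICE_FUNC bool operator()(const T& a, const T& b) const {
    return a == b;
  }
};
```

On the host this behaves like `std::equal_to<T>`, for example `equal_to_sketch<int>()(3, 3)` is `true`.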
hyunggi-sv
7adc1545b4 fix:typo in dox (has->have)
(cherry picked from commit 02a0e79c701da7aa8dfad79b13cd1e7fae46d634)
2021-08-03 00:54:41 +00:00
Antonio Sanchez
c0c7b695cd Fix assignment operator issue for latest MSVC+NVCC.
Details are scattered across #920, #1000, #1324, #2291.

Summary: some MSVC versions have a bug that requires omitting explicit
`operator=` definitions (leads to duplicate definition errors), and
some MSVC versions require adding explicit `operator=` definitions
(otherwise implicitly deleted errors).  This mess tries to cover
all the cases encountered.

Fixes #2291.


(cherry picked from commit 9816fe59b47dc4c07967b5ee93a8e8aaa6e9c308)
2021-08-03 00:52:21 +00:00
Alexander Karatarakis
c334eece44 _DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier
(cherry picked from commit f357283d3128a6253af09705155ce4f9f113e3c8)
2021-07-29 18:18:47 +00:00
Jonas Harsch
5ccb72b2e4 Fixed typo in TutorialSparse.dox
(cherry picked from commit 5b81764c0f4e06ff12a0c769b1bd876b10ad7502)
2021-07-26 14:33:10 +00:00
arthurfeeney
9c90d5d832 Fixes #1387 for compilation error in JacobiSVD with HouseholderQRPreconditioner that occurs when input is a compile-time row vector.
(cherry picked from commit a77638387dd1aa2d07d2dae240cc30b303b4ef38)
2021-07-22 18:01:55 +00:00
Antonio Sanchez
5d37114fc0 Fix explicit default cache size typo.
(cherry picked from commit 297f0f563d916260665d7fadc017f94f1a5e7a03)
2021-07-20 18:42:25 +00:00
Rohit Santhanam
930696fc53 Enable extract et al. for HIP GPU.
(cherry picked from commit beea14a18f76817439b4d8901d29db2e9c4a24c8)
2021-07-09 16:14:19 +00:00
Rasmus Munk Larsen
56966fd2e6 Defer to std::fill_n when filling a dense object with a constant value.
(cherry picked from commit 0c361c4899c9042d2b25cd60d7826ab464caacb7)
2021-07-09 03:59:56 +00:00
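As a rough illustration of 56966fd2e6, filling contiguous storage with a constant can be delegated to `std::fill_n`. The `DenseBuffer` type and its `set_constant` member below are hypothetical stand-ins for a dense object; they are not Eigen classes.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense object with contiguous storage.
struct DenseBuffer {
  std::vector<double> coeffs;

  explicit DenseBuffer(std::size_t n) : coeffs(n) {}

  // Assign the same value to every coefficient by deferring to std::fill_n,
  // which the standard library is free to optimize for contiguous ranges.
  void set_constant(double value) {
    std::fill_n(coeffs.begin(), coeffs.size(), value);
  }
};
```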
Jonas Harsch
5a3c9eddb4 Removed superfluous boolean degenerate in TensorMorphing.h.
(cherry picked from commit e9c9a3130b7307a240335aa527a6d4c5fb2ee471)
2021-07-08 18:34:10 +00:00
Guoqiang QI
69ec4907da Make a copy of the input matrix when trying to do the inverse in place; this fixes #2285.
(cherry picked from commit 4bcd42c271761dc5341f8e08ca7d357c3614cb01)
2021-07-08 17:07:54 +00:00
Antonio Sanchez
7571704a43 Fix CMake directory issues.
Allows absolute and relative paths for
- `INCLUDE_INSTALL_DIR`
- `CMAKEPACKAGE_INSTALL_DIR`
- `PKGCONFIG_INSTALL_DIR`

Type should be `PATH` not `STRING`.  Contrary to !211, these don't
seem to be made absolute if user-defined - according to the doc any
directories should use `PATH` type, which allows a file dialog
to be used via the GUI.  It also better handles file separators.

If user provides an absolute path, it will be made relative to
`CMAKE_INSTALL_PREFIX` so that the `configure_package_config_file` will
work.

Fixes #2155 and #2269.


(cherry picked from commit f44f05532decf830fcdb07e2a67a2fa4ccbc3870)
2021-07-07 17:44:00 +00:00
Antonio Sanchez
84955d109f Fix Tensor documentation page.
The extra [TOC] tag is generating a huge floating duplicated
table-of-contents, which obscures the majority of the page
(see bottom of https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html).
Remove it.

Also, headers do not support markup (see
[doxygen bug](https://github.com/doxygen/doxygen/issues/7467)), so
backticks like
```
## Constructor `Tensor<double,2>`
```
end up generating titles that look like
```
Constructor <tt>Tensor<double,2></tt>
```
Removing backticks for now.  To generate proper formatted headers, we
must directly use html instead of markdown, i.e.
```
<h2>Constructor <code>Tensor&lt;double,2&gt;</code></h2>
```
which is ugly.

Fixes #2254.


(cherry picked from commit f5a9873bbb5488bcba3e37f92b4ec09a8db76081)
2021-07-07 17:18:20 +00:00
Jonas Harsch
601814b575 Don't crash when attempting to shuffle an empty tensor.
(cherry picked from commit aab747021be5ed1a1e9667243d884eb72003599d)
2021-07-02 21:08:38 +00:00
Rasmus Munk Larsen
05bab8139a Fix breakage of conj_helper in conjunction with custom types introduced in !537.
(cherry picked from commit 7b35638ddb99a0298c5d3450de506a8e8e0203d3)
2021-07-02 20:59:50 +00:00
Chip Kerchner
eebde572d9 Create the ability to disable the specialized gemm_pack_rhs in Eigen (only PPC) for TensorFlow
(cherry picked from commit 91e99ec1e02100d07e35a7abb1b5c76707237219)
2021-07-01 23:32:38 +00:00
Antonio Sanchez
8190739f12 Fix compile issues for gcc 4.8.
- Move constructors can only be defaulted as NOEXCEPT if all members
have NOEXCEPT move constructors.
- gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter.


(cherry picked from commit 6035da5283f12f7e6a49cda0c21696c8e5a115b7)
2021-07-01 23:18:10 +00:00
Antonio Sanchez
b6db013435 Fix inverse nullptr/asan errors for LU.
For empty or single-column matrices, `PartialPivLU`
currently dereferences a `nullptr` or accesses memory out-of-bounds.
Here we adjust the checks to avoid this.


(cherry picked from commit 154f00e9eacaec5667215784c7601b55024e2f61)
2021-07-01 22:57:25 +00:00
Dan Miller
1f6b1c1a1f Fix duplicate definitions on Mac
(cherry picked from commit eb047759030558acf0764d5d2f913f4f84cf85a8)
2021-07-01 20:49:05 +00:00
Alexander Karatarakis
517294d6e1 Make DenseStorage<> trivially_copyable
(cherry picked from commit 60400334a92268272c6bf525da89eec5e99c3e5a)
2021-07-01 20:48:47 +00:00
大河メタル
94e2250b36 Correct declarations for aarch64-pc-windows-msvc
(cherry picked from commit c81da59a252b3479753b2eada26ee0cf46280bd0)
2021-06-30 04:10:04 +00:00
Antonio Sanchez
d82d915047 Modify tensor argmin/argmax to always return first occurrence.
As written, depending on multithreading/gpu, the returned index from
`argmin`/`argmax` is not currently stable.  Here we modify the functors
to always keep the first occurrence (i.e. if the value is equal to the
current min/max, then keep the one with the smallest index).

This is otherwise causing unpredictable results in some TF tests.


(cherry picked from commit 3a087ccb99b454dc34484333e608e836e7032213)
2021-06-29 23:28:37 +00:00
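The tie-breaking rule described in d82d915047 can be sketched as a reduction over (index, value) pairs. The names `IndexedValue`, `argmin_combine`, and `argmin_chunk` are hypothetical and this is not the tensor functor from the commit; it only shows that preferring the smaller index on ties makes the result independent of how partial results are combined.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

struct IndexedValue {
  std::size_t index;
  double value;
};

// Combine two partial results: the smaller value wins, and on a tie
// the smaller index wins, so the first occurrence is always kept.
inline IndexedValue argmin_combine(const IndexedValue& a, const IndexedValue& b) {
  if (b.value < a.value) return b;
  if (b.value == a.value && b.index < a.index) return b;
  return a;
}

// Reduce the half-open range [begin, end) of data, as one thread might do.
inline IndexedValue argmin_chunk(const std::vector<double>& data,
                                 std::size_t begin, std::size_t end) {
  IndexedValue best = {std::numeric_limits<std::size_t>::max(),
                       std::numeric_limits<double>::infinity()};
  for (std::size_t i = begin; i < end; ++i) {
    const IndexedValue candidate = {i, data[i]};
    best = argmin_combine(best, candidate);
  }
  return best;
}
```

For data such as `{3, 1, 2, 1, 5, 1}` split into two chunks, combining the partial results in either order yields index 1.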