2959 Commits

Author SHA1 Message Date
Charles Schlosser
25270e35db Fix compiler warnings in 3.4 2023-12-21 00:57:21 +00:00
Silvio Traversaro
fc5575264f Backport "disambiguate overloads for empty index list" to 3.4 branch 2023-11-10 04:03:11 +00:00
Antonio Sanchez
1217390db4 Fix windows+CUDA builds 2023-10-25 20:55:59 +00:00
Fabian Keßler
d0bfdc1658 optimize cmake scripts for subproject use
(cherry picked from commit 19cacd3ecb9dab73c2dd7bc39d9193e06ba92bdd)
2023-07-26 12:01:28 -07:00
Charles Schlosser
208e44c979 fix warnings in tensorreduction and memory 2023-07-19 16:48:07 +00:00
Antonio Sanchez
ac561cd038 Reduce tensor_contract_gpu test.
The original test times out after 60 minutes on Windows, even when
setting flags to optimize for speed.  Reducing the number of
contractions performed from 3600->27 for subtests 8,9 allows the
two to run in just over a minute each.

(cherry picked from commit be9e7d205f38e3e8effdfdded88817b371673930)
2023-07-11 11:27:31 -07:00
Antonio Sanchez
554982beef Disable Tree reduction for GPU.
For moderately sized inputs, running the Tree reduction quickly
fills/overflows the GPU thread stack space, leading to memory errors.
This was happening in the `cxx11_tensor_complex_gpu` test, for example.
Disabling tree reduction on GPU fixes this.

(cherry picked from commit 24ebb37f38287d65c0e0b60c714e39faffeb5b94)
2023-07-10 16:09:30 -07:00
Antonio Sanchez
89a71f3126 Fix gpu special function tests.
Some checks used incorrect values, partly from copy-paste errors,
partly from the change in behaviour introduced in !398.

Modified results to match scipy, simplified tests by updating
`VERIFY_IS_CWISE_APPROX` to work for scalars.

(cherry picked from commit 701f5d1c91c770e558c7760da14ff3365757e275)
2023-07-10 15:57:08 -07:00
Antonio Sanchez
a605d6b996 Rename EIGEN_CUDA_FLAGS to EIGEN_CUDA_CXX_FLAGS
Also add a missing space for clang.

(cherry picked from commit 846d34384af80b80793d32257a7f917eeece41d4)
2023-07-10 15:30:41 -07:00
Antonio Sanchez
dfcd6de20a Clean up CUDA CMake files.
- Unify test/CMakeLists.txt and unsupported/test/CMakeLists.txt
- Added `EIGEN_CUDA_FLAGS` that are appended to the set of flags passed
to the cuda compiler (nvcc or clang).

The latter is to support passing custom flags (e.g. `-arch=` to nvcc,
or to disable cuda-specific warnings).

(cherry picked from commit 7b00e8b186a7679b0f46be742809a55d07d4efe8)
2023-07-10 15:30:41 -07:00
Antonio Sánchez
26b8fabd80 Return NaN in ndtri for values outside valid input range.
(cherry picked from commit 1f79a6078fb77da47069c8aec23c4e309fb982e2)
2023-07-10 14:52:08 -07:00
Rasmus Munk Larsen
f296720d7d Make sure we return +/-1 above the clamping point for Erf().
(cherry picked from commit b378014fef017a829fb42c7fad15f3764bfb8ef9)
2023-07-10 14:52:08 -07:00
Rasmus Munk Larsen
d4c24eca96 Don't crash on empty tensor contraction.
(cherry picked from commit b0f877f8e01e90a5b0f3a79d46ea234899f8b499)
2023-07-10 14:52:08 -07:00
Antonio Sánchez
72b0759451 Fix arm builds.
(cherry picked from commit 2c8011c2dd72d6c2086b181aad8cbb6204fed5db)
2023-07-10 14:52:08 -07:00
Chip Kerchner
8f1b6198c2 Fix epsilon and dummy_precision values in long double for double doubles. The incorrect values prevented some algorithms from converging on PPC.
(cherry picked from commit 54459214a1b9c67df04bc529474fca1ec9f4c84f)
2023-07-10 14:52:08 -07:00
Antonio Sánchez
669dc8fadf Eliminate bool bitwise warnings.
(cherry picked from commit b8e93bf589fa66da404c66c48dc512b3e7484713)
2023-07-07 15:21:17 -07:00
Antonio Sánchez
ea57f9b78f AutoDiff depends on Core, so include appropriate header.
(cherry picked from commit e1165dbf9a16527ab085bec2749b02096fa1b584)
2023-07-07 15:21:17 -07:00
Antonio Sánchez
f55a112cb1 Fix ODR violations.
(cherry picked from commit bb51d9f4fa3cf1114348b9180640d6da7d3964f9)
2023-07-07 15:21:17 -07:00
Antonio Sánchez
a11bdf3965 Skip f16/bf16 bessel specializations on AVX512 if unavailable.
(cherry picked from commit 8ed3b9dcd6dd2e58ec0ad27438d09a90c72e549a)
2023-07-07 15:21:17 -07:00
Antonio Sánchez
80c5b8b3c3 Fix ambiguous comparisons for c++20 (again again)
(cherry picked from commit 8c2e0e3cb8c6ddcd828d6f1d2062a243c0dc9948)
2023-07-07 15:21:17 -07:00
Antonio Sánchez
af912a7b5c Fix MSVC+CUDA issues.
(cherry picked from commit 5ed7a86ae96d411c450fb190f5a725f38f2aea9d)
2023-07-07 15:21:17 -07:00
Antonio Sanchez
ac78f84b72 Eliminate trace unused warning.
(cherry picked from commit 9bc9992dd37e0379be888186a234b7641af306f7)
2023-07-07 15:06:18 -07:00
Antonio Sánchez
b158fcaa74 Fix edge-case in zeta for large inputs.
(cherry picked from commit 9296bb4b933973365d19b4b71e7d2b205d00a1ad)
2023-07-07 15:06:18 -07:00
Antonio Sánchez
b30a2a527e Remove poor non-convergence checks in NonLinearOptimization.
(cherry picked from commit d819a33bf64c4fce95c55f8e44a68b486f064a79)
2023-07-07 11:50:25 -07:00
Antonio Sanchez
bc1b354b32 Adjust tolerance of matrix_power test for MSVC.
(cherry picked from commit 1c2690ed248327539f7a248ddb12e1690da81b68)
2023-07-07 11:50:02 -07:00
Antonio Sánchez
36be6747e0 Modify test expression to avoid numerical differences (#2402).
(cherry picked from commit ae86a146b1ac9a49bf72e485254c08d237fd094a)
2023-07-07 11:45:56 -07:00
Antonio Sanchez
21e0ad056e Fix ODR failures in TensorRandom.
(cherry picked from commit bded5028a5bd112181b94b2a246ac2c20e671c2f)
2023-07-07 11:43:03 -07:00
Antonio Sánchez
52e545324e Fix ODR violations.
(cherry picked from commit cafeadffef2a7ba41f2da5cf34c38068d74499eb)
2023-07-07 11:37:31 -07:00
Antonio Sánchez
f3aaba8705 Revert "Replace call to FixedDimensions() with a singleton instance of"
This reverts commit 19e6496ce0c52fef33265bca54285ba77b2155be

(cherry picked from commit f7b31f864c0dec7872038cab79f6e677de2ecc71)
2022-04-10 15:34:11 +00:00
Antonio Sanchez
7e3bc4177e Fix tensor broadcast off-by-one error.
Caught by JAX unit tests.  Triggered if broadcast is smaller than packet
size.


(cherry picked from commit ffb78e23a1b3bc232a07773144cfa5fa1759852d)
2021-11-16 18:41:25 +00:00
Nico
71320af66a Fix -Wbitwise-instead-of-logical clang warning
&& and || short-circuit, & and | don't. When both arguments to & or |
are boolean, the short-circuiting version is usually the desired one, so
clang warns on this.

Here, it is inconsequential, so switch to && and || to suppress the warning.

(cherry picked from commit b17bcddbca749f621040990a3efb840046315050)
2021-11-03 23:32:57 +00:00
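The difference between the two operator families in the commit above can be illustrated with a small sketch. The `Counter` type and both helper functions are hypothetical names invented for this illustration; they are not part of Eigen. Note that the bitwise variant is exactly the pattern clang's `-Wbitwise-instead-of-logical` flags.

```cpp
#include <cassert>

// Hypothetical helper: records how many times check() is evaluated.
struct Counter {
  int calls = 0;
  bool check(bool value) {
    ++calls;
    return value;
  }
};

// Bitwise '&' on two bools: both operands are always evaluated.
inline int calls_with_bitwise_and() {
  Counter c;
  const bool result = c.check(false) & c.check(true);
  assert(!result);
  (void)result;  // avoid an unused-variable warning when NDEBUG is set
  return c.calls;
}

// Logical '&&': the right operand is skipped once the left one is false.
inline int calls_with_logical_and() {
  Counter c;
  const bool result = c.check(false) && c.check(true);
  assert(!result);
  (void)result;
  return c.calls;
}
```

When the operands have no side effects, as in the code touched by that commit, both forms give the same result, which is why switching to `&&` and `||` only silences the warning.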
Antonio Sanchez
0ab1f8ec03 Fix broadcasting oob error.
For vectorized 1-dimensional inputs that do not take the special
blocking path (e.g. `std::complex<...>`), there was an
index-out-of-bounds error causing the broadcast size to be
computed incorrectly.  Here we fix this, and make other minor
cleanup changes.

Fixes #2351.


(cherry picked from commit a500da1dc089b08e2f2b3b05a2eb23194425460e)
2021-11-03 23:30:47 +00:00
Antonio Sanchez
f9b2e92040 Remove bad "take" impl that causes g++-11 crash.
For some reason, having `take<n, numeric_list<T>>` for `n > 0` causes
g++-11 to ICE with
```
sorry, unimplemented: unexpected AST of kind nontype_argument_pack
```
It does work with other versions of gcc, and with clang.
I filed a GCC bug
[here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102999).

Technically we should never actually run into this case, since you
can't take n > 0 elements from an empty list.  Commenting it out
allows our Eigen tests to pass.


(cherry picked from commit 8f8c2ba2fe19c6c2e47bbe2fbaf87594642e523d)
2021-11-03 23:26:34 +00:00
Maxiwell S. Garcia
b8cf1ed753 Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h
(cherry picked from commit 09fc0f97b53e22d8fef94acf0fbfeed3717ab906)
2021-09-01 17:26:59 +00:00
Antonio Sanchez
c2b6df6e60 Disable cuda Eigen::half vectorization on host.
All cuda `__half` functions are device-only in CUDA 9, including
conversions. Host-side conversions were added in CUDA 10.
The existing code doesn't build prior to 10.0.

All arithmetic functions are always device-only, so there is
no reason to use vectorization on the host at all.

Modified the code to disable vectorization for `__half` on host,
which required also updating the `TensorReductionGpu` implementation
which previously made assumptions about available packets.


(cherry picked from commit cc3573ab4451853774cd5c3497373d5fe8914774)
2021-08-31 21:23:11 +00:00
jenswehner
338924602d added includes for unordered_map
(cherry picked from commit e3e74001f7c4bf95f0dde572e8a08c5b2918a3ab)
2021-08-10 16:10:03 +00:00
Antonio Sanchez
46ecdcd745 Fix MPReal detection and support.
The latest version of `mpreal` has a bug that breaks `min`/`max`.
It also breaks with the latest dev version of `mpfr`. Here we
add `FindMPREAL.cmake` which searches for the library and tests if
compilation works.

Removed our internal copy of `mpreal.h` under `unsupported/test`, as
it is out-of-sync with the latest, and similarly breaks with
the latest `mpfr`.  It would be best to use the installed version
of `mpreal` anyway, since that's what we actually want to test.

Fixes #2282.


(cherry picked from commit 31f796ebef35eeadd0e26878aab3fe99ca412a45)
2021-08-03 18:13:12 +00:00
Antonio Sanchez
bb33880e57 Fix TriSycl CMake files.
This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was
broken by commit 00f32752, which modified `add_sycl_to_target` for ComputeCPP.
This makes the corresponding modifications for trisycl to make them consistent.

Also, trisycl now requires c++17.


(cherry picked from commit 8cf6cb27baa9607cc00e5dbb42a1c31efda41b74)
2021-08-03 17:25:17 +00:00
Alexander Karatarakis
c334eece44 _DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier
(cherry picked from commit f357283d3128a6253af09705155ce4f9f113e3c8)
2021-07-29 18:18:47 +00:00
Jonas Harsch
5a3c9eddb4 Removed superfluous boolean degenerate in TensorMorphing.h.
(cherry picked from commit e9c9a3130b7307a240335aa527a6d4c5fb2ee471)
2021-07-08 18:34:10 +00:00
Antonio Sanchez
84955d109f Fix Tensor documentation page.
The extra [TOC] tag is generating a huge floating duplicated
table-of-contents, which obscures the majority of the page
(see bottom of https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html).
Remove it.

Also, headers do not support markup (see
[doxygen bug](https://github.com/doxygen/doxygen/issues/7467)), so
backticks like
```
```
end up generating titles that look like
```
Constructor <tt>Tensor<double,2></tt>
```
Removing backticks for now.  To generate proper formatted headers, we
must directly use html instead of markdown, i.e.
```
<h2>Constructor <code>Tensor&lt;double,2&gt;</code></h2>
```
which is ugly.

Fixes #2254.


(cherry picked from commit f5a9873bbb5488bcba3e37f92b4ec09a8db76081)
2021-07-07 17:18:20 +00:00
Jonas Harsch
601814b575 Don't crash when attempting to shuffle an empty tensor.
(cherry picked from commit aab747021be5ed1a1e9667243d884eb72003599d)
2021-07-02 21:08:38 +00:00
Antonio Sanchez
8190739f12 Fix compile issues for gcc 4.8.
- Move constructors can only be defaulted as NOEXCEPT if all members
have NOEXCEPT move constructors.
- gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter.


(cherry picked from commit 6035da5283f12f7e6a49cda0c21696c8e5a115b7)
2021-07-01 23:18:10 +00:00
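The first point of the commit above (a defaulted move constructor can only be `noexcept` if every member's move constructor is) can be checked with standard type traits. This is a minimal sketch; all four struct names are hypothetical and unrelated to the Eigen types that were actually changed.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical member type whose move constructor may throw.
struct ThrowingMove {
  ThrowingMove() = default;
  ThrowingMove(ThrowingMove&&) noexcept(false) {}
};

// Hypothetical member type whose move constructor is noexcept.
struct NothrowMove {
  NothrowMove() = default;
  NothrowMove(NothrowMove&&) noexcept {}
};

// Defaulted without an explicit exception specification:
// the compiler deduces "may throw" from the member.
struct HoldsThrowing {
  ThrowingMove member;
  HoldsThrowing() = default;
  HoldsThrowing(HoldsThrowing&&) = default;
};

// Explicit noexcept on the defaulted move constructor is fine here,
// because the only member is itself nothrow-move-constructible.
struct HoldsNothrow {
  NothrowMove member;
  HoldsNothrow() = default;
  HoldsNothrow(HoldsNothrow&&) noexcept = default;
};

static_assert(!std::is_nothrow_move_constructible<HoldsThrowing>::value,
              "deduced exception specification follows the member");
static_assert(std::is_nothrow_move_constructible<HoldsNothrow>::value,
              "all members are nothrow-movable");
```

Writing `noexcept = default` on `HoldsThrowing` instead would be the mismatch that older compilers reject.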
Antonio Sanchez
d82d915047 Modify tensor argmin/argmax to always return first occurrence.
As written, depending on multithreading/gpu, the returned index from
`argmin`/`argmax` is not currently stable.  Here we modify the functors
to always keep the first occurrence (i.e. if the value is equal to the
current min/max, then keep the one with the smallest index).

This is otherwise causing unpredictable results in some TF tests.


(cherry picked from commit 3a087ccb99b454dc34484333e608e836e7032213)
2021-06-29 23:28:37 +00:00
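The tie-breaking rule described in the commit above can be sketched as a combine step over (index, value) pairs. This is an illustration only, not Eigen's functor code: `IndexValue`, `argmax_combine` and `argmax_first` are hypothetical names, and NaN handling is ignored. The `reverse_order` flag stands in for the arbitrary visiting order that multithreaded or GPU reductions may produce.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical (index, value) pair used by the reduction.
using IndexValue = std::pair<std::ptrdiff_t, float>;

// Keep the larger value; on a tie keep the smaller index, so the result
// does not depend on the order in which elements are combined.
inline IndexValue argmax_combine(const IndexValue& a, const IndexValue& b) {
  if (b.second > a.second) return b;
  if (b.second == a.second && b.first < a.first) return b;
  return a;
}

// Reduce the whole vector, visiting elements front-to-back or back-to-front.
inline std::ptrdiff_t argmax_first(const std::vector<float>& values,
                                   bool reverse_order) {
  assert(!values.empty());
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(values.size());
  const std::ptrdiff_t first = reverse_order ? n - 1 : 0;
  IndexValue best(first, values[static_cast<std::size_t>(first)]);
  for (std::ptrdiff_t k = 1; k < n; ++k) {
    const std::ptrdiff_t i = reverse_order ? n - 1 - k : k;
    best = argmax_combine(
        best, IndexValue(i, values[static_cast<std::size_t>(i)]));
  }
  return best.first;
}
```

Without the index comparison in `argmax_combine`, the two visiting orders would return different indices whenever the maximum value occurs more than once.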
Antonio Sanchez
a2040ef796 Rewrite balancer to avoid overflows.
The previous balancer overflowed for large row/column norms.
Modified to prevent that.

Fixes #2273.


(cherry picked from commit e9ab4278b7aba6f279c964d99ae5a312d12ab04b)
2021-06-21 18:14:53 +00:00
Antonio Sanchez
2d6eaaf687 Fix placement of permanent GPU defines.
(cherry picked from commit 954879183b1e008d7f0fefb97e48a925c4e3fb16)
2021-06-15 19:18:20 +00:00
Rasmus Munk Larsen
47722a66f2 Fix more enum arithmetic.
(cherry picked from commit 13fb5ab92c3226f7b9be20882b0418d53516d35a)
2021-06-15 16:40:35 +00:00
Antonio Sanchez
b5fc69bdd8 Add ability to permanently enable HIP/CUDA gpu* defines.
When using Eigen for gpu, these simplify portability.  If
`EIGEN_PERMANENTLY_ENABLE_GPU_HIP_CUDA_DEFINES` is set, then
we do not undefine them.


(cherry picked from commit 514977f31b1c00b233969f12321a25d859dd1efa)
2021-06-11 17:48:37 +00:00
Antonio Sanchez
4b683b65df Allow custom TENSOR_CONTRACTION_DISPATCH macro.
Currently TF lite needs to hack around with the Tensor headers in order
to customize the contraction dispatch method. Here we add simple `#ifndef`
guards to allow them to provide their own dispatch prior to inclusion.


(cherry picked from commit 6aec83263d32c29f6c5623b9716ec7e367693078)
2021-06-11 17:19:29 +00:00
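The `#ifndef` guard pattern mentioned in the commit above can be sketched in a single file. Everything here is hypothetical: `DEMO_DISPATCH`, `run_default`, `run_custom` and `dispatch_demo` are invented for the illustration, and the real `TENSOR_CONTRACTION_DISPATCH` macro takes different arguments.

```cpp
// User code: provide a custom dispatch BEFORE the "library" part is seen.
#define DEMO_DISPATCH(fn, x) fn##_custom(x)

// ---- Start of the hypothetical library header ----
// The guard leaves an existing user definition untouched and only
// supplies the default when none was provided.
#ifndef DEMO_DISPATCH
#define DEMO_DISPATCH(fn, x) fn##_default(x)
#endif

inline int run_default(int x) { return x; }
inline int run_custom(int x) { return x * 2; }

// Expands to run_custom(x) here, because the user macro was defined first.
inline int dispatch_demo(int x) { return DEMO_DISPATCH(run, x); }
// ---- End of the hypothetical library header ----
```

Removing the first `#define` makes `dispatch_demo` fall back to `run_default`, so users who do not customize anything see no change.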
Rohit Santhanam
cbb6ae6296 Removed dead code from GPU float16 unit test.
(cherry picked from commit c8d40a7bf1915015c991b108cf2cd6a32138fdc8)
2021-06-10 17:16:47 +00:00