1208 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
3c4637640b Remove unused typedef. 2022-09-23 19:11:31 +00:00
Romain Biessy
2f7cce2dd5 [SYCL] Fix some SYCL tests 2022-08-16 17:37:54 +00:00
Julian Kent
69714ff613 Add Sparse Subset of Matrix Inverse 2022-07-28 18:04:35 +00:00
Antonio Sanchez
0e18714167 Fix clang-tidy warnings about function definitions in headers. 2022-06-24 15:10:58 +00:00
Oleg Shirokobrod
f542b0a71f Adding an MKL adapter in FFT module. 2022-06-02 18:10:43 +00:00
Tobias Wood
a9868bd5be Add arg() to tensor 2022-05-20 03:33:01 +00:00
Guoqiang QI
00b75375e7 Adding PocketFFT support in FFT module since kissfft has some flaw in accuracy and performance 2022-05-11 17:44:22 +00:00
Erik Schultheis
b9d2900e8f added a missing typename and fixed a unused typedef warning 2022-03-24 12:07:18 +02:00
Essex Edwards
cd3c81c3bc Add a NNLS solver to unsupported - issue #655 2022-03-23 20:20:44 +00:00
Erik Schultheis
421cbf0866 Replace Eigen type metaprogramming with corresponding std types and make use of alias templates 2022-03-16 16:43:40 +00:00
Antonio Sánchez
9296bb4b93 Fix edge-case in zeta for large inputs. 2022-03-08 21:21:20 +00:00
Antonio Sánchez
008ff3483a Fix broken tensor executor test, allow tensor packets of size 1. 2022-03-07 20:30:37 +00:00
Antonio Sánchez
d819a33bf6 Remove poor non-convergence checks in NonLinearOptimization. 2022-03-02 19:31:20 +00:00
Antonio Sanchez
1c2690ed24 Adjust tolerance of matrix_power test for MSVC. 2022-03-01 23:33:05 +00:00
Antonio Sánchez
ae86a146b1 Modify test expression to avoid numerical differences (#2402). 2022-02-23 16:37:03 +00:00
Romain Biessy
2dd879d4b0 [SYCL] Fix CMake for SYCL support 2022-02-22 16:53:27 +00:00
Rasmus Munk Larsen
18eab8f997 Add convenience method constexpr std::size_t size() const to Eigen::IndexList 2022-02-12 04:23:03 +00:00
Rasmus Munk Larsen
ea2c02060c Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512. 2022-01-21 23:49:18 +00:00
Erik Schultheis
970640519b Cleanup 2022-01-21 01:48:59 +00:00
Kolja Brix
8d81a2339c Reduce usage of reserved names 2022-01-10 20:53:29 +00:00
Jens Wehner
c6fa0ca162 Idrsstabl 2021-12-06 20:00:00 +00:00
Erik Schultheis
cc11e240ac Some further cleanup 2021-12-06 18:01:15 +00:00
Jens Wehner
f63c6dd1f9 Bicgstabl 2021-12-02 22:48:22 +00:00
Xinle Liu
7ef5f0641f Remove macro EIGEN_GPU_TEST_C99_MATH
Remove macro EIGEN_GPU_TEST_C99_MATH which is used in a single test file only and always defaults to true.
2021-12-01 14:48:56 +00:00
Erik Schultheis
ec2fd0f7ed Require recent GCC and MSCV and removed EIGEN_HAS_CXX14 and some other feature test macros 2021-12-01 00:48:34 +00:00
Erik Schultheis
4a76880351 Updated CMake
This patch updates the minimum required CMake version to 3.10 and removes the EIGEN_TEST_CXX11 CMake option, including corresponding logic.
2021-11-29 20:24:20 +00:00
Erik Schultheis
f33a31b823 removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks 2021-11-29 19:18:57 +00:00
David Tellenbach
08da52eb85 Remove DenseBase::nonZeros() which just calls DenseBase::size()
Fixes #2382.
2021-11-27 14:31:00 +00:00
Erik Schultheis
ec4efbd696 remove EIGEN_HAS_CXX11 2021-11-24 20:08:49 +00:00
Rasmus Munk Larsen
96aeffb013 Make the new TensorIO implementation work with TensorMap with const elements. 2021-11-17 18:16:04 -08:00
cpp977
f73c95c032 Reimplemented the Tensor stream output. 2021-11-16 17:36:58 +00:00
Ben Barsdell
50df8d3d6d Avoid integer overflow in EigenMetaKernel indexing
- The current implementation computes `size + total_threads`, which can
  overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to
  the maximum representable value.
- The num_blocks calculation can also overflow due to the implementation
  of divup().
- This patch prevents these overflows and allows the kernel to work
  correctly for the full representable range of tensor sizes.
- Also adds relevant tests.
2021-11-05 16:39:37 +11:00
Rasmus Munk Larsen
55e3ae02ac Compare summation results against forward error bound. 2021-11-04 18:04:04 -07:00
Antonio Sanchez
f6c8cc0e99 Fix TensorReduction warnings and error bound for sum accuracy test.
The sum accuracy test currently uses the default test precision for
the given scalar type.  However, scalars are generated via a normal
distribution, and given a large enough count and strong enough random
generator, the expected sum is zero.  This causes the test to
periodically fail.

Here we estimate an upper-bound for the error as `sqrt(N) * prec` for
summing N values, with each having an approximate epsilon of `prec`.

Also fixed a few warnings generated by MSVC when compiling the
reduction test.
2021-10-30 14:59:00 -07:00
Rohit Santhanam
48e40b22bf Preliminary HIP bfloat16 GPU support. 2021-10-27 18:36:45 +00:00
Antonio Sánchez
185ad0e610 Revert "Avoid integer overflow in EigenMetaKernel indexing"
This reverts commit 100d7caf920920657c5c559e6b1e0a322d0d1f98
2021-10-27 14:55:25 +00:00
Ben Barsdell
100d7caf92 Avoid integer overflow in EigenMetaKernel indexing
- The current implementation computes `size + total_threads`, which can
  overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to
  the maximum representable value.
- The num_blocks calculation can also overflow due to the implementation
  of divup().
- This patch prevents these overflows and allows the kernel to work
  correctly for the full representable range of tensor sizes.
- Also adds relevant tests.
2021-10-26 00:04:28 +00:00
Antonio Sanchez
a500da1dc0 Fix broadcasting oob error.
For vectorized 1-dimensional inputs that do not take the special
blocking path (e.g. `std::complex<...>`), there was an
index-out-of-bounds error causing the broadcast size to be
computed incorrectly.  Here we fix this, and make other minor
cleanup changes.

Fixes #2351.
2021-10-25 19:31:12 +00:00
Rasmus Munk Larsen
360290fc42 Improve accuracy of full tensor reduction for half and bfloat16 by reducing leaf size in tree reduction.
Add more unit tests for summation accuracy.
2021-10-20 19:54:06 +00:00
Antonio Sanchez
be9e7d205f Reduce tensor_contract_gpu test.
The original test times out after 60 minutes on Windows, even when
setting flags to optimize for speed.  Reducing the number of
contractions performed from 3600->27 for subtests 8,9 allow the
two to run in just over a minute each.
2021-10-02 04:36:15 +00:00
Antonio Sanchez
701f5d1c91 Fix gpu special function tests.
Some checks used incorrect values, partly from copy-paste errors,
partly from the change in behaviour introduced in !398.

Modified results to match scipy, simplified tests by updating
`VERIFY_IS_CWISE_APPROX` to work for scalars.
2021-10-01 10:20:50 -07:00
Antonio Sanchez
de218b471d Add -arch=<arch> argument for nvcc.
Without this flag, when compiling with nvcc, if the compute architecture of a card does
not exactly match any of those listed for `-gencode arch=compute_<arch>,code=sm_<arch>`,
then the kernel will fail to run with:
```
cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device.
```
This can happen, for example, when compiling with an older cuda version
that does not support a newer architecture (e.g. T4 is `sm_75`, but cuda
9.2 only supports up to `sm_70`).

With the `-arch=<arch>` flag, the code will compile and run at the
supplied architecture.
2021-09-24 20:48:01 -07:00
Antonio Sanchez
846d34384a Rename EIGEN_CUDA_FLAGS to EIGEN_CUDA_CXX_FLAGS
Also add a missing space for clang.
2021-09-24 20:15:55 -07:00
Antonio Sanchez
7b00e8b186 Clean up CUDA CMake files.
- Unify test/CMakeLists.txt and unsupported/test/CMakeLists.txt
- Added `EIGEN_CUDA_FLAGS` that are appended to the set of flags passed
to the cuda compiler (nvcc or clang).

The latter is to support passing custom flags (e.g. `-arch=` to nvcc,
or to disable cuda-specific warnings).
2021-09-24 14:43:59 -07:00
Kolja Brix
afa616bc9e Fix some typos found 2021-09-23 15:22:00 +00:00
Rasmus Munk Larsen
6cadab6896 Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert. 2021-09-16 20:43:54 +00:00
Antonio Sanchez
eea2a3385c Remove more DynamicSparseMatrix references.
Also fixed some typos in SparseExtra/MarketIO.h.
2021-09-02 15:36:47 -07:00
Jens Wehner
8286073c73 Matrixmarket extension 2021-09-02 17:23:33 +00:00
Antonio Sanchez
74da2e6821 Rename Tuple -> Pair.
This is to make way for a new `Tuple` class that mimics `std::tuple`,
but can be reliably used on device and with aligned Eigen types.

The existing Tuple has very few references, and is actually an
analogue of `std::pair`.
2021-09-02 02:20:54 +00:00
Maxiwell S. Garcia
09fc0f97b5 Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h 2021-09-01 16:42:51 +00:00