eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-16 18:11:47 +08:00

Author	SHA1	Message	Date
Antonio Sanchez	ac561cd038	Reduce tensor_contract_gpu test. The original test times out after 60 minutes on Windows, even when setting flags to optimize for speed. Reducing the number of contractions performed from 3600->27 for subtests 8,9 allows the two to run in just over a minute each. (cherry picked from commit be9e7d205f38e3e8effdfdded88817b371673930)	2023-07-11 11:27:31 -07:00
Antonio Sanchez	554982beef	Disable Tree reduction for GPU. For moderately sized inputs, running the Tree reduction quickly fills/overflows the GPU thread stack space, leading to memory errors. This was happening in the `cxx11_tensor_complex_gpu` test, for example. Disabling tree reduction on GPU fixes this. (cherry picked from commit 24ebb37f38287d65c0e0b60c714e39faffeb5b94)	2023-07-10 16:09:30 -07:00
Antonio Sanchez	89a71f3126	Fix gpu special function tests. Some checks used incorrect values, partly from copy-paste errors, partly from the change in behaviour introduced in !398. Modified results to match scipy, simplified tests by updating `VERIFY_IS_CWISE_APPROX` to work for scalars. (cherry picked from commit 701f5d1c91c770e558c7760da14ff3365757e275)	2023-07-10 15:57:08 -07:00
Antonio Sanchez	a605d6b996	Rename EIGEN_CUDA_FLAGS to EIGEN_CUDA_CXX_FLAGS. Also add a missing space for clang. (cherry picked from commit 846d34384af80b80793d32257a7f917eeece41d4)	2023-07-10 15:30:41 -07:00
Antonio Sanchez	dfcd6de20a	Clean up CUDA CMake files. - Unify test/CMakeLists.txt and unsupported/test/CMakeLists.txt - Added `EIGEN_CUDA_FLAGS` that are appended to the set of flags passed to the cuda compiler (nvcc or clang). The latter is to support passing custom flags (e.g. `-arch=` to nvcc, or to disable cuda-specific warnings). (cherry picked from commit 7b00e8b186a7679b0f46be742809a55d07d4efe8)	2023-07-10 15:30:41 -07:00
Antonio Sánchez	26b8fabd80	Return NaN in ndtri for values outside valid input range. (cherry picked from commit 1f79a6078fb77da47069c8aec23c4e309fb982e2)	2023-07-10 14:52:08 -07:00
Rasmus Munk Larsen	f296720d7d	Make sure we return +/-1 above the clamping point for Erf(). (cherry picked from commit b378014fef017a829fb42c7fad15f3764bfb8ef9)	2023-07-10 14:52:08 -07:00
Rasmus Munk Larsen	d4c24eca96	Don't crash on empty tensor contraction. (cherry picked from commit b0f877f8e01e90a5b0f3a79d46ea234899f8b499)	2023-07-10 14:52:08 -07:00
Antonio Sánchez	72b0759451	Fix arm builds. (cherry picked from commit 2c8011c2dd72d6c2086b181aad8cbb6204fed5db)	2023-07-10 14:52:08 -07:00
Chip Kerchner	8f1b6198c2	Fix epsilon and dummy_precision values in long double for double doubles. Prevented some algorithms from converging on PPC. (cherry picked from commit 54459214a1b9c67df04bc529474fca1ec9f4c84f)	2023-07-10 14:52:08 -07:00
Antonio Sánchez	669dc8fadf	Eliminate bool bitwise warnings. (cherry picked from commit b8e93bf589fa66da404c66c48dc512b3e7484713)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	ea57f9b78f	AutoDiff depends on Core, so include appropriate header. (cherry picked from commit e1165dbf9a16527ab085bec2749b02096fa1b584)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	f55a112cb1	Fix ODR violations. (cherry picked from commit bb51d9f4fa3cf1114348b9180640d6da7d3964f9)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	a11bdf3965	Skip f16/bf16 bessel specializations on AVX512 if unavailable. (cherry picked from commit 8ed3b9dcd6dd2e58ec0ad27438d09a90c72e549a)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	80c5b8b3c3	Fix ambiguous comparisons for c++20 (again again) (cherry picked from commit 8c2e0e3cb8c6ddcd828d6f1d2062a243c0dc9948)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	af912a7b5c	Fix MSVC+CUDA issues. (cherry picked from commit 5ed7a86ae96d411c450fb190f5a725f38f2aea9d)	2023-07-07 15:21:17 -07:00
Antonio Sanchez	ac78f84b72	Eliminate trace unused warning. (cherry picked from commit 9bc9992dd37e0379be888186a234b7641af306f7)	2023-07-07 15:06:18 -07:00
Antonio Sánchez	b158fcaa74	Fix edge-case in zeta for large inputs. (cherry picked from commit 9296bb4b933973365d19b4b71e7d2b205d00a1ad)	2023-07-07 15:06:18 -07:00
Antonio Sánchez	b30a2a527e	Remove poor non-convergence checks in NonLinearOptimization. (cherry picked from commit d819a33bf64c4fce95c55f8e44a68b486f064a79)	2023-07-07 11:50:25 -07:00
Antonio Sanchez	bc1b354b32	Adjust tolerance of matrix_power test for MSVC. (cherry picked from commit 1c2690ed248327539f7a248ddb12e1690da81b68)	2023-07-07 11:50:02 -07:00
Antonio Sánchez	36be6747e0	Modify test expression to avoid numerical differences (#2402 ). (cherry picked from commit ae86a146b1ac9a49bf72e485254c08d237fd094a)	2023-07-07 11:45:56 -07:00
Antonio Sanchez	21e0ad056e	Fix ODR failures in TensorRandom. (cherry picked from commit bded5028a5bd112181b94b2a246ac2c20e671c2f)	2023-07-07 11:43:03 -07:00
Antonio Sánchez	52e545324e	Fix ODR violations. (cherry picked from commit cafeadffef2a7ba41f2da5cf34c38068d74499eb)	2023-07-07 11:37:31 -07:00
Antonio Sánchez	f3aaba8705	Revert "Replace call to FixedDimensions() with a singleton instance of" This reverts commit 19e6496ce0c52fef33265bca54285ba77b2155be (cherry picked from commit f7b31f864c0dec7872038cab79f6e677de2ecc71)	2022-04-10 15:34:11 +00:00
Antonio Sanchez	7e3bc4177e	Fix tensor broadcast off-by-one error. Caught by JAX unit tests. Triggered if broadcast is smaller than packet size. (cherry picked from commit ffb78e23a1b3bc232a07773144cfa5fa1759852d)	2021-11-16 18:41:25 +00:00
Nico	71320af66a	Fix -Wbitwise-instead-of-logical clang warning. && and || short-circuit, & and | don't. When both arguments to those are boolean, the short-circuiting version is usually the desired one, so clang warns on this. Here, it is inconsequential, so switch to && and || to suppress the warning. (cherry picked from commit b17bcddbca749f621040990a3efb840046315050)	2021-11-03 23:32:57 +00:00
Antonio Sanchez	0ab1f8ec03	Fix broadcasting oob error. For vectorized 1-dimensional inputs that do not take the special blocking path (e.g. `std::complex<...>`), there was an index-out-of-bounds error causing the broadcast size to be computed incorrectly. Here we fix this, and make other minor cleanup changes. Fixes #2351. (cherry picked from commit a500da1dc089b08e2f2b3b05a2eb23194425460e)	2021-11-03 23:30:47 +00:00
Antonio Sanchez	f9b2e92040	Remove bad "take" impl that causes g++-11 crash. For some reason, having `take<n, numeric_list<T>>` for `n > 0` causes g++-11 to ICE with ``` sorry, unimplemented: unexpected AST of kind nontype_argument_pack ``` It does work with other versions of gcc, and with clang. I filed a GCC bug [here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102999). Technically we should never actually run into this case, since you can't take n > 0 elements from an empty list. Commenting it out allows our Eigen tests to pass. (cherry picked from commit 8f8c2ba2fe19c6c2e47bbe2fbaf87594642e523d)	2021-11-03 23:26:34 +00:00
Maxiwell S. Garcia	b8cf1ed753	Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h (cherry picked from commit 09fc0f97b53e22d8fef94acf0fbfeed3717ab906)	2021-09-01 17:26:59 +00:00
Antonio Sanchez	c2b6df6e60	Disable cuda Eigen::half vectorization on host. All cuda `__half` functions are device-only in CUDA 9, including conversions. Host-side conversions were added in CUDA 10. The existing code doesn't build prior to 10.0. All arithmetic functions are always device-only, so there's therefore no reason to use vectorization on the host at all. Modified the code to disable vectorization for `__half` on host, which required also updating the `TensorReductionGpu` implementation which previously made assumptions about available packets. (cherry picked from commit cc3573ab4451853774cd5c3497373d5fe8914774)	2021-08-31 21:23:11 +00:00
jenswehner	338924602d	added includes for unordered_map (cherry picked from commit e3e74001f7c4bf95f0dde572e8a08c5b2918a3ab)	2021-08-10 16:10:03 +00:00
Antonio Sanchez	46ecdcd745	Fix MPReal detection and support. The latest version of `mpreal` has a bug that breaks `min`/`max`. It also breaks with the latest dev version of `mpfr`. Here we add `FindMPREAL.cmake` which searches for the library and tests if compilation works. Removed our internal copy of `mpreal.h` under `unsupported/test`, as it is out-of-sync with the latest, and similarly breaks with the latest `mpfr`. It would be best to use the installed version of `mpreal` anyways, since that's what we actually want to test. Fixes #2282. (cherry picked from commit 31f796ebef35eeadd0e26878aab3fe99ca412a45)	2021-08-03 18:13:12 +00:00
Antonio Sanchez	bb33880e57	Fix TriSycl CMake files. This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was broken by commit 00f32752, which modified `add_sycl_to_target` for ComputeCPP. This makes the corresponding modifications for trisycl to make them consistent. Also, trisycl now requires c++17. (cherry picked from commit 8cf6cb27baa9607cc00e5dbb42a1c31efda41b74)	2021-08-03 17:25:17 +00:00
Alexander Karatarakis	c334eece44	_DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier (cherry picked from commit f357283d3128a6253af09705155ce4f9f113e3c8)	2021-07-29 18:18:47 +00:00
Jonas Harsch	5a3c9eddb4	Removed superfluous boolean `degenerate` in TensorMorphing.h. (cherry picked from commit e9c9a3130b7307a240335aa527a6d4c5fb2ee471)	2021-07-08 18:34:10 +00:00
Antonio Sanchez	84955d109f	Fix Tensor documentation page. The extra [TOC] tag is generating a huge floating duplicated table-of-contents, which obscures the majority of the page (see bottom of https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html). Remove it. Also, headers do not support markup (see [doxygen bug](https://github.com/doxygen/doxygen/issues/7467)), so backticks like ``` ``` end up generating titles that look like ``` Constructor <tt>Tensor<double,2></tt> ``` Removing backticks for now. To generate properly formatted headers, we must directly use html instead of markdown, i.e. ``` <h2>Constructor <code>Tensor<double,2></code></h2> ``` which is ugly. Fixes #2254. (cherry picked from commit f5a9873bbb5488bcba3e37f92b4ec09a8db76081)	2021-07-07 17:18:20 +00:00
Jonas Harsch	601814b575	Don't crash when attempting to shuffle an empty tensor. (cherry picked from commit aab747021be5ed1a1e9667243d884eb72003599d)	2021-07-02 21:08:38 +00:00
Antonio Sanchez	8190739f12	Fix compile issues for gcc 4.8. - Move constructors can only be defaulted as NOEXCEPT if all members have NOEXCEPT move constructors. - gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter. (cherry picked from commit 6035da5283f12f7e6a49cda0c21696c8e5a115b7)	2021-07-01 23:18:10 +00:00
Antonio Sanchez	d82d915047	Modify tensor argmin/argmax to always return first occurrence. As written, depending on multithreading/gpu, the returned index from `argmin`/`argmax` is not currently stable. Here we modify the functors to always keep the first occurrence (i.e. if the value is equal to the current min/max, then keep the one with the smallest index). This is otherwise causing unpredictable results in some TF tests. (cherry picked from commit 3a087ccb99b454dc34484333e608e836e7032213)	2021-06-29 23:28:37 +00:00
Antonio Sanchez	a2040ef796	Rewrite balancer to avoid overflows. The previous balancer overflowed for large row/column norms. Modified to prevent that. Fixes #2273. (cherry picked from commit e9ab4278b7aba6f279c964d99ae5a312d12ab04b)	2021-06-21 18:14:53 +00:00
Antonio Sanchez	2d6eaaf687	Fix placement of permanent GPU defines. (cherry picked from commit 954879183b1e008d7f0fefb97e48a925c4e3fb16)	2021-06-15 19:18:20 +00:00
Rasmus Munk Larsen	47722a66f2	Fix more enum arithmetic. (cherry picked from commit 13fb5ab92c3226f7b9be20882b0418d53516d35a)	2021-06-15 16:40:35 +00:00
Antonio Sanchez	b5fc69bdd8	Add ability to permanently enable HIP/CUDA gpu* defines. When using Eigen for gpu, these simplify portability. If `EIGEN_PERMANENTLY_ENABLE_GPU_HIP_CUDA_DEFINES` is set, then we do not undefine them. (cherry picked from commit 514977f31b1c00b233969f12321a25d859dd1efa)	2021-06-11 17:48:37 +00:00
Antonio Sanchez	4b683b65df	Allow custom TENSOR_CONTRACTION_DISPATCH macro. Currently TF lite needs to hack around with the Tensor headers in order to customize the contraction dispatch method. Here we add simple `#ifndef` guards to allow them to provide their own dispatch prior to inclusion. (cherry picked from commit 6aec83263d32c29f6c5623b9716ec7e367693078)	2021-06-11 17:19:29 +00:00
Rohit Santhanam	cbb6ae6296	Removed dead code from GPU float16 unit test. (cherry picked from commit c8d40a7bf1915015c991b108cf2cd6a32138fdc8)	2021-06-10 17:16:47 +00:00
Nathan Luehr	82f13830e6	Fix calls to device functions from host code (cherry picked from commit 972cf0c28a8d2ee0808c1277dea2c5c206591ce6)	2021-05-12 17:01:45 +00:00
Antonio Sanchez	25424f4cf1	Clean up gpu device properties. Made a class and singleton to encapsulate initialization and retrieval of device properties. Related to !481, which already changed the API to address a static linkage issue. (cherry picked from commit 0eba8a1fe3e0fa78f0e6760c0e1265817491845d)	2021-05-07 18:13:40 +00:00
Antonio Sanchez	da19f7a910	Simplify TensorRandom and remove time-dependence. Time-dependence prevents tests from being repeatable. This has long been an issue with debugging the tensor tests. Removing this will allow future tests to be repeatable in the usual way. Also, the recently added macros in !476 are causing headaches across different platforms. For example, checking `_XOPEN_SOURCE` is leading to multiple ambiguous macro errors across Google, and `_DEFAULT_SOURCE`/`_SVID_SOURCE`/`_BSD_SOURCE` are sometimes defined with values, sometimes defined as empty, and sometimes not defined at all when they probably should be. This is leading to multiple build breakages. The simplest approach is to generate a seed via `Eigen::internal::random<uint64_t>()` if on CPU. For GPU, we use a hash based on the current thread ID (since `rand()` isn't supported on GPU). Fixes #1602. (cherry picked from commit e3b7f59659689015aa254ed67c48d870831f086f)	2021-05-05 23:37:48 +00:00
Turing Eret	baf601a0e3	Fix for issue with static global variables in TensorDeviceGpu.h. m_deviceProperties and m_devicePropInitialized are defined as global statics, which creates a separate copy in each translation unit. This can cause issues if initializeDeviceProp() is called in one translation unit and then m_deviceProperties is used in a different translation unit. Added inline functions getDeviceProperties() and getDevicePropInitialized(), which define those variables as static locals. As per the C++ standard 7.1.2/4, a static local declared in an inline function always refers to the same object, so this should be safer. Credit to Sun Chenggen for this fix. This fixes issue #1475. (cherry picked from commit 3804ca0d905a0a03357db50abc7468f5f90abc98)	2021-04-23 19:06:16 +00:00
Antonio Sanchez	587a691516	Check existence of BSD random before use. `TensorRandom` currently relies on BSD `random()`, which is not always available. The [linux manpage](https://man7.org/linux/man-pages/man3/srandom.3.html) gives the glibc condition: ``` _XOPEN_SOURCE >= 500 || /* Glibc since 2.19: */ _DEFAULT_SOURCE || /* Glibc <= 2.19: */ _SVID_SOURCE || _BSD_SOURCE ``` In particular, this was failing to compile for MinGW via msys2. If not available, we fall back to using `rand()`. (cherry picked from commit 045c0609b5c059974104f29dad91bcc3828e91ac)	2021-04-23 00:35:05 +00:00
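
The tie-breaking rule described in commit d82d915047 (keep the smallest index when values are equal) can be illustrated with a small standalone sketch. This is not Eigen's functor code: `IndexedValue`, `argmax_combine`, and `argmax_range` are hypothetical names chosen for illustration, and NaN handling is ignored. The point is that a combine step with an index tie-break gives the same answer regardless of how partial results from threads or GPU blocks are merged.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical (index, value) pair; not Eigen's actual internal type.
using IndexedValue = std::pair<std::ptrdiff_t, float>;

// Combine two candidates: prefer the larger value, and on a tie keep
// the candidate with the smaller index. Because ties are resolved by
// index rather than by arrival order, the result does not depend on
// the order in which partial results are merged.
inline IndexedValue argmax_combine(const IndexedValue& a, const IndexedValue& b) {
  if (a.second > b.second) return a;
  if (b.second > a.second) return b;
  return (a.first <= b.first) ? a : b;
}

// Sequentially reduce the non-empty half-open range [begin, end) of data.
inline IndexedValue argmax_range(const std::vector<float>& data,
                                 std::size_t begin, std::size_t end) {
  assert(begin < end && end <= data.size());
  IndexedValue best(static_cast<std::ptrdiff_t>(begin), data[begin]);
  for (std::size_t i = begin + 1; i < end; ++i) {
    best = argmax_combine(best, IndexedValue(static_cast<std::ptrdiff_t>(i), data[i]));
  }
  return best;
}
```

Splitting the input into chunks, reducing each chunk with `argmax_range`, and merging the partial results with `argmax_combine` in either order yields the index of the first maximum.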

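The pattern described in commit baf601a0e3 (replacing namespace-scope `static` variables in a header with static locals inside inline functions) can be sketched as follows. The names `DeviceProps`, `device_props`, and `initialize_device_props` are hypothetical stand-ins, not the actual contents of TensorDeviceGpu.h. A namespace-scope `static` in a header gives each translation unit its own copy, whereas a static local in an inline function is a single object shared by every translation unit that includes the header.

```cpp
#include <cassert>

// Hypothetical stand-in for the cached device properties.
struct DeviceProps {
  int device_count = 0;
  bool initialized = false;
};

// Problematic form (shown only as a comment): placed in a header, this
// creates one independent copy per translation unit.
//   static DeviceProps g_props;

// Safer form: the static local inside an inline function refers to the
// same object in every translation unit.
inline DeviceProps& device_props() {
  static DeviceProps props;
  return props;
}

// Initialize once; later calls leave the stored values unchanged.
inline void initialize_device_props(int count) {
  DeviceProps& p = device_props();
  if (!p.initialized) {
    p.device_count = count;
    p.initialized = true;
  }
}
```

This sketch does no locking, so concurrent first-time initialization would need additional synchronization.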

2954 Commits