eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-09 14:41:49 +08:00

Author	SHA1	Message	Date
ChipKerchner	13d7658c5d	Fix errors on older compilers (gcc 7.5 - lack of vec_neg, clang10 - can not use const pointers with vec_xl). (cherry picked from commit 413bc491f1721afdb9802553b13a5b7aba67ed3b)	2021-08-10 20:40:54 +00:00
jenswehner	338924602d	added includes for unordered_map (cherry picked from commit e3e74001f7c4bf95f0dde572e8a08c5b2918a3ab)	2021-08-10 16:10:03 +00:00
Gauri Deshpande	93bff85a42	remove denormal flushing in fp32tobf16 for avx & avx512 (cherry picked from commit e6a5a594a7f3cbe2f9843d4ef57a10d478cbb818)	2021-08-09 22:15:42 +00:00
Rasmus Munk Larsen	4e0357c6dd	Avoid memory allocation in tridiagonalization_inplace_selector::run. (cherry picked from commit a5a7faeb455efd7f6edb1138eda2e37546039b7d)	2021-08-06 21:48:00 +00:00
Daniel N. Miller (APD)	1e9f623f3e	Do not build shared libs if not supported (cherry picked from commit 09d7122468fb9b9adf813cf32167ab212511c4d8)	2021-08-06 21:47:37 +00:00
Jens Wehner	4240b480e0	updated documentation for middleCol and middleRow (cherry picked from commit 4d870c49b7f1b49e34e8044dc6c1131d43e91a44)	2021-08-05 17:53:36 +00:00
Antonio Sanchez	5b83d3c4bc	Make inverse 3x3 faster and avoid gcc bug. There seems to be a gcc 4.7 bug that incorrectly flags the current 3x3 inverse as using uninitialized memory. I'm pretty sure it's a false positive, but it's hard to trigger. The same warning does not trigger with clang or later compiler versions. In trying to find a work-around, this implementation turns out to be faster anyways for static-sized matrices. ``` name old cpu/op new cpu/op delta BM_Inverse3x3<DynamicMatrix3T<float>> 423ns ± 2% 433ns ± 3% +2.32% (p=0.000 n=98+96) BM_Inverse3x3<DynamicMatrix3T<double>> 425ns ± 2% 427ns ± 3% +0.48% (p=0.003 n=99+96) BM_Inverse3x3<StaticMatrix3T<float>> 7.10ns ± 2% 0.80ns ± 1% -88.67% (p=0.000 n=114+112) BM_Inverse3x3<StaticMatrix3T<double>> 7.45ns ± 2% 1.34ns ± 1% -82.01% (p=0.000 n=105+111) BM_AliasedInverse3x3<DynamicMatrix3T<float>> 409ns ± 3% 419ns ± 3% +2.40% (p=0.000 n=100+98) BM_AliasedInverse3x3<DynamicMatrix3T<double>> 414ns ± 3% 413ns ± 2% ~ (p=0.322 n=98+98) BM_AliasedInverse3x3<StaticMatrix3T<float>> 7.57ns ± 1% 0.80ns ± 1% -89.37% (p=0.000 n=111+114) BM_AliasedInverse3x3<StaticMatrix3T<double>> 9.09ns ± 1% 2.58ns ±41% -71.60% (p=0.000 n=113+116) ``` (cherry picked from commit 5ad8b9bfe2bf75620bc89467c5cc051fc2a597df)	2021-08-04 22:06:52 +00:00
Antonio Sanchez	46ecdcd745	Fix MPReal detection and support. The latest version of `mpreal` has a bug that breaks `min`/`max`. It also breaks with the latest dev version of `mpfr`. Here we add `FindMPREAL.cmake` which searches for the library and tests if compilation works. Removed our internal copy of `mpreal.h` under `unsupported/test`, as it is out-of-sync with the latest, and similarly breaks with the latest `mpfr`. It would be best to use the installed version of `mpreal` anyways, since that's what we actually want to test. Fixes #2282. (cherry picked from commit 31f796ebef35eeadd0e26878aab3fe99ca412a45)	2021-08-03 18:13:12 +00:00
Antonio Sanchez	9a1691a14e	Fix cmake warnings, FindPASTIX/FindPTSCOTCH. We were getting a lot of warnings due to nested `find_package` calls within `Find*.cmake` files. The recommended approach is to use [`find_dependency`](https://cmake.org/cmake/help/latest/module/CMakeFindDependencyMacro.html) in package configuration files. I made this change for all instances. Case mismatches between `Find<Package>.cmake` and calling `find_package(<PACKAGE>)` also lead to warnings. Fixed for `FindPASTIX.cmake` and `FindSCOTCH.cmake`. `FindBLASEXT.cmake` was broken due to calling `find_package_handle_standard_args(BLAS ...)`. The package name must match, otherwise the `find_package(BLASEXT)` falsely thinks the package wasn't found. I changed to `BLASEXT`, but then also copied that value to `BLAS_FOUND` for compatibility. `FindPastix.cmake` had a typo that incorrectly added `PTSCOTCH` when looking for the `SCOTCH` component. `FindPTSCOTCH` incorrectly added `*-NOTFOUND` to include/library lists, corrupting them. This led to cmake errors down-the-line. Fixes #2288. (cherry picked from commit 1cdec386530c6b844389b96c199e723a1e4e71c7)	2021-08-03 17:48:20 +00:00
Antonio Sanchez	bb33880e57	Fix TriSycl CMake files. This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was broken by commit 00f32752, which modified `add_sycl_to_target` for ComputeCPP. This makes the corresponding modifications for trisycl to make them consistent. Also, trisycl now requires c++17. (cherry picked from commit 8cf6cb27baa9607cc00e5dbb42a1c31efda41b74)	2021-08-03 17:25:17 +00:00
Antonio Sanchez	237c59a2aa	Modify scalar pzero, ptrue, pselect, and p<binary> operations to avoid memset. The `memset` function and bitwise manipulation only apply to POD types that do not require initialization, otherwise resulting in UB. We currently violate this in `ptrue` and `pzero`, we assume bitmasks for `pselect`, and bitwise operations are applied byte-by-byte in the generic implementations. This is causing issues for scalar types that do require initialization or that contain non-POD info such as pointers (#2201). We either break them, or force specializations of these functions for custom scalars, even if they are not vectorized. Here we modify these functions for scalars only - instead using only scalar operations: - `pzero`: `Scalar(0)` for all scalars. - `ptrue`: `Scalar(1)` for non-trivial scalars, bitset to one bits for trivial scalars. - `pselect`: ternary select comparing mask to `Scalar(0)` for all scalars - `pand`, `por`, `pxor`, `pnot`: use operators `&`, `|`, `^`, `~` for all integer or non-trivial scalars, otherwise apply bytewise. For non-scalar types, the original implementations are used to maintain compatibility and minimize the number of changes. Fixes #2201. (cherry picked from commit 3d98a6ef5ce0ba85acaee4ffffc53f0f21bd8fd2)	2021-08-03 16:32:59 +00:00
Antonio Sanchez	3dc42eeaec	Enable equality comparisons on GPU. Since `std::equal_to::operator()` is not a device function, it fails on GPU. On my device, I seem to get a silent crash in the kernel (no reported error, but the kernel does not complete). Replacing this with a portable version enables comparisons on device. Addresses #2292 - would need to be cherry-picked. The 3.3 branch also requires adding `EIGEN_DEVICE_FUNC` in `BooleanRedux.h` to get fully working. (cherry picked from commit 7880f10526a11dc5544426c54c5763de576bf285)	2021-08-03 16:15:44 +00:00
hyunggi-sv	7adc1545b4	fix:typo in dox (has->have) (cherry picked from commit 02a0e79c701da7aa8dfad79b13cd1e7fae46d634)	2021-08-03 00:54:41 +00:00
Antonio Sanchez	c0c7b695cd	Fix assignment operator issue for latest MSVC+NVCC. Details are scattered across #920, #1000, #1324, #2291. Summary: some MSVC versions have a bug that requires omitting explicit `operator=` definitions (leads to duplicate definition errors), and some MSVC versions require adding explicit `operator=` definitions (otherwise implicitly deleted errors). This mess tries to cover all the cases encountered. Fixes #2291. (cherry picked from commit 9816fe59b47dc4c07967b5ee93a8e8aaa6e9c308)	2021-08-03 00:52:21 +00:00
Alexander Karatarakis	c334eece44	_DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier (cherry picked from commit f357283d3128a6253af09705155ce4f9f113e3c8)	2021-07-29 18:18:47 +00:00
Jonas Harsch	5ccb72b2e4	Fixed typo in TutorialSparse.dox (cherry picked from commit 5b81764c0f4e06ff12a0c769b1bd876b10ad7502)	2021-07-26 14:33:10 +00:00
arthurfeeney	9c90d5d832	Fixes #1387 for compilation error in JacobiSVD with HouseholderQRPreconditioner that occurs when input is a compile-time row vector. (cherry picked from commit a77638387dd1aa2d07d2dae240cc30b303b4ef38)	2021-07-22 18:01:55 +00:00
Antonio Sanchez	5d37114fc0	Fix explicit default cache size typo. (cherry picked from commit 297f0f563d916260665d7fadc017f94f1a5e7a03)	2021-07-20 18:42:25 +00:00
Rohit Santhanam	930696fc53	Enable extract et al. for HIP GPU. (cherry picked from commit beea14a18f76817439b4d8901d29db2e9c4a24c8)	2021-07-09 16:14:19 +00:00
Rasmus Munk Larsen	56966fd2e6	Defer to std::fill_n when filling a dense object with a constant value. (cherry picked from commit 0c361c4899c9042d2b25cd60d7826ab464caacb7)	2021-07-09 03:59:56 +00:00
Jonas Harsch	5a3c9eddb4	Removed superfluous boolean `degenerate` in TensorMorphing.h. (cherry picked from commit e9c9a3130b7307a240335aa527a6d4c5fb2ee471)	2021-07-08 18:34:10 +00:00
Guoqiang QI	69ec4907da	Make a copy of the input matrix when trying to do the inverse in place. This fixes #2285. (cherry picked from commit 4bcd42c271761dc5341f8e08ca7d357c3614cb01)	2021-07-08 17:07:54 +00:00
Antonio Sanchez	7571704a43	Fix CMake directory issues. Allows absolute and relative paths for - `INCLUDE_INSTALL_DIR` - `CMAKEPACKAGE_INSTALL_DIR` - `PKGCONFIG_INSTALL_DIR` Type should be `PATH` not `STRING`. Contrary to !211, these don't seem to be made absolute if user-defined - according to the doc any directories should use `PATH` type, which allows a file dialog to be used via the GUI. It also better handles file separators. If user provides an absolute path, it will be made relative to `CMAKE_INSTALL_PREFIX` so that the `configure_packet_config_file` will work. Fixes #2155 and #2269. (cherry picked from commit f44f05532decf830fcdb07e2a67a2fa4ccbc3870)	2021-07-07 17:44:00 +00:00
Antonio Sanchez	84955d109f	Fix Tensor documentation page. The extra [TOC] tag is generating a huge floating duplicated table-of-contents, which obscures the majority of the page (see bottom of https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html). Remove it. Also, headers do not support markup (see [doxygen bug](https://github.com/doxygen/doxygen/issues/7467)), so backticks like ``` ``` end up generating titles that looks like ``` Constructor <tt>Tensor<double,2></tt> ``` Removing backticks for now. To generate proper formatted headers, we must directly use html instead of markdown, i.e. ``` <h2>Constructor <code>Tensor<double,2></code></h2> ``` which is ugly. Fixes #2254. (cherry picked from commit f5a9873bbb5488bcba3e37f92b4ec09a8db76081)	2021-07-07 17:18:20 +00:00
Jonas Harsch	601814b575	Don't crash when attempting to shuffle an empty tensor. (cherry picked from commit aab747021be5ed1a1e9667243d884eb72003599d)	2021-07-02 21:08:38 +00:00
Rasmus Munk Larsen	05bab8139a	Fix breakage of conj_helper in conjunction with custom types introduced in !537. (cherry picked from commit 7b35638ddb99a0298c5d3450de506a8e8e0203d3)	2021-07-02 20:59:50 +00:00
Chip Kerchner	eebde572d9	Create the ability to disable the specialized gemm_pack_rhs in Eigen (only PPC) for TensorFlow (cherry picked from commit 91e99ec1e02100d07e35a7abb1b5c76707237219)	2021-07-01 23:32:38 +00:00
Antonio Sanchez	8190739f12	Fix compile issues for gcc 4.8. - Move constructors can only be defaulted as NOEXCEPT if all members have NOEXCEPT move constructors. - gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter. (cherry picked from commit 6035da5283f12f7e6a49cda0c21696c8e5a115b7)	2021-07-01 23:18:10 +00:00
Antonio Sanchez	b6db013435	Fix inverse nullptr/asan errors for LU. For empty or single-column matrices, the current `PartialPivLU` currently dereferences a `nullptr` or accesses memory out-of-bounds. Here we adjust the checks to avoid this. (cherry picked from commit 154f00e9eacaec5667215784c7601b55024e2f61)	2021-07-01 22:57:25 +00:00
Dan Miller	1f6b1c1a1f	Fix duplicate definitions on Mac (cherry picked from commit eb047759030558acf0764d5d2f913f4f84cf85a8)	2021-07-01 20:49:05 +00:00
Alexander Karatarakis	517294d6e1	Make DenseStorage<> trivially_copyable (cherry picked from commit 60400334a92268272c6bf525da89eec5e99c3e5a)	2021-07-01 20:48:47 +00:00
大河メタル	94e2250b36	Correct declarations for aarch64-pc-windows-msvc (cherry picked from commit c81da59a252b3479753b2eada26ee0cf46280bd0)	2021-06-30 04:10:04 +00:00
Antonio Sanchez	d82d915047	Modify tensor argmin/argmax to always return first occurrence. As written, depending on multithreading/gpu, the returned index from `argmin`/`argmax` is not currently stable. Here we modify the functors to always keep the first occurrence (i.e. if the value is equal to the current min/max, then keep the one with the smallest index). This is otherwise causing unpredictable results in some TF tests. (cherry picked from commit 3a087ccb99b454dc34484333e608e836e7032213)	2021-06-29 23:28:37 +00:00
Rasmus Munk Larsen	380d0e4916	Get rid of redundant `pabs` instruction in complex square root. (cherry picked from commit 5aebbe9098f53f01c99eed67b52725397e955280)	2021-06-29 23:27:09 +00:00
Rohit Santhanam	e83af2cc24	Commit 52a5f982 broke conjhelper functionality for HIP GPUs. This commit addresses this. (cherry picked from commit 2d132d17365ffc84c0cc7a7da9b8f7090e94b476)	2021-06-25 19:56:18 +00:00
Rasmus Munk Larsen	413ff2b531	Small cleanup: Get rid of the macros EIGEN_HAS_SINGLE_INSTRUCTION_CJMADD and CJMADD, which were effectively unused, apart from on x86, where the change results in identically performing code. (cherry picked from commit bffd267d176410a517a0fe9afa6dde99c213c08a)	2021-06-25 17:13:12 +00:00
Rasmus Munk Larsen	a235ddef39	Get rid of code duplication for conj_helper. For packets where LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations. (cherry picked from commit 52a5f9821235e5a9f7e9b3e0198d45d42a1cb267)	2021-06-24 23:30:42 +00:00
Rasmus Munk Larsen	4780d8dfb2	Fix typo in SelfAdjointEigenSolver_eigenvectors.cpp (cherry picked from commit c8a2b4d20a162dc2527425f40cf7df95db5ba428)	2021-06-21 19:07:17 +00:00
Rasmus Munk Larsen	fd5d23fdf3	Update ComplexEigenSolver_eigenvectors.cpp (cherry picked from commit ea62c937edcc2c5efdaccfb6813ca39f48564ece)	2021-06-21 19:06:54 +00:00
Antonio Sanchez	a2040ef796	Rewrite balancer to avoid overflows. The previous balancer overflowed for large row/column norms. Modified to prevent that. Fixes #2273. (cherry picked from commit e9ab4278b7aba6f279c964d99ae5a312d12ab04b)	2021-06-21 18:14:53 +00:00
Antonio Sanchez	c2c0f6f64b	Fix fix<> for gcc-4.9.3. There's a missing `EIGEN_HAS_CXX14` -> `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES` replacement. Fixes #2267. (cherry picked from commit 35a367d557078462a0793c88c44dcad64fc63698)	2021-06-21 17:26:07 +00:00
Antonio Sanchez	ee4e099aa2	Remove pset, replace with ploadu. We can't make guarantees on alignment for existing calls to `pset`, so we should default to loading unaligned. But in that case, we should just use `ploadu` directly. For loading constants, this load should hopefully get optimized away. This is causing segfaults in Google Maps. (cherry picked from commit 12e8d57108c50d8a63605c6eb0144c838c128337)	2021-06-17 17:11:08 +00:00
Chip-Kerchner	9fc93ce31a	EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X slowdown) when used with Tensorflow. Changing to EIGEN_ALWAYS_INLINE where appropriate. (cherry picked from commit ef1fd341a895fda883f655102f371fa8b41f2088)	2021-06-16 22:14:17 +00:00
Antonio Sanchez	1374f49f28	Add missing ppc pcmp_lt_or_nan<Packet8bf> (cherry picked from commit 9e94c5957000c38a6553552c96a7a27b1fc2860d)	2021-06-15 22:12:22 +00:00
Antonio Sanchez	2d6eaaf687	Fix placement of permanent GPU defines. (cherry picked from commit 954879183b1e008d7f0fefb97e48a925c4e3fb16)	2021-06-15 19:18:20 +00:00
Rasmus Munk Larsen	47722a66f2	Fix more enum arithmetic. (cherry picked from commit 13fb5ab92c3226f7b9be20882b0418d53516d35a)	2021-06-15 16:40:35 +00:00
Antonio Sanchez	5e75331b9f	Fix checking of version number for mingw. MinGW spits out version strings like: `x86_64-w64-mingw32-g++ (GCC) 10-win32 20210110`, which causes the version extraction to fail. Added support for this with tests. Also added `make_unsigned` for `long long`, since mingw seems to use that for `uint64_t`. Related to #2268. CMake and build passes for me after this. (cherry picked from commit ad82d20cf649ba8c07352f947fd25766d0328df2)	2021-06-12 00:02:26 +00:00
Antonio Sanchez	b5fc69bdd8	Add ability to permanently enable HIP/CUDA gpu* defines. When using Eigen for gpu, these simplify portability. If `EIGEN_PERMANENTLY_ENABLE_GPU_HIP_CUDA_DEFINES` is set, then we do not undefine them. (cherry picked from commit 514977f31b1c00b233969f12321a25d859dd1efa)	2021-06-11 17:48:37 +00:00
Antonio Sanchez	4b683b65df	Allow custom TENSOR_CONTRACTION_DISPATCH macro. Currently TF lite needs to hack around with the Tensor headers in order to customize the contraction dispatch method. Here we add simple `#ifndef` guards to allow them to provide their own dispatch prior to inclusion. (cherry picked from commit 6aec83263d32c29f6c5623b9716ec7e367693078)	2021-06-11 17:19:29 +00:00
Rasmus Munk Larsen	1cb1ffd5b2	Use bit_cast to create -0.0 for floating point types to avoid compiler optimization changing sign with -ffast-math enabled. (cherry picked from commit fc87e2cbaa65e7e93a2c695ce5a9dc048a64a985)	2021-06-11 02:57:02 +00:00
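
The last entry above (1cb1ffd5b2) builds -0.0 from its bit pattern rather than from a floating-point literal. The following is only a rough sketch of that general idea, not the code from the commit: `bit_cast_sketch`, `negative_zero_float`, and `negative_zero_double` are hypothetical names introduced here, and the sketch assumes IEEE-754 binary32/binary64 layouts for `float` and `double`.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not Eigen's API): reinterpret the bytes of `src`
// as a value of type To. memcpy avoids the undefined behaviour of
// pointer-based type punning.
template <typename To, typename From>
To bit_cast_sketch(const From& src) {
  static_assert(sizeof(To) == sizeof(From), "types must have equal size");
  To dst;
  std::memcpy(&dst, &src, sizeof(To));
  return dst;
}

// Assumes IEEE-754: only the sign bit is set, all other bits are zero.
inline float negative_zero_float() {
  return bit_cast_sketch<float>(std::uint32_t{0x80000000u});
}

inline double negative_zero_double() {
  return bit_cast_sketch<double>(std::uint64_t{0x8000000000000000ull});
}

// True if x is zero with the sign bit set.
inline bool is_negative_zero(double x) {
  return x == 0.0 && std::signbit(x);
}
```

Because the sign bit comes from an integer constant, the value does not depend on how the compiler treats a `-0.0` literal, which is presumably the motivation stated in the commit message.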


11472 Commits