eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-09 06:31:47 +08:00

Author	SHA1	Message	Date
Alex Druinsky	b0fe14213e	Fix vectorized reductions for Eigen::half. Fixes compiler errors in expressions that look like `Eigen::Matrix<Eigen::half, 3, 1>::Random().maxCoeff()`. The error comes from the code that creates the initial value for vectorized reductions. The fix is to specify the scalar type of the reduction's initial value. The change is necessary for Eigen::half because, unlike other types, Eigen::half scalars cannot be implicitly created from integers. (cherry picked from commit d0e3791b1a0e2db9edd5f1d1befdb2ac5a40efe0)	2021-11-03 23:29:55 +00:00
Andreas Krebbel	23469c3cda	ZVector: Move alignas qualifier to come first. We currently have plenty of type definitions with the alignment qualifier coming after the type. The compiler warns about ignoring them: `int EIGEN_ALIGN16 ai[4];` Turn this into: `EIGEN_ALIGN16 int ai[4];` (cherry picked from commit 8faafc3aaa2b45e234cfe0bef085c1134ceffc42)	2021-11-03 23:29:10 +00:00
Antonio Sanchez	18824d10ea	Fix ZVector build. Cross-compiled via `s390x-linux-gnu-g++`, run via qemu. This allows the packetmath tests to pass. (cherry picked from commit 40bbe8a4d0eb3ec2bfd472fa30cac19e6e743b46)	2021-11-03 23:28:26 +00:00
Xinle Liu	9c193db5c7	Fix BDCSVD's total deflation in branch 3.4, similar to that of master in MR 707. (cherry picked from commit 4d045eba53f9a32d052eb942448ba62def066529)	2021-11-03 17:58:57 +00:00
Antonio Sanchez	6b6ba41269	Fix min/max nan-propagation for scalar "other". Copied input type from `EIGEN_MAKE_CWISE_BINARY_OP`. Fixes #2362. (cherry picked from commit 03d4cbb30796ea06350414f5f551b180e4864688)	2021-10-28 17:16:49 +00:00
Rasmus Munk Larsen	5d918b82a8	Add nan-propagation options to matrix and array plugins.	2021-10-21 13:48:50 -07:00
Antonio Sanchez	05c9d7ce20	Disable MSVC constant condition warning. We make extensive use of `if (CONSTANT)`, and cannot use c++17's `if constexpr`. (cherry picked from commit 5bf35383e073d218be7a87bdca434be30d231e7e)	2021-10-11 10:00:29 -07:00
Antonio Sanchez	943ef50a2d	Disable testing of complex compound assignment operators for MSVC. MSVC does not support specializing compound assignments for `std::complex`, since it already specializes them (contrary to the standard). Trying to use one of these on device will currently lead to a duplicate definition error. This is still probably preferable to no error though. If we remove the definitions for MSVC, then it will compile, but the kernel will fail silently. The only proper solution would be to define our own custom `Complex` type. (cherry picked from commit f0f1d7938b7083800ff75fe88e15092f08a4e67e)	2021-10-11 10:00:29 -07:00
Antonio Sanchez	7ea4adb5f0	Disable another device warning (cherry picked from commit e9e90892fecb4bebe6473e9de491bfcd6c0de37f)	2021-10-11 10:00:29 -07:00
Antonio Sanchez	71498b32c9	Disable more NVCC warnings. The 2979 warning is yet another "calling a `__host__` function from a `__host__ __device__` function" warning. Although we probably should eventually address these, they are flooding the logs. Most of these are harmless since we only call the original from the host. In cases where these are actually called from device, an error is generated instead anyways. The 2977 warning is a bit strange - although the warning suggests the `__device__` annotation is ignored, this doesn't actually seem to be the case. Without the `__device__` declarations, the kernel actually fails to run when attempting to construct such objects. Again, these warnings are flooding the logs, so disabling for now. (cherry picked from commit 86c0decc480147d109b1dd8b968bcbc509b7a2e6)	2021-10-11 10:00:29 -07:00
Alexander Grund	929bc0e191	Fix alias violation in BFloat16. reinterpret_cast between unrelated types is undefined behavior and leads to misoptimizations on some platforms. Use the safer (and faster) version via bit_cast. (cherry picked from commit b5eaa4269503f77d0aa58d2f8ed9419e1ba7784d)	2021-09-20 14:25:58 +00:00
Antonio Sanchez	f046e326d9	Fix strict aliasing bug causing product_small failure. Packet loading is skipped due to aliasing violation, leading to nullopt matrix multiplication. Fixes #2327. (cherry picked from commit 3c724c44cff3f9e2e9e35351abff0b5c022b320d)	2021-09-19 18:06:17 +00:00
Antonio Sanchez	3395f4e604	Fix tridiagonalization_inplace_selector. The `Options` of the new `hCoeffs` vector do not necessarily match those of the `MatrixType`, leading to build errors. Having the `CoeffVectorType` be a template parameter relieves this restriction. (cherry picked from commit ebd4b17d2f5ca29a5c16ebd35d54d7aeda587820)	2021-09-08 15:47:39 +00:00
Antonio Sanchez	f03d3e7072	Missing EIGEN_DEVICE_FUNCs to get `gpu_basic` passing with CUDA 9. CUDA 9 seems to require labelling defaulted constructors as `EIGEN_DEVICE_FUNC`, despite giving warnings that such labels are ignored. Without these labels, the `gpu_basic` test fails to compile, with errors about calling `__host__` functions from `__host__ __device__` functions. (cherry picked from commit 998bab4b04f26552b9875acfe113e69c7adccec4)	2021-09-02 03:21:43 +00:00
Antonio Sanchez	07cc362238	Fix EIGEN_OPTIMIZATION_BARRIER for arm-clang. Clang doesn't like !621, needs the "g" constraint back. The "g" constraint also works for GCC >= 5. This fixes our gitlab CI. (cherry picked from commit 3a6296d4f198ffbcccda4303919b3b14d5e54524)	2021-09-01 16:40:08 +00:00
Antonio Sanchez	4ef67cbfb2	GCC 4.8 arm EIGEN_OPTIMIZATION_BARRIER fix (#2315). GCC 4.8 doesn't seem to like the `g` register constraint, failing to compile with "error: 'asm' operand requires impossible reload". Tested `r` instead, and that seems to work, even with latest compilers. Also fixed some minor macro issues to eliminate warnings on armv7. Fixes #2315. (cherry picked from commit ff07a8a63945d89301d1b29ac59d170ff9be3955)	2021-08-31 21:23:28 +00:00
Antonio Sanchez	c2b6df6e60	Disable cuda Eigen::half vectorization on host. All cuda `__half` functions are device-only in CUDA 9, including conversions. Host-side conversions were added in CUDA 10. The existing code doesn't build prior to 10.0. All arithmetic functions are always device-only, so there's therefore no reason to use vectorization on the host at all. Modified the code to disable vectorization for `__half` on host, which required also updating the `TensorReductionGpu` implementation which previously made assumptions about available packets. (cherry picked from commit cc3573ab4451853774cd5c3497373d5fe8914774)	2021-08-31 21:23:11 +00:00
Adam Kallai	277d369060	win: include intrin header in Windows on ARM. The intrin header is needed for _BitScanReverse and _BitScanReverse64. (cherry picked from commit 1415817d8daa7fa72ec9b26a6b9d166a1d54626a)	2021-08-31 21:22:37 +00:00
Antonio Sanchez	7aee90b8d3	Fix fix<N> when variable templates are not supported. There were some typos that checked `EIGEN_HAS_CXX14` that should have checked `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`, causing a mismatch in some of the `Eigen::fix<N>` assumptions. Also fixed the `symbolic_index` test when `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES` is 0. Fixes #2308 (cherry picked from commit 5db9e5c77958997856ddbccfa4a52ff22e83bef9)	2021-08-30 16:23:35 +00:00
Rasmus Munk Larsen	3147391d94	Change version to 3.4.0.	2021-08-18 13:41:58 -07:00
Antonio Sanchez	115591b9e3	Workaround VS 2017 arg bug. In VS 2017, `std::arg` for real inputs always returns 0, even for negative inputs. It should return `PI` for negative real values. This seems to be fixed in VS 2019 (MSVC 1920). (cherry picked from commit 2b410ecbefea1bf4b9d50decb946a4ebe4a73f98)	2021-08-18 19:04:50 +00:00
Jakob Struye	1ec173b54e	Clearer doc for squaredNorm (cherry picked from commit 53a29c7e351646efe31ee85666c8f268f8e0d462)	2021-08-18 15:12:36 +00:00
Antonio Sanchez	aef926abf6	Renamed shift_left/shift_right to shiftLeft/shiftRight. For naming consistency. Also moved to ArrayCwiseUnaryOps, and added test. (cherry picked from commit fc9d352432b81210f73d71caecbd7dc5505d6ab8)	2021-08-18 14:44:31 +00:00
Antonio Sanchez	f1032255d3	Add missing PPC packet comparisons. This is to fix the packetmath tests on the ppc pipeline. (cherry picked from commit 2cc6ee0d2e76e88fe1476f6b0eae12edb68b1c8a)	2021-08-17 15:33:55 +00:00
Chip-Kerchner	f57dec64ef	Fix unaligned loads in ploadLhs & ploadRhs for P8. (cherry picked from commit 8dcf3e38ba9913021ce6a831836a59217e21baf2)	2021-08-17 12:48:36 +00:00
andiwand	cd474d4cd0	minor doc fix in Map.h (cherry picked from commit 5c6b3efead69636dec1599aa54dab4617755013c)	2021-08-16 14:26:39 +00:00
Chip-Kerchner	0b56b62f30	Reverse compare logic in F32ToBf16 since vec_cmpne is not available in Power8 - now compiles for clang10 default (P8). (cherry picked from commit e07227c411cb5ed5c6252b594fe841867bd19f6a)	2021-08-13 18:01:15 +00:00
Chip Kerchner	44cc96e1a1	Get rid of used uninitialized warnings for EIGEN_UNUSED_VARIABLE in gcc11+ (cherry picked from commit 66499f0f172d0758360043e9c578761c0f7d50cd)	2021-08-12 21:39:17 +00:00
Rasmus Munk Larsen	6d2506040c	Revise the meta_least_common_multiple function template: add a bool parameter to check whether A is larger than B. This reduces compile time when A is smaller than B and avoids a compilation failure when A is small and B is large. Authored by @awoniu. (cherry picked from commit 8ce341caf2947e4b5ac4580c20254ae7d828b009)	2021-08-11 18:11:26 +00:00
ChipKerchner	13d7658c5d	Fix errors on older compilers (gcc 7.5 - lack of vec_neg, clang10 - can not use const pointers with vec_xl). (cherry picked from commit 413bc491f1721afdb9802553b13a5b7aba67ed3b)	2021-08-10 20:40:54 +00:00
Gauri Deshpande	93bff85a42	remove denormal flushing in fp32tobf16 for avx & avx512 (cherry picked from commit e6a5a594a7f3cbe2f9843d4ef57a10d478cbb818)	2021-08-09 22:15:42 +00:00
Rasmus Munk Larsen	4e0357c6dd	Avoid memory allocation in tridiagonalization_inplace_selector::run. (cherry picked from commit a5a7faeb455efd7f6edb1138eda2e37546039b7d)	2021-08-06 21:48:00 +00:00
Antonio Sanchez	5b83d3c4bc	Make inverse 3x3 faster and avoid gcc bug. There seems to be a gcc 4.7 bug that incorrectly flags the current 3x3 inverse as using uninitialized memory. I'm pretty sure it's a false positive, but it's hard to trigger. The same warning does not trigger with clang or later compiler versions. In trying to find a work-around, this implementation turns out to be faster anyways for static-sized matrices. ``` name old cpu/op new cpu/op delta BM_Inverse3x3<DynamicMatrix3T<float>> 423ns ± 2% 433ns ± 3% +2.32% (p=0.000 n=98+96) BM_Inverse3x3<DynamicMatrix3T<double>> 425ns ± 2% 427ns ± 3% +0.48% (p=0.003 n=99+96) BM_Inverse3x3<StaticMatrix3T<float>> 7.10ns ± 2% 0.80ns ± 1% -88.67% (p=0.000 n=114+112) BM_Inverse3x3<StaticMatrix3T<double>> 7.45ns ± 2% 1.34ns ± 1% -82.01% (p=0.000 n=105+111) BM_AliasedInverse3x3<DynamicMatrix3T<float>> 409ns ± 3% 419ns ± 3% +2.40% (p=0.000 n=100+98) BM_AliasedInverse3x3<DynamicMatrix3T<double>> 414ns ± 3% 413ns ± 2% ~ (p=0.322 n=98+98) BM_AliasedInverse3x3<StaticMatrix3T<float>> 7.57ns ± 1% 0.80ns ± 1% -89.37% (p=0.000 n=111+114) BM_AliasedInverse3x3<StaticMatrix3T<double>> 9.09ns ± 1% 2.58ns ±41% -71.60% (p=0.000 n=113+116) ``` (cherry picked from commit 5ad8b9bfe2bf75620bc89467c5cc051fc2a597df)	2021-08-04 22:06:52 +00:00
Antonio Sanchez	237c59a2aa	Modify scalar pzero, ptrue, pselect, and p<binary> operations to avoid memset. The `memset` function and bitwise manipulation only apply to POD types that do not require initialization, otherwise resulting in UB. We currently violate this in `ptrue` and `pzero`, we assume bitmasks for `pselect`, and bitwise operations are applied byte-by-byte in the generic implementations. This is causing issues for scalar types that do require initialization or that contain non-POD info such as pointers (#2201). We either break them, or force specializations of these functions for custom scalars, even if they are not vectorized. Here we modify these functions for scalars only - instead using only scalar operations: - `pzero`: `Scalar(0)` for all scalars. - `ptrue`: `Scalar(1)` for non-trivial scalars, bitset to one bits for trivial scalars. - `pselect`: ternary select comparing mask to `Scalar(0)` for all scalars - `pand`, `por`, `pxor`, `pnot`: use operators `&`, `|`, `^`, `~` for all integer or non-trivial scalars, otherwise apply bytewise. For non-scalar types, the original implementations are used to maintain compatibility and minimize the number of changes. Fixes #2201. (cherry picked from commit 3d98a6ef5ce0ba85acaee4ffffc53f0f21bd8fd2)	2021-08-03 16:32:59 +00:00
Antonio Sanchez	3dc42eeaec	Enable equality comparisons on GPU. Since `std::equal_to::operator()` is not a device function, it fails on GPU. On my device, I seem to get a silent crash in the kernel (no reported error, but the kernel does not complete). Replacing this with a portable version enables comparisons on device. Addresses #2292 - would need to be cherry-picked. The 3.3 branch also requires adding `EIGEN_DEVICE_FUNC` in `BooleanRedux.h` to get fully working. (cherry picked from commit 7880f10526a11dc5544426c54c5763de576bf285)	2021-08-03 16:15:44 +00:00
hyunggi-sv	7adc1545b4	fix:typo in dox (has->have) (cherry picked from commit 02a0e79c701da7aa8dfad79b13cd1e7fae46d634)	2021-08-03 00:54:41 +00:00
Antonio Sanchez	c0c7b695cd	Fix assignment operator issue for latest MSVC+NVCC. Details are scattered across #920, #1000, #1324, #2291. Summary: some MSVC versions have a bug that requires omitting explicit `operator=` definitions (leads to duplicate definition errors), and some MSVC versions require adding explicit `operator=` definitions (otherwise implicitly deleted errors). This mess tries to cover all the cases encountered. Fixes #2291. (cherry picked from commit 9816fe59b47dc4c07967b5ee93a8e8aaa6e9c308)	2021-08-03 00:52:21 +00:00
arthurfeeney	9c90d5d832	Fixes #1387 for compilation error in JacobiSVD with HouseholderQRPreconditioner that occurs when input is a compile-time row vector. (cherry picked from commit a77638387dd1aa2d07d2dae240cc30b303b4ef38)	2021-07-22 18:01:55 +00:00
Antonio Sanchez	5d37114fc0	Fix explicit default cache size typo. (cherry picked from commit 297f0f563d916260665d7fadc017f94f1a5e7a03)	2021-07-20 18:42:25 +00:00
Rohit Santhanam	930696fc53	Enable extract et al. for HIP GPU. (cherry picked from commit beea14a18f76817439b4d8901d29db2e9c4a24c8)	2021-07-09 16:14:19 +00:00
Rasmus Munk Larsen	56966fd2e6	Defer to std::fill_n when filling a dense object with a constant value. (cherry picked from commit 0c361c4899c9042d2b25cd60d7826ab464caacb7)	2021-07-09 03:59:56 +00:00
Guoqiang QI	69ec4907da	Make a copy of the input matrix when trying to compute the inverse in place. This fixes #2285. (cherry picked from commit 4bcd42c271761dc5341f8e08ca7d357c3614cb01)	2021-07-08 17:07:54 +00:00
Rasmus Munk Larsen	05bab8139a	Fix breakage of conj_helper in conjunction with custom types introduced in !537. (cherry picked from commit 7b35638ddb99a0298c5d3450de506a8e8e0203d3)	2021-07-02 20:59:50 +00:00
Chip Kerchner	eebde572d9	Create the ability to disable the specialized gemm_pack_rhs in Eigen (only PPC) for TensorFlow (cherry picked from commit 91e99ec1e02100d07e35a7abb1b5c76707237219)	2021-07-01 23:32:38 +00:00
Antonio Sanchez	8190739f12	Fix compile issues for gcc 4.8. - Move constructors can only be defaulted as NOEXCEPT if all members have NOEXCEPT move constructors. - gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter. (cherry picked from commit 6035da5283f12f7e6a49cda0c21696c8e5a115b7)	2021-07-01 23:18:10 +00:00
Antonio Sanchez	b6db013435	Fix inverse nullptr/asan errors for LU. For empty or single-column matrices, `PartialPivLU` currently dereferences a `nullptr` or accesses memory out-of-bounds. Here we adjust the checks to avoid this. (cherry picked from commit 154f00e9eacaec5667215784c7601b55024e2f61)	2021-07-01 22:57:25 +00:00
Dan Miller	1f6b1c1a1f	Fix duplicate definitions on Mac (cherry picked from commit eb047759030558acf0764d5d2f913f4f84cf85a8)	2021-07-01 20:49:05 +00:00
Alexander Karatarakis	517294d6e1	Make DenseStorage<> trivially_copyable (cherry picked from commit 60400334a92268272c6bf525da89eec5e99c3e5a)	2021-07-01 20:48:47 +00:00
大河メタル	94e2250b36	Correct declarations for aarch64-pc-windows-msvc (cherry picked from commit c81da59a252b3479753b2eada26ee0cf46280bd0)	2021-06-30 04:10:04 +00:00
Rasmus Munk Larsen	380d0e4916	Get rid of redundant `pabs` instruction in complex square root. (cherry picked from commit 5aebbe9098f53f01c99eed67b52725397e955280)	2021-06-29 23:27:09 +00:00
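
Commit 929bc0e191 above replaces a `reinterpret_cast` between unrelated types with a `bit_cast`. The following is a minimal sketch of that general technique in plain C++, not Eigen's actual code: the names `bit_cast_sketch`, `float_to_bf16_truncate`, and `bf16_bits_to_float` are hypothetical, and the conversion simply truncates to the upper 16 bits, which is an assumption made for brevity rather than the rounding Eigen uses.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a bit_cast helper: copies the object
// representation with memcpy, which is well-defined, instead of reading
// it through a pointer of an unrelated type (undefined behavior).
template <typename To, typename From>
To bit_cast_sketch(const From& src) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  To dst;
  std::memcpy(&dst, &src, sizeof(To));
  return dst;
}

// Assumption: plain truncation (no rounding) keeps the example short.
// A bfloat16 value is the upper 16 bits of an IEEE-754 binary32 value.
std::uint16_t float_to_bf16_truncate(float f) {
  const std::uint32_t bits = bit_cast_sketch<std::uint32_t>(f);
  return static_cast<std::uint16_t>(bits >> 16);
}

// Widen the 16 stored bits back to a float by zero-filling the low half.
float bf16_bits_to_float(std::uint16_t h) {
  const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  return bit_cast_sketch<float>(bits);
}
```

Compilers typically lower the fixed-size `memcpy` to a single register move, so the well-defined version need not cost anything at run time.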


6601 Commits
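
Commit 6d2506040c in the log describes adding a bool parameter to `meta_least_common_multiple` so that the arguments are ordered before the compile-time search begins. A rough sketch of that idea follows, assuming a simple "try multiples of the larger value" search; the name `lcm_sketch` and its parameter layout are hypothetical and not taken from Eigen's source.

```cpp
// Primary template: A >= B and A*K is not yet a multiple of B, so try K+1.
template <int A, int B, int K = 1,
          bool Done = ((A * K) % B) == 0,
          bool Big = (A >= B)>
struct lcm_sketch {
  enum { ret = lcm_sketch<A, B, K + 1>::ret };
};

// A < B: swap the arguments first, so the recursion steps through
// multiples of the larger value and needs far fewer instantiations.
template <int A, int B, int K, bool Done>
struct lcm_sketch<A, B, K, Done, false> {
  enum { ret = lcm_sketch<B, A, K>::ret };
};

// A >= B and A*K is a multiple of B: the search is finished.
template <int A, int B, int K>
struct lcm_sketch<A, B, K, true, true> {
  enum { ret = A * K };
};

static_assert(lcm_sketch<4, 6>::ret == 12, "lcm(4, 6) is 12");
```

Without the swap, `lcm_sketch<3, 1000>` would step through 1000 multiples of 3 and could exceed the compiler's template recursion limit; with it, the search ends after three multiples of 1000.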