eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-09 14:41:49 +08:00

Author	SHA1	Message	Date
Antonio Sanchez	5e75331b9f	Fix checking of version number for mingw. MinGW spits out version strings like: `x86_64-w64-mingw32-g++ (GCC) 10-win32 20210110`, which causes the version extraction to fail. Added support for this with tests. Also added `make_unsigned` for `long long`, since mingw seems to use that for `uint64_t`. Related to #2268. CMake and build passes for me after this. (cherry picked from commit ad82d20cf649ba8c07352f947fd25766d0328df2)	2021-06-12 00:02:26 +00:00
Antonio Sanchez	b5fc69bdd8	Add ability to permanently enable HIP/CUDA gpu* defines. When using Eigen for gpu, these simplify portability. If `EIGEN_PERMANENTLY_ENABLE_GPU_HIP_CUDA_DEFINES` is set, then we do not undefine them. (cherry picked from commit 514977f31b1c00b233969f12321a25d859dd1efa)	2021-06-11 17:48:37 +00:00
Antonio Sanchez	4b683b65df	Allow custom TENSOR_CONTRACTION_DISPATCH macro. Currently TF lite needs to hack around with the Tensor headers in order to customize the contraction dispatch method. Here we add simple `#ifndef` guards to allow them to provide their own dispatch prior to inclusion. (cherry picked from commit 6aec83263d32c29f6c5623b9716ec7e367693078)	2021-06-11 17:19:29 +00:00
Rasmus Munk Larsen	1cb1ffd5b2	Use bit_cast to create -0.0 for floating point types to avoid compiler optimization changing sign with -ffast-math enabled. (cherry picked from commit fc87e2cbaa65e7e93a2c695ce5a9dc048a64a985)	2021-06-11 02:57:02 +00:00
Rasmus Munk Larsen	4b502a7215	Fix c++20 warnings about using enums in arithmetic expressions. (cherry picked from commit f64b2954c711b7846ae6ae228c5f14bd8dd56ec4)	2021-06-11 02:35:19 +00:00
Nicolas Cornu	85868564df	Fix parsing of version for nvhpc. Parsing crashes because the first line of the version output is empty, so delete the first line if it is empty. (cherry picked from commit 001a57519a7aa909d3bf0cd8c6ec8a9cd19d9c70)	2021-06-10 18:50:22 +00:00
Rohit Santhanam	cbb6ae6296	Removed dead code from GPU float16 unit test. (cherry picked from commit c8d40a7bf1915015c991b108cf2cd6a32138fdc8)	2021-06-10 17:16:47 +00:00
Cyril Kaiser	573570b6c9	Remove EIGEN_DEVICE_FUNC from CwiseBinaryOp's default copy constructor. (cherry picked from commit 91cd67f057f90101cf858d63916ee56a58511b0d)	2021-05-26 19:45:25 +00:00
Antonio Sanchez	98cf1e076f	Add missing NEON ptranspose implementations. Unified implementation using only `vzip`. (cherry picked from commit dba753a986b527a17c8cc62474d0487aec7c2b36)	2021-05-25 19:09:50 +00:00
Antonio Sanchez	ee2a8f7139	Modify Unary/Binary/TernaryOp evaluators to work for non-class types. This used to work for non-class types (e.g. raw function pointers) in Eigen 3.3. This was changed in commit 11f55b29 to optimize the evaluator: > `sizeof((A-B).cwiseAbs2())` with A,B Vector4f is now 16 bytes, instead of 48 before this optimization. though I cannot reproduce the 16 byte result. Both before the change and after, with multiple compilers/versions, I always get a result of 40 bytes. https://godbolt.org/z/MsjTc1PGe This change modifies the code slightly to allow non-class types. The final generated code is identical, and the expression remains 40 bytes for the `abs2` sample case. Fixes #2251 (cherry picked from commit ebb300d0b4340104dcade3afa656a57da2b7660c)	2021-05-25 18:19:53 +00:00
Jakub Lichman	3835046309	predux_half_dowto4 test extended to all applicable packets (cherry picked from commit 12471fcb5d59f969c60a9b78727624dc91e5c04e)	2021-05-21 16:58:16 +00:00
Steve Bronder	4fbd01cd4b	Adds macro for checking if C++14 variable templates are supported (cherry picked from commit 17200570239f23b2f0d3b434bc0269c46c409791)	2021-05-21 16:43:30 +00:00
Niall Murphy	a883a8797c	Use derived object type in conservative_resize_like_impl. When calling conservativeResize() on a matrix with the DontAlign flag, the temporary variable used to perform the resize should have the same Options as the original matrix to ensure that the correct override of swap is called (i.e. PlainObjectBase::swap(DenseBase<OtherDerived> & other)). Calling the base class swap (i.e. in DenseBase) results in assertion errors or memory corruption. (cherry picked from commit 391094c50743f28f9174f455661f650bf07e0177)	2021-05-20 23:43:57 +00:00
Jakub Lichman	0bd9e9bc45	ptranspose test for non-square kernels added (cherry picked from commit 8877f8d9b2631301ba070d645cdc3fc9b9f764f5)	2021-05-20 19:27:20 +00:00
Guoqiang QI	77c66e368c	Ensure all generated matrices for inverse_4x4 tests are invertible. This fixes #2248. (cherry picked from commit 3e006bfd31e4389e8c5718c30409cddb65a73b04)	2021-05-13 15:03:47 +00:00
guoqiangqi	2f908f8255	Changing the storage of the SSE complex packets to that of the wrapper. This should fix #2242. (cherry picked from commit 3d9051ea84a5089b277c88dac456b3b1576bfa7f)	2021-05-12 17:02:19 +00:00
Nathan Luehr	82f13830e6	Fix calls to device functions from host code (cherry picked from commit 972cf0c28a8d2ee0808c1277dea2c5c206591ce6)	2021-05-12 17:01:45 +00:00
Nathan Luehr	d1825cbb68	Device implementation of log for std::complex types. (cherry picked from commit 7e6a1c129c201db4eff46f4dd68acdc7e935eaf2)	2021-05-11 22:31:53 +00:00
Nathan Luehr	d9288f078d	Fix ambiguity due to argument dependent lookup. (cherry picked from commit 6753f0f197e7b8a8019e82e7b144ac0281d6a7f1)	2021-05-11 22:00:36 +00:00
Rohit Santhanam	85ebd6aff8	Fix for issue where numext::imag and numext::real are used before they are defined. (cherry picked from commit 39ec31c0adbdde6b8cda36b3415e9cc2af20dab6)	2021-05-10 20:14:10 +00:00
Antonio Sanchez	2947c0cc84	Restore ABI compatibility for conj with 3.3, fix conflict with boost. The boost library unfortunately specializes `conj` for various types and assumes the original two-template-parameter version. This change restores the second parameter. This also restores ABI compatibility. The specialization for `std::complex` is because `std::conj` is not a device function. For custom complex scalar types, users should provide their own `conj` implementation. We may consider removing the unnecessary second parameter in the future - but this will require modifying boost as well. Fixes #2112. (cherry picked from commit c0eb5f89a406243f71eae0b705eba4437d9f8565)	2021-05-07 18:38:23 +00:00
Antonio Sanchez	25424f4cf1	Clean up gpu device properties. Made a class and singleton to encapsulate initialization and retrieval of device properties. Related to !481, which already changed the API to address a static linkage issue. (cherry picked from commit 0eba8a1fe3e0fa78f0e6760c0e1265817491845d)	2021-05-07 18:13:40 +00:00
Antonio Sanchez	42acbd5700	Fix numext::arg return type. The cxx11 path for `numext::arg` incorrectly returned the complex type instead of the real type, leading to compile errors. Fixed this and added tests. Related to !477, which uncovered the issue. (cherry picked from commit 90e9a33e1ce3e4e7663dd67e6c1f225afaf5c206)	2021-05-07 17:52:07 +00:00
Christoph Hertzberg	9e0dc8f09b	Revert addition of unused `paddsub<Packet2cf>`. This fixes #2242 (cherry picked from commit 722ca0b665666f3af579002ad752541d7319d1b6)	2021-05-07 16:23:03 +00:00
Antonio Sanchez	da19f7a910	Simplify TensorRandom and remove time-dependence. Time-dependence prevents tests from being repeatable. This has long been an issue with debugging the tensor tests. Removing this will allow future tests to be repeatable in the usual way. Also, the recently added macros in !476 are causing headaches across different platforms. For example, checking `_XOPEN_SOURCE` is leading to multiple ambiguous macro errors across Google, and `_DEFAULT_SOURCE`/`_SVID_SOURCE`/`_BSD_SOURCE` are sometimes defined with values, sometimes defined as empty, and sometimes not defined at all when they probably should be. This is leading to multiple build breakages. The simplest approach is to generate a seed via `Eigen::internal::random<uint64_t>()` if on CPU. For GPU, we use a hash based on the current thread ID (since `rand()` isn't supported on GPU). Fixes #1602. (cherry picked from commit e3b7f59659689015aa254ed67c48d870831f086f)	2021-05-05 23:37:48 +00:00
Antonio Sanchez	fc2cc10842	Better CUDA complex division. The original produced NaNs when dividing 0/b for subnormal b. The `complex_divide_stable` was changed to use the more common Smith's algorithm. (cherry picked from commit 1c013be2cc6a999268be2f25575cd6a07bd52c45)	2021-04-29 17:58:45 +00:00
Antonio Sanchez	a33855f6ee	Add missing pcmp_lt_or_nan for NEON Packet4bf. (cherry picked from commit 172db7bfc32def5ed0f885287e352b63dd5cd767)	2021-04-27 21:15:08 +00:00
Theo Fletcher	83df5df61b	Added complex matrix unit tests for SelfAdjointEigenSolve (cherry picked from commit 2ced0cc233fff6ef16c4d098b03aeeb69ff7c509)	2021-04-26 19:18:53 +00:00
Jakub Lichman	ac3c5aad31	Tests added and AVX512 bug fixed for pcmp_lt_or_nan (cherry picked from commit d87648a6bea315645b893c3815ca8c6bb00ec5d2)	2021-04-26 18:07:55 +00:00
Jakub Lichman	63abb10000	Tests for pcmp_lt and pcmp_le added (cherry picked from commit 1115f5462ecaa84d3c60479f7e23a530a1a415d2)	2021-04-23 19:52:23 +00:00
Turing Eret	baf601a0e3	Fix for issue with static global variables in TensorDeviceGpu.h. m_deviceProperties and m_devicePropInitialized are defined as global statics, which creates a separate copy in each translation unit. This can cause issues if initializeDeviceProp() is called in one translation unit and then m_deviceProperties is used in a different translation unit. Added inline functions getDeviceProperties() and getDevicePropInitialized() which define those variables as static locals. As per the C++ standard 7.1.2/4, a static local declared in an inline function always refers to the same object, so this should be safer. Credit to Sun Chenggen for this fix. This fixes issue #1475. (cherry picked from commit 3804ca0d905a0a03357db50abc7468f5f90abc98)	2021-04-23 19:06:16 +00:00
Antonio Sanchez	587a691516	Check existence of BSD random before use. `TensorRandom` currently relies on BSD `random()`, which is not always available. The [linux manpage](https://man7.org/linux/man-pages/man3/srandom.3.html) gives the glibc condition: ``` _XOPEN_SOURCE >= 500 || /* Glibc since 2.19: */ _DEFAULT_SOURCE || /* Glibc <= 2.19: */ _SVID_SOURCE || _BSD_SOURCE ``` In particular, this was failing to compile for MinGW via msys2. If not available, we fall back to using `rand()`. (cherry picked from commit 045c0609b5c059974104f29dad91bcc3828e91ac)	2021-04-23 00:35:05 +00:00
Antonio Sanchez	8830d66c02	DenseStorage safely copy/swap. Fixes #2229. For dynamic matrices with fixed-sized storage, only copy/swap elements that have been set. Otherwise, this leads to inefficient copying, and potential UB for non-initialized elements. (cherry picked from commit d213a0bcea2344aa3f6c9856da9f5b2a26ccec25)	2021-04-22 21:05:50 +00:00
Rasmus Munk Larsen	54425a39b2	Make vectorized compute_inverse_size4 compile with AVX. (cherry picked from commit 85a76a16ea835fcfa7d4c185a338ae2aef9a272a)	2021-04-22 17:25:25 +00:00
Jakub Lichman	34d0be9ec1	Compilation of basicbenchmark fixed (cherry picked from commit d72c794ccd21637ba56dec0dd8bd0cffef7bc47e)	2021-04-21 12:09:42 +02:00
Jakub Lichman	42a8bdd4d7	HasExp added for AVX512 Packet8d (cherry picked from commit 2b1dfd1ba0638e57a50d2f401412e0893064c354)	2021-04-21 12:09:21 +02:00
Chip-Kerchner	28564957ac	Fix taking address of rvalue compiler issue with TensorFlow (plus other warnings). (cherry picked from commit 06c2760bd1139711eeffa30266ead43423891698)	2021-04-21 01:05:21 +00:00
Antonio Sanchez	ab7fe215f9	Fix ldexp for AVX512 (#2215). Wrong shuffle was used. Need to interleave low/high halves with a `permute` instruction. Fixes #2215. (cherry picked from commit 1d79c68ba0507574d893780e60b982f07d210261)	2021-04-20 20:52:26 +00:00
David Tellenbach	1f4c0311cd	Bump to 3.3.91 (3.4-rc1) 3.4-rc1	2021-04-18 23:43:12 +02:00
David Tellenbach	3e819d83bf	Before 3.4 branch before-3.4	2021-04-18 23:36:14 +02:00
Antonio Sanchez	69adf26aa3	Modify googlehash use to account for namespace issues. The namespace declaration for googlehash is a configurable macro that can be disabled. In particular, it is disabled within google, causing compile errors since `dense_hash_map`/`sparse_hash_map` are then in the global namespace instead of in `::google`. Here we play a bit of gymnastics to allow for both `google::_hash_map` and `_hash_map`, while limiting namespace pollution. Symbols within the `::google` namespace are imported into `Eigen::google`. We also remove checks based on `_SPARSE_HASH_MAP_H_`, as this is fragile, and instead require `EIGEN_GOOGLEHASH_SUPPORT` to be defined.	2021-04-12 19:00:39 -07:00
Christoph Hertzberg	9357feedc7	Avoid using uninitialized inputs and if available, use slightly more efficient `movsd` instruction for `pset1<Packet2cf>`.	2021-04-13 01:36:59 +02:00
Rasmus Munk Larsen	a2c0542010	Fix typo in TensorDimensions.h	2021-04-12 18:59:56 +00:00
Rohit Santhanam	dfd6720d82	Fix for float16 GPU unit test.	2021-04-12 10:19:06 +00:00
Christoph Hertzberg	1e1c8a735c	Use EIGEN_HAS_CXX11 and EIGEN_COMP_CXXVER macros to detect C++ version for `std::result_of` and `std::invoke_result`. Fixes #2209	2021-04-12 01:26:15 +00:00
Jens Wehner	f6fc66aa75	fixed doxygen for unsupported iterative solver module	2021-04-11 16:26:14 +00:00
Christoph Hertzberg	d58678069c	Make iterators default constructible and assignable, by making...	2021-04-09 17:03:28 +00:00
Rohit Santhanam	2859db0220	This fixes an issue where the compiler was not choosing the GPU specific specialization of ScanLauncher. The issue was discovered when the GPU scan unit test was run and resulted in a segmentation fault. The segmentation fault occurred because the unit test allocated GPU memory and passed a pointer to that memory to the computation that it presumed would execute on the GPU. But because of the issue, the computation was scheduled to execute on the CPU, so the CPU attempted to access a GPU memory location. The fix expands the GPU specific ScanLauncher specialization to handle cases where vectorization is enabled. Previously, the GPU specialization was chosen only if vectorization was not used.	2021-04-08 15:14:48 +00:00
Antonio Sanchez	fcb5106c6e	Scaled epsilon the wrong way. Should have been 0.5 to widen the bounds, since this is inverse precision. Setting to 0.5, however, leads to many more failing tests at Google, so reverting to 1 for now.	2021-04-07 15:08:39 -07:00
Christoph Hertzberg	6197ce1a35	Replace `-2147483648` by `-0.0f` or `-0.0` constants (this should fix #2189). Also, remove unnecessary `pgather` operations.	2021-04-07 11:25:27 +00:00
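
Commit fc2cc10842 above mentions switching `complex_divide_stable` to Smith's algorithm. For readers unfamiliar with it, the following is a minimal sketch of the textbook form of that algorithm, not Eigen's actual code. The function name `smith_divide` is hypothetical and chosen only for illustration. The idea is to scale by the ratio of the smaller to the larger component of the divisor, so that the squared magnitude of the divisor is never formed explicitly.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper (not part of Eigen): computes a / b using the
// textbook form of Smith's algorithm.
template <typename T>
std::complex<T> smith_divide(const std::complex<T>& a, const std::complex<T>& b) {
  const T ar = a.real(), ai = a.imag();
  const T br = b.real(), bi = b.imag();
  if (std::abs(br) >= std::abs(bi)) {
    // |Re(b)| dominates: scale by r = Im(b) / Re(b), so |r| <= 1.
    const T r = bi / br;
    const T d = br + bi * r;
    return std::complex<T>((ar + ai * r) / d, (ai - ar * r) / d);
  } else {
    // |Im(b)| dominates: scale by r = Re(b) / Im(b), so |r| < 1.
    const T r = br / bi;
    const T d = br * r + bi;
    return std::complex<T>((ar * r + ai) / d, (ai * r - ar) / d);
  }
}
```

As a usage illustration, `smith_divide(std::complex<double>(1, 2), std::complex<double>(3, 4))` yields approximately `0.44 + 0.08i`. Because `br * br + bi * bi` is never computed, a subnormal divisor does not underflow to a zero denominator, assuming subnormals are not flushed to zero.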

11476 Commits