eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-16 18:11:47 +08:00

Author	SHA1	Message	Date
Antonio Sanchez	1217390db4	Fix windows+CUDA builds	2023-10-25 20:55:59 +00:00
Antonio Sanchez	ac561cd038	Reduce tensor_contract_gpu test. The original test times out after 60 minutes on Windows, even when setting flags to optimize for speed. Reducing the number of contractions performed from 3600->27 for subtests 8,9 allows the two to run in just over a minute each. (cherry picked from commit be9e7d205f38e3e8effdfdded88817b371673930)	2023-07-11 11:27:31 -07:00
Antonio Sanchez	89a71f3126	Fix gpu special function tests. Some checks used incorrect values, partly from copy-paste errors, partly from the change in behaviour introduced in !398. Modified results to match scipy, simplified tests by updating `VERIFY_IS_CWISE_APPROX` to work for scalars. (cherry picked from commit 701f5d1c91c770e558c7760da14ff3365757e275)	2023-07-10 15:57:08 -07:00
Antonio Sanchez	a605d6b996	Rename EIGEN_CUDA_FLAGS to EIGEN_CUDA_CXX_FLAGS Also add a missing space for clang. (cherry picked from commit 846d34384af80b80793d32257a7f917eeece41d4)	2023-07-10 15:30:41 -07:00
Antonio Sanchez	dfcd6de20a	Clean up CUDA CMake files. - Unify test/CMakeLists.txt and unsupported/test/CMakeLists.txt - Added `EIGEN_CUDA_FLAGS` that are appended to the set of flags passed to the cuda compiler (nvcc or clang). The latter is to support passing custom flags (e.g. `-arch=` to nvcc, or to disable cuda-specific warnings). (cherry picked from commit 7b00e8b186a7679b0f46be742809a55d07d4efe8)	2023-07-10 15:30:41 -07:00
Antonio Sánchez	26b8fabd80	Return NaN in ndtri for values outside valid input range. (cherry picked from commit 1f79a6078fb77da47069c8aec23c4e309fb982e2)	2023-07-10 14:52:08 -07:00
Antonio Sánchez	b158fcaa74	Fix edge-case in zeta for large inputs. (cherry picked from commit 9296bb4b933973365d19b4b71e7d2b205d00a1ad)	2023-07-07 15:06:18 -07:00
Antonio Sánchez	b30a2a527e	Remove poor non-convergence checks in NonLinearOptimization. (cherry picked from commit d819a33bf64c4fce95c55f8e44a68b486f064a79)	2023-07-07 11:50:25 -07:00
Antonio Sanchez	bc1b354b32	Adjust tolerance of matrix_power test for MSVC. (cherry picked from commit 1c2690ed248327539f7a248ddb12e1690da81b68)	2023-07-07 11:50:02 -07:00
Antonio Sánchez	36be6747e0	Modify test expression to avoid numerical differences (#2402). (cherry picked from commit ae86a146b1ac9a49bf72e485254c08d237fd094a)	2023-07-07 11:45:56 -07:00
Antonio Sanchez	0ab1f8ec03	Fix broadcasting oob error. For vectorized 1-dimensional inputs that do not take the special blocking path (e.g. `std::complex<...>`), there was an index-out-of-bounds error causing the broadcast size to be computed incorrectly. Here we fix this, and make other minor cleanup changes. Fixes #2351. (cherry picked from commit a500da1dc089b08e2f2b3b05a2eb23194425460e)	2021-11-03 23:30:47 +00:00
Maxiwell S. Garcia	b8cf1ed753	Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h (cherry picked from commit 09fc0f97b53e22d8fef94acf0fbfeed3717ab906)	2021-09-01 17:26:59 +00:00
jenswehner	338924602d	added includes for unordered_map (cherry picked from commit e3e74001f7c4bf95f0dde572e8a08c5b2918a3ab)	2021-08-10 16:10:03 +00:00
Antonio Sanchez	46ecdcd745	Fix MPReal detection and support. The latest version of `mpreal` has a bug that breaks `min`/`max`. It also breaks with the latest dev version of `mpfr`. Here we add `FindMPREAL.cmake` which searches for the library and tests if compilation works. Removed our internal copy of `mpreal.h` under `unsupported/test`, as it is out-of-sync with the latest, and similarly breaks with the latest `mpfr`. It would be best to use the installed version of `mpreal` anyways, since that's what we actually want to test. Fixes #2282. (cherry picked from commit 31f796ebef35eeadd0e26878aab3fe99ca412a45)	2021-08-03 18:13:12 +00:00
Antonio Sanchez	bb33880e57	Fix TriSycl CMake files. This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was broken by commit 00f32752, which modified `add_sycl_to_target` for ComputeCPP. This makes the corresponding modifications for trisycl to make them consistent. Also, trisycl now requires c++17. (cherry picked from commit 8cf6cb27baa9607cc00e5dbb42a1c31efda41b74)	2021-08-03 17:25:17 +00:00
Jonas Harsch	601814b575	Don't crash when attempting to shuffle an empty tensor. (cherry picked from commit aab747021be5ed1a1e9667243d884eb72003599d)	2021-07-02 21:08:38 +00:00
Rohit Santhanam	cbb6ae6296	Removed dead code from GPU float16 unit test. (cherry picked from commit c8d40a7bf1915015c991b108cf2cd6a32138fdc8)	2021-06-10 17:16:47 +00:00
Antonio Sanchez	69adf26aa3	Modify googlehash use to account for namespace issues. The namespace declaration for googlehash is a configurable macro that can be disabled. In particular, it is disabled within google, causing compile errors since `dense_hash_map`/`sparse_hash_map` are then in the global namespace instead of in `::google`. Here we play a bit of gymnastics to allow for both `google::_hash_map` and `_hash_map`, while limiting namespace pollution. Symbols within the `::google` namespace are imported into `Eigen::google`. We also remove checks based on `_SPARSE_HASH_MAP_H_`, as this is fragile, and instead require `EIGEN_GOOGLEHASH_SUPPORT` to be defined.	2021-04-12 19:00:39 -07:00
Rohit Santhanam	dfd6720d82	Fix for float16 GPU unit test.	2021-04-12 10:19:06 +00:00
Jens Wehner	c0a889890f	Fixed output of complex matrices	2021-03-15 21:51:55 +00:00
Antonio Sanchez	543e34ab9d	Re-implement move assignments. The original swap approach leads to potential undefined behavior (reading uninitialized memory) and results in unnecessary copying of data for static storage. Here we pass down the move assignment to the underlying storage. Static storage does a one-way copy, dynamic storage does a swap. Modified the tests to no longer read from the moved-from matrix/tensor, since that can lead to UB. Added a test to ensure we do not access uninitialized memory in a move. Fixes: #2119	2021-03-10 16:55:20 +00:00
Antonio Sanchez	2468253c9a	Define EIGEN_CPLUSPLUS and replace most __cplusplus checks. The macro `__cplusplus` is not defined correctly in MSVC unless building with the `/Zc:__cplusplus` flag. Instead, it defines `_MSVC_LANG` to the specified c++ standard version number. Here we introduce `EIGEN_CPLUSPLUS` which will contain the c++ version number both for MSVC and otherwise. This simplifies checks for supported features. Also replaced most instances of standard version checking via `__cplusplus` with the existing `EIGEN_COMP_CXXVER` macro for better clarity. Fixes: #2170	2021-03-05 18:33:18 +00:00
Jens Wehner	4bfcee47b9	Idrs iterative linear solver	2021-02-27 12:09:33 +00:00
Rasmus Munk Larsen	f284c8592b	Don't crash when attempting to slice an empty tensor.	2021-02-24 18:12:51 -08:00
Antonio Sanchez	119763cf38	Eliminate CMake FindPackageHandleStandardArgs warnings. CMake complains that the package name does not match when the case differs, e.g.: ``` CMake Warning (dev) at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message): The package name passed to `find_package_handle_standard_args` (UMFPACK) does not match the name of the calling package (Umfpack). This can lead to problems in calling code that expects `find_package` result variables (e.g., `_FOUND`) to follow a certain pattern. Call Stack (most recent call first): cmake/FindUmfpack.cmake:50 (find_package_handle_standard_args) bench/spbench/CMakeLists.txt:24 (find_package) This warning is for project developers. Use -Wno-dev to suppress it. ``` Here we rename the libraries to match their true cases.	2021-02-24 09:52:05 +00:00
Antonio Sanchez	5f9cfb2529	Add missing adolc isinf/isnan. Also modified cmake/FindAdolc.cmake to eliminate warnings, and added search paths to match install layout. Fixed: #2157	2021-02-19 22:26:56 +00:00
frgossen	33e0af0130	Return nan at poles of polygamma, digamma, and zeta if limit is not defined	2021-02-19 16:35:11 +00:00
Gmc2	a4edb1079c	fix test of ExtractVolumePatchesOp	2021-01-25 03:23:46 +00:00
Alexander Grund	cf0b5b0344	Remove code checking for CMake < 3.5 As the CMake version is at least 3.5 the code checking for earlier versions can be removed.	2020-12-14 09:57:44 +00:00
Antonio Sanchez	e2f21465fe	Special function implementations for half/bfloat16 packets. Current implementations fail to consider half-float packets, only half-float scalars. Added specializations for packets on AVX, AVX512 and NEON. Added tests to `special_packetmath`. The current `special_functions` tests would fail for half and bfloat16 due to lack of precision. The NEON tests also fail with precision issues and due to different handling of `sqrt(inf)`, so special functions bessel, ndtri have been disabled. Tested with AVX, AVX512.	2020-12-04 10:16:29 -08:00
Rasmus Munk Larsen	71c85df4c1	Clean up the Tensor header and get rid of the EIGEN_SLEEP macro.	2020-12-02 11:04:04 -08:00
Antonio Sanchez	22f67b5958	Fix boolean float conversion and product warnings. This fixes some gcc warnings such as: ``` Eigen/src/Core/GenericPacketMath.h:655:63: warning: implicit conversion turns floating-point number into bool: 'typename __gnu_cxx::__enable_if<__is_integer<bool>::__value, double>::__type' (aka 'double') to 'bool' [-Wimplicit-conversion-floating-point-to-bool] Packet psqrt(const Packet& a) { EIGEN_USING_STD(sqrt); return sqrt(a); } ``` Details: - Added `scalar_sqrt_op<bool>` (`-Wimplicit-conversion-floating-point-to-bool`). - Added `scalar_square_op<bool>` and `scalar_cube_op<bool>` specializations (`-Wint-in-bool-context`) - Deprecated above specialized ops for bool. - Modified `cxx11_tensor_block_eval` to specialize generator for booleans (`-Wint-in-bool-context`) and to use `abs` instead of `square` to avoid deprecated bool ops.	2020-11-24 20:20:36 +00:00
Antonio Sanchez	a8fdcae55d	Fix sparse_extra_3, disable counting temporaries for testing DynamicSparseMatrix. Multiplication of column-major `DynamicSparseMatrix`es involves three temporaries: - two for transposing twice to sort the coefficients (`ConservativeSparseSparseProduct.h`, L160-161) - one for a final copy assignment (`SparseAssign.h`, L108) The latter is avoided in an optimization for `SparseMatrix`. Since `DynamicSparseMatrix` is deprecated in favor of `SparseMatrix`, it's not worth the effort to optimize further, so I simply disabled counting temporaries via a macro. Note that due to the inclusion of `sparse_product.cpp`, the `sparse_extra` tests actually re-run all the original `sparse_product` tests as well. We may want to simply drop the `DynamicSparseMatrix` tests altogether, which would eliminate the test duplication. Related to #2048	2020-11-18 23:15:33 +00:00
Antonio Sanchez	17268b155d	Add bit_cast for half/bfloat to/from uint16_t, fix TensorRandom The existing `TensorRandom.h` implementation makes the assumption that `half` (`bfloat16`) has a `uint16_t` member `x` (`value`), which is not always true. This currently fails on arm64, where `x` has type `__fp16`. Added `bit_cast` specializations to allow casting to/from `uint16_t` for both `half` and `bfloat16`. Also added tests in `half_float`, `bfloat16_float`, and `cxx11_tensor_random` to catch these errors in the future.	2020-11-18 20:32:35 +00:00
Antonio Sanchez	852513e7a6	Disable testing of OpenGL by default. The `OpenGLSupport` module contains mostly deprecated features, and the test is highly GL context-dependent, relies on deprecated GLUT, and requires a display. Until the module is updated to support modern OpenGL and the test to use newer windowing frameworks (e.g. GLFW) it's probably best to disable the test by default. The test can be enabled with `cmake -DEIGEN_TEST_OPENGL=ON`. See #2053 for more details.	2020-11-12 16:15:40 -08:00
Antonio Sanchez	6961468915	Address issues with `openglsupport` test. The existing test fails on several systems due to GL runtime version mismatches, the use of deprecated features, and memory errors due to improper use of GLUT. The test was modified to: - Run within a display function, allowing proper GLUT cleanup. - Generate dynamic shaders with a supported GLSL version string and output variables. - Report shader compilation errors. - Check GL context version before launching version-specific tests. Note that most of the existing `OpenGLSupport` module and tests rely on deprecated features (e.g. fixed-function pipeline). The test was modified to allow it to pass on various systems. We might want to consider removing the module or re-writing it entirely to support modern OpenGL. This is beyond the scope of this patch. Testing of legacy GL (for platforms that support it) can be enabled by defining `EIGEN_LEGACY_OPENGL`. Otherwise, the test will try to create a modern context. Tested on - MacBook Air (2019), macOS Catalina 10.15.7 (OpenGL 2.1, 4.1) - Debian 10.6, NVidia Quadro K1200 (OpenGL 3.1, 3.3)	2020-11-11 15:54:43 -08:00
Deven Desai	9d11e2c03e	CMakefile update for ROCm 4.0 Starting with ROCm 4.0, the `hipconfig --platform` command will return `amd` (prior return value was `hcc`). Updating the CMakeLists.txt files in the test dirs to account for this change.	2020-10-29 18:06:31 +00:00
Rasmus Munk Larsen	274ef12b61	Remove leftover debug print statement in cxx11_tensor_expr.cpp	2020-10-14 22:59:51 +00:00
Rasmus Munk Larsen	c6953f799b	Add packet generic ops `predux_fmin`, `predux_fmin_nan`, `predux_fmax`, and `predux_fmax_nan` that implement reductions with `PropagateNaN`, and `PropagateNumbers` semantics. Add (slow) generic implementations for most reductions.	2020-10-13 21:48:31 +00:00
Rasmus Munk Larsen	b431024404	Don't make assumptions about NaN-propagation for pmin/pmax - it varies across platforms. Change test to only test for NaN-propagation for pfmin/pfmax.	2020-10-07 19:05:18 +00:00
David Tellenbach	d4a727d092	Disable min/max NaN propagation in test cxx11_tensor_expr The current pmin/pmax implementation for Arm Neon propagates NaNs differently than std::min/std::max. See issue https://gitlab.com/libeigen/eigen/-/issues/1937	2020-08-14 16:16:27 +00:00
Rasmus Munk Larsen	ac2eca6b11	Update tensor reduction test to avoid undefined division of bfloat16 by int.	2020-07-22 00:35:51 +00:00
Antonio Sanchez	9cb8771e9c	Fix tensor casts for large packets and casts to/from std::complex The original tensor casts were only defined for `SrcCoeffRatio`:`TgtCoeffRatio` 1:1, 1:2, 2:1, 4:1. Here we add the missing 1:N and 8:1. We also add casting `Eigen::half` to/from `std::complex<T>`, which was missing to make it consistent with `Eigen::bfloat16`, and generalize the overload to work for any complex type. Tests were added to `basicstuff`, `packetmath`, and `cxx11_tensor_casts` to test all cast configurations.	2020-06-30 18:53:55 +00:00
Teng Lu	386d809bde	Support BFloat16 in Eigen	2020-06-20 19:16:24 +00:00
Antonio Sanchez	a7d2552af8	Remove HasCast and fix packetmath cast tests. The use of the `packet_traits<>::HasCast` field is currently inconsistent with `type_casting_traits<>`, and is unused apart from within `test/packetmath.cpp`. In addition, those packetmath cast tests do not currently reflect how casts are performed in practice: they ignore the `SrcCoeffRatio` and `TgtCoeffRatio` fields, assuming a 1:1 ratio. Here we remove the unused `HasCast`, and modify the packet cast tests to better reflect their usage.	2020-06-11 17:26:56 +00:00
Thales Sabino	1fcaaf460f	Update FindComputeCpp.cmake to fix build problems on Windows - Use standard types in SYCL/PacketMath.h to avoid compilation problems on Windows - Add EIGEN_HAS_CONSTEXPR to cxx11_tensor_argmax_sycl.cpp to fix build problems on Windows	2020-06-05 20:51:20 +00:00
Antonio Sánchez	8719b9c5bc	Disable test for 32-bit systems (e.g. ARM, i386) Both i386 and 32-bit ARM do not define __uint128_t. On most systems, if __uint128_t is defined, then so is the macro __SIZEOF_INT128__. https://stackoverflow.com/questions/18531782/how-to-know-if-uint128-t-is-defined1	2020-05-28 17:40:15 +00:00
Rasmus Munk Larsen	ab773c7e91	Extend support for Packet16b: * Add ptranspose<,4> to support matmul and add unit test for Matrix<bool> Matrix<bool> * work around a bug in slicing of Tensor<bool>. * Add tensor tests This speeds up matmul for boolean matrices by about 10x name old time/op new time/op delta BM_MatMul<bool>/8 267ns ± 0% 479ns ± 0% +79.25% (p=0.008 n=5+5) BM_MatMul<bool>/32 6.42µs ± 0% 0.87µs ± 0% -86.50% (p=0.008 n=5+5) BM_MatMul<bool>/64 43.3µs ± 0% 5.9µs ± 0% -86.42% (p=0.008 n=5+5) BM_MatMul<bool>/128 315µs ± 0% 44µs ± 0% -85.98% (p=0.008 n=5+5) BM_MatMul<bool>/256 2.41ms ± 0% 0.34ms ± 0% -85.68% (p=0.008 n=5+5) BM_MatMul<bool>/512 18.8ms ± 0% 2.7ms ± 0% -85.53% (p=0.008 n=5+5) BM_MatMul<bool>/1k 149ms ± 0% 22ms ± 0% -85.40% (p=0.008 n=5+5)	2020-04-28 16:12:47 +00:00
Rasmus Munk Larsen	2f6ddaa25c	Add partial vectorization for matrices and tensors of bool. This speeds up boolean operations on Tensors by up to 25x. Benchmark numbers for the logical and of two NxN tensors: name old time/op new time/op delta BM_booleanAnd_1T/3 [using 1 threads] 14.6ns ± 0% 14.4ns ± 0% -0.96% BM_booleanAnd_1T/4 [using 1 threads] 20.5ns ±12% 9.0ns ± 0% -56.07% BM_booleanAnd_1T/7 [using 1 threads] 41.7ns ± 0% 10.5ns ± 0% -74.87% BM_booleanAnd_1T/8 [using 1 threads] 52.1ns ± 0% 10.1ns ± 0% -80.59% BM_booleanAnd_1T/10 [using 1 threads] 76.3ns ± 0% 13.8ns ± 0% -81.87% BM_booleanAnd_1T/15 [using 1 threads] 167ns ± 0% 16ns ± 0% -90.45% BM_booleanAnd_1T/16 [using 1 threads] 188ns ± 0% 16ns ± 0% -91.57% BM_booleanAnd_1T/31 [using 1 threads] 667ns ± 0% 34ns ± 0% -94.83% BM_booleanAnd_1T/32 [using 1 threads] 710ns ± 0% 35ns ± 0% -95.01% BM_booleanAnd_1T/64 [using 1 threads] 2.80µs ± 0% 0.11µs ± 0% -95.93% BM_booleanAnd_1T/128 [using 1 threads] 11.2µs ± 0% 0.4µs ± 0% -96.11% BM_booleanAnd_1T/256 [using 1 threads] 44.6µs ± 0% 2.5µs ± 0% -94.31% BM_booleanAnd_1T/512 [using 1 threads] 178µs ± 0% 10µs ± 0% -94.35% BM_booleanAnd_1T/1k [using 1 threads] 717µs ± 0% 78µs ± 1% -89.07% BM_booleanAnd_1T/2k [using 1 threads] 2.87ms ± 0% 0.31ms ± 1% -89.08% BM_booleanAnd_1T/4k [using 1 threads] 11.7ms ± 0% 1.9ms ± 4% -83.55% BM_booleanAnd_1T/10k [using 1 threads] 70.3ms ± 0% 17.2ms ± 4% -75.48%	2020-04-20 20:16:28 +00:00
Aaron Franke	5c22c7a7de	Make file formatting comply with POSIX and Unix standards UTF-8, LF, no BOM, and newlines at the end of files	2020-03-23 18:09:02 +00:00
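
Commit 2468253c9a above describes introducing `EIGEN_CPLUSPLUS` because MSVC does not report `__cplusplus` correctly without `/Zc:__cplusplus` and provides `_MSVC_LANG` instead. A minimal sketch of that idea follows. The names `SKETCH_CPLUSPLUS`, `sketch_has_cxx11`, and `sketch_has_cxx17` are hypothetical and chosen for illustration; this is not Eigen's actual definition.

```cpp
// Illustrative only: SKETCH_CPLUSPLUS and the sketch_has_* helpers are
// hypothetical names, not Eigen's actual macros or functions.
#if defined(_MSVC_LANG)
// MSVC reports the selected standard here even without /Zc:__cplusplus.
#define SKETCH_CPLUSPLUS _MSVC_LANG
#elif defined(__cplusplus)
// Other compilers report the standard version through __cplusplus.
#define SKETCH_CPLUSPLUS __cplusplus
#else
#define SKETCH_CPLUSPLUS 0
#endif

// Feature checks can then compare against a single macro on every compiler.
constexpr bool sketch_has_cxx11() { return SKETCH_CPLUSPLUS >= 201103L; }
constexpr bool sketch_has_cxx17() { return SKETCH_CPLUSPLUS >= 201703L; }
```

Checking `_MSVC_LANG` first means the result does not depend on whether the MSVC flag was passed, which is the motivation the commit message gives.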

1163 Commits