eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-05 20:55:12 +08:00

Author	SHA1	Message	Date
Antonio Sánchez	38fcedaf8e	Fix pexp complex test edge-cases.	2024-03-04 17:44:38 +00:00
Charles Schlosser	8a4118746e	fix exp complex test: use int instead of index	2024-02-17 03:55:32 +00:00
Charles Schlosser	18a161bf17	fix pexp_complex_test	2024-02-17 03:08:23 +00:00
Damiano Franzò	be06c9ad51	Implement float pexp_complex	2024-02-17 00:26:57 +00:00
Antonio Sánchez	f40ad38fda	Fix failure on ARM with latest compilers.	2024-02-14 23:00:56 +00:00
Antonio Sánchez	6ea33f95df	Eliminate warning about writing bytes directly to non-trivial type.	2024-02-12 23:27:48 +00:00
Antonio Sánchez	7b87b21910	Fix UB in bool packetmath test.	2024-02-09 19:46:45 +00:00
Antonio Sánchez	a9ddab3e06	Fix a bunch of ODR violations.	2024-01-30 22:38:43 +00:00
Damiano Franzò	7fd7a3f946	Implement plog_complex	2024-01-30 19:06:05 +00:00
Antonio Sánchez	46e9cdb7fe	Clang-format tests, examples, libraries, benchmarks, etc.	2023-12-05 21:22:55 +00:00
Charles Schlosser	81b48065ea	Fix arm32 float division and related bugs	2023-08-29 00:36:07 +00:00
Pedro Gonnet	17b5b4de58	Add `Packet4ui`, `Packet8ui`, and `Packet4ul` to the `SSE`/`AVX` `PacketMath.h` headers	2023-04-17 23:33:59 +00:00
Antonio Sánchez	394aabb0a3	Fix failing MSVC tests due to compiler bugs.	2023-03-10 22:36:57 +00:00
Rasmus Munk Larsen	ce62177b5b	Vectorize atanh & add a missing definition and unit test for atan.	2023-02-21 03:14:05 +00:00
Antonio Sánchez	8588d8c74b	Correct pnegate for floating-point zero.	2022-11-15 18:07:23 +00:00
Rasmus Munk Larsen	97e0784dc6	Vectorize the sign operator in Eigen.	2022-08-09 19:54:57 +00:00
Antonio Sánchez	39d22ef46b	Fix flaky packetmath_1 test.	2022-08-02 17:42:45 +00:00
Chip Kerchner	84cf3ff18d	Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial.	2022-06-27 19:18:00 +00:00
Erik Schultheis	421cbf0866	Replace Eigen type metaprogramming with corresponding std types and make use of alias templates	2022-03-16 16:43:40 +00:00
Antonio Sánchez	711803c427	Skip denormal test if `Cond` is false.	2022-03-03 04:32:13 +00:00
Antonio Sánchez	9c07e201ff	Modified sqrt/rsqrt for denormal handling.	2022-03-02 17:20:47 +00:00
Antonio Sánchez	2ed4bee78f	Fix frexp packetmath tests for MSVC.	2022-02-24 22:16:37 +00:00
Antonio Sánchez	3d7e2d0e3e	Fix packetmath compilation error.	2022-02-23 23:27:08 +00:00
Antonio Sánchez	8970719771	Fix gcc-5 packetmath_12 bug.	2022-02-23 21:56:25 +00:00
Rasmus Munk Larsen	8b875dbef1	Changes to fast SQRT/RSQRT	2022-02-23 17:32:21 +00:00
Antonio Sánchez	28e008b99a	Fix sqrt/rsqrt for NEON.	2022-02-15 21:31:51 +00:00
Rasmus Munk Larsen	979fdd58a4	Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments.	2022-02-05 00:20:13 +00:00
Rasmus Munk Larsen	7db0ac977a	Remove extraneous ")".	2022-01-27 02:20:03 +00:00
Rasmus Munk Larsen	09c0085a57	Only test pmsub, pnmadd, and pnmsub on signed types.	2022-01-27 02:09:25 +00:00
Rasmus Munk Larsen	8f2c6f0aa6	Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf.	2022-01-26 20:38:05 +00:00
Rasmus Munk Larsen	51311ec651	Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub.	2022-01-26 04:25:41 +00:00
Rasmus Munk Larsen	ea2c02060c	Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512.	2022-01-21 23:49:18 +00:00
Rasmus Munk Larsen	96dc37a03b	Some fixes/cleanups for numeric_limits & fix for related bug in psqrt	2022-01-07 01:10:17 +00:00
Rasmus Munk Larsen	7b5a8b6bc5	Improve plog: 20% speedup for float + handle denormals	2022-01-05 23:40:31 +00:00
Rasmus Munk Larsen	f04fd8b168	Make sure exp(-Inf) is zero for vectorized expressions. This fixes #2385.	2021-12-08 17:57:23 +00:00
Erik Schultheis	f33a31b823	removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks	2021-11-29 19:18:57 +00:00
Kolja Brix	afa616bc9e	Fix some typos found	2021-09-23 15:22:00 +00:00
Antonio Sanchez	dba753a986	Add missing NEON ptranspose implementations. Unified implementation using only `vzip`.	2021-05-25 18:25:35 +00:00
Jakub Lichman	12471fcb5d	predux_half_dowto4 test extended to all applicable packets	2021-05-21 16:42:19 +00:00
Jakub Lichman	8877f8d9b2	ptranspose test for non-square kernels added	2021-05-19 08:26:45 +00:00
Jakub Lichman	d87648a6be	Tests added and AVX512 bug fixed for pcmp_lt_or_nan	2021-04-25 20:58:56 +00:00
Jakub Lichman	1115f5462e	Tests for pcmp_lt and pcmp_le added	2021-04-23 19:51:43 +00:00
Antonio Sanchez	8dfe1029a5	Augment NumTraits with min/max_exponent() again. Replace usage of `std::numeric_limits<...>::min/max_exponent` in codebase where possible. Also replaced some other `numeric_limits` usages in affected tests with the `NumTraits` equivalent. The previous MR !443 failed for c++03 due to lack of `constexpr`. Because of this, we need to keep around the `std::numeric_limits` version in enum expressions until the switch to c++11. Fixes #2148	2021-03-16 20:12:46 -07:00
David Tellenbach	df4bc2731c	Revert "Augment NumTraits with min/max_exponent()." This reverts commit 75ce9cd2a7aefaaea8543e2db14ce4dc149eeb03.	2021-03-17 03:06:08 +01:00
Antonio Sanchez	75ce9cd2a7	Augment NumTraits with min/max_exponent(). Replace usage of `std::numeric_limits<...>::min/max_exponent` in codebase. Also replaced some other `numeric_limits` usages in affected tests with the `NumTraits` equivalent. Fixes #2148	2021-03-17 01:00:41 +00:00
Antonio Sanchez	2468253c9a	Define EIGEN_CPLUSPLUS and replace most __cplusplus checks. The macro `__cplusplus` is not defined correctly in MSVC unless building with the `/Zc:__cplusplus` flag. Instead, it defines `_MSVC_LANG` to the specified c++ standard version number. Here we introduce `EIGEN_CPLUSPLUS` which will contain the c++ version number both for MSVC and otherwise. This simplifies checks for supported features. Also replaced most instances of standard version checking via `__cplusplus` with the existing `EIGEN_COMP_CXXVER` macro for better clarity. Fixes: #2170	2021-03-05 18:33:18 +00:00
Antonio Sanchez	82d61af3a4	Fix rint SSE/NEON again, using optimization barrier. This is a new version of !423, which failed for MSVC. Defined `EIGEN_OPTIMIZATION_BARRIER(X)` that uses inline assembly to prevent operations involving `X` from crossing that barrier. Should work on most `GNUC` compatible compilers (MSVC doesn't seem to need this). This is a modified version adapted from what was used in `psincos_float` and tested on more platforms (see #1674, https://godbolt.org/z/73ezTG). Modified `rint` to use the barrier to prevent the add/subtract rounding trick from being optimized away. Also fixed an edge case for large inputs that get bumped up a power of two and ends up rounding away more than just the fractional part. If we are over `2^digits` then just return the input. This edge case was missed in the test since the test was comparing approximate equality, which was still satisfied. Adding a strict equality option catches it.	2021-03-05 08:54:12 -08:00
Antonio Sánchez	9a663973b4	Revert "Fix rint for SSE/NEON." This reverts commit e72dfeb8b9fa5662831b5d0bb9d132521f9173dd	2021-03-03 18:51:51 +00:00
Antonio Sanchez	e72dfeb8b9	Fix rint for SSE/NEON. It seems sometimes with aggressive optimizations the combination `psub(padd(a, b), b)` trick to force rounding is compiled away. Here we replace with inline assembly to prevent this (I tried `volatile`, but that leads to additional loads from memory). Also fixed an edge case for large inputs `a` where adding `b` bumps the value up a power of two and ends up rounding away more than just the fractional part. If we are over `2^digits` then just return the input. This edge case was missed in the test since the test was comparing approximate equality, which was still satisfied. Adding a strict equality option catches it.	2021-03-03 09:41:46 -08:00
Antonio Sanchez	1e0c7d4f49	Add print for SSE/NEON, use NEON rounding intrinsics if available. In SSE, by adding/subtracting 2^MantissaBits, we force rounding according to the current rounding mode. For NEON, we use the provided intrinsics for rint/floor/ceil if available (armv8). Related to #1969.	2021-02-27 22:42:07 +00:00
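
The rint commits above (1e0c7d4f49, e72dfeb8b9, 82d61af3a4) describe forcing rounding by adding and then subtracting a power of two, and returning large inputs unchanged. The following is a minimal scalar sketch of that idea, not Eigen's packet implementation. The name `rint_via_add_sub` is hypothetical, the threshold is assumed to be 2^23 (the number of explicit mantissa bits of `float`), and a `volatile` temporary stands in for the `EIGEN_OPTIMIZATION_BARRIER` macro mentioned in the commit message, which notes that `volatile` costs extra loads.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper for illustration only; not part of Eigen.
// Rounds to the nearest integer under the current rounding mode
// by adding and subtracting 2^23.
inline float rint_via_add_sub(float a) {
  // Assumed threshold: 2^23. From this magnitude on, every float is
  // already an integer, so there is no fractional part to round away.
  const float limit =
      static_cast<float>(1 << (std::numeric_limits<float>::digits - 1));
  const float abs_a = std::fabs(a);

  // Large inputs, infinities, and NaN are returned unchanged.
  if (!(abs_a < limit)) return a;

  // In [2^23, 2^24) the spacing between floats is exactly 1, so the
  // addition itself performs the rounding. The volatile store keeps the
  // compiler from simplifying (abs_a + limit) - limit back to abs_a.
  volatile float shifted = abs_a + limit;
  const float rounded = shifted - limit;

  // Restore the sign, which also preserves negative zero.
  return std::copysign(rounded, a);
}
```

Comparing the result with strict equality rather than approximate equality is what exposes the large-input case the commit messages mention, since an off-by-one result near 2^23 still passes an approximate check.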


249 Commits