eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-10-17 10:31:28 +08:00

Author	SHA1	Message	Date
Antonio Sanchez	634bd79b0e	Fix unused warning on new `dense_assignment_loop` impl.	2020-12-07 19:14:21 -08:00
Antonio Sanchez	655c3a4042	Add specialization for compile-time zero-sized dense assignment. In the current `dense_assignment_loop` implementations, if the destination's inner or outer size is zero at compile time and if the kernel involves a product, we currently get a compile error (#2080). This is triggered by attempting to multiply a non-existent row by a column (or vice-versa). To address this, we add a specialization for zero-sized assignments (`AllAtOnceTraversal`) which evaluates to a no-op. We also add a static check to ensure the size is in fact zero. This now seems to be the only existing use of `AllAtOnceTraversal`. Fixes #2080.	2020-12-07 08:38:43 -08:00
Gael Guennebaud	83309068b4	bug #1680 : improve MSVC inlining by declaring many trivial constructors and accessors as STRONG_INLINE.	2019-02-15 16:35:35 +01:00
Gael Guennebaud	6a510fe69c	Make MaxPacketSize a true upper bound, even for fixed-size inputs	2018-11-16 11:25:32 +01:00
Mark D Ryan	670d56441c	PR 544: Set requestedAlignment correctly for SliceVectorizedTraversals. Commit aa110e681b8b2237757a652ba47da49e1fbd2cd6 optimised the multiplication of small dynamically sized matrices by restricting the packet size to a maximum of 4, increasing the chances that SIMD instructions are used in the computation. However, it introduced a mismatch between the packet size and the requestedAlignment. This mismatch can lead to crashes when the destination is not aligned. This patch fixes the issue by ensuring that the AssignmentTraits are correctly computed when using a restricted packet size. * * * Bind LinearPacketType to MaxPacketSize. This commit applies any packet size limit specified when instantiating copy_using_evaluator_traits to the LinearPacketType, provided that the size of the destination is not known at compile time. * * * Add unit test for restricted packet assignment. A new unit test is added to check that multiplication of small dynamically sized matrices works correctly when the packet size is restricted to 4 and the destination is unaligned.	2018-11-13 16:15:08 +01:00
Gael Guennebaud	7fddc6a51f	typo	2018-11-14 14:43:18 +01:00
Mark D Ryan	aa110e681b	PR 526: Speed up multiplication of small, dynamically sized matrices. The Packet16f, Packet8f and Packet8d types are too large to use with dynamically sized matrices typically processed by the SliceVectorizedTraversal specialization of the dense_assignment_loop. Using these types is likely to lead to little or no vectorization. Significant slowdown in the multiplication of these small matrices can be observed when building with AVX and AVX512 enabled. This patch introduces a new dense_assignment_kernel that is used when computing small products whose operands have dynamic dimensions. It ensures that the PacketSize used is no larger than 4, thereby increasing the chance that vectorized instructions will be used when computing the product. I tested all 969 possible combinations of M, K, and N that are handled by the dense_assignment_loop on x86 builds. Although a few combinations are slowed down by this patch, they are far outnumbered by the cases that are sped up, as the following results demonstrate. Disabling Packet8d on AVX512 builds: Total Cases: 969; Better: 511; Worse: 85; Same: 373; Max Improvement: 169.00% (4 8 6); Max Degradation: 36.50% (8 5 3); Median Improvement: 35.46%; Median Degradation: 17.41%; Total FLOPs Improvement: 19.42%. Disabling Packet16f and Packet8f on AVX512 builds: Total Cases: 969; Better: 658; Worse: 5; Same: 306; Max Improvement: 214.05% (8 6 5); Max Degradation: 22.26% (16 2 1); Median Improvement: 60.05%; Median Degradation: 13.32%; Total FLOPs Improvement: 59.58%. Disabling Packet8f on AVX builds: Total Cases: 969; Better: 663; Worse: 96; Same: 210; Max Improvement: 155.29% (4 10 5); Max Degradation: 35.12% (8 3 2); Median Improvement: 34.28%; Median Degradation: 15.05%; Total FLOPs Improvement: 26.02%.	2018-10-12 15:20:21 +02:00
Gael Guennebaud	371068992a	Add more debug output	2018-09-21 14:32:39 +02:00
luz.paz	e3912f5e63	Misc. source and comment typos. Found using `codespell` and `grep` from downstream FreeCAD	2018-03-11 10:01:44 -04:00
Gael Guennebaud	3587e481fb	Silence MSVC warning	2017-11-27 21:53:02 +01:00
Gael Guennebaud	b651ce0ffa	Fix a gcc7 warning: Wint-in-bool-context	2017-06-26 09:58:28 +02:00
Alexander Neumann	dd58462e63	fixed inlining issue with clang-cl on visual studio (grafted from 7962ac1a5855e8b7a60d5d90e61365b71f5501a5 )	2017-02-08 23:50:38 +01:00
Gael Guennebaud	4a351be163	Fix warning	2017-01-27 11:59:35 +01:00
Gael Guennebaud	ba3f977946	bug #1376 : add missing assertion on size mismatch with compound assignment operators (e.g., mat += mat.col(j))	2017-01-23 22:06:08 +01:00
Gael Guennebaud	0db6d5b3f4	bug #1356 : fix calls to evaluator::coeffRef(0,0) to get the address of the destination by adding a dstDataPtr() member to the kernel. This fixes undefined behavior if dst is empty (nullptr).	2016-12-05 15:08:09 +01:00
Gael Guennebaud	465ede0f20	Fix compilation issue in mat = permutation (regression introduced in 8193ffb3d38b56c9295f204dc57dc6bac74f58aa )	2016-11-20 09:41:37 +01:00
Gael Guennebaud	8193ffb3d3	bug #1343 : fix compilation regression in mat+=selfadjoint_view. Generic EigenBase2EigenBase assignment was incomplete.	2016-11-18 10:17:34 +01:00
Gael Guennebaud	3ecb343dc3	Fix regression in X = (X*X.transpose())/s with X rectangular by deferring resizing of the destination after the creation of the evaluator of the source expression.	2016-10-26 22:50:41 +02:00
Robert Lukierski	a94791b69a	Fixes for min and abs after Benoit's comments, switched to numext.	2016-10-13 15:00:22 +01:00
Robert Lukierski	471075f7ad	Fixes min() warnings.	2016-10-12 18:59:05 +01:00
Robert Lukierski	86711497c4	Adding EIGEN_DEVICE_FUNC in the Geometry module. Additional CUDA necessary fixes in the Core (mostly usage of EIGEN_USING_STD_MATH).	2016-10-12 16:35:17 +01:00
Gael Guennebaud	4057f9b1fc	Enable slice-vectorization+inner-unrolling when unaligned vectorization is allowed. For instance, this permits to vectorize 5x5 matrices (including product)	2016-07-28 13:47:33 +02:00
Gael Guennebaud	66917299a9	Add debug output	2016-07-06 22:27:15 +02:00
Gael Guennebaud	367ef66af3	Re-enable some specializations for Assignment<.,Product<>>	2016-07-05 22:58:14 +02:00
Gael Guennebaud	91b3039013	Change the semantics of the last template parameter of Assignment from "Scalar" to "SFINAE" only. The previous "Scalar" semantics were obsolete since we allow for different scalar types in the source and destination expressions. One can still specialize on scalar types through SFINAE and/or assignment functor.	2016-07-04 11:02:00 +02:00
Gael Guennebaud	101ea26f5e	Include the cost of stores in unrolling (also fix infinite unrolling with expression costing 0 like Constant)	2016-06-15 00:01:16 +02:00
Gael Guennebaud	66e99ab6a1	Relax mixing-type constraints for binary coefficient-wise operators: - Replace internal::scalar_product_traits<A,B> by Eigen::ScalarBinaryOpTraits<A,B,OP> - Remove the "functor_is_product_like" helper (was pretty ugly) - Currently, OP is not used, but it is available to the user for fine grained tuning - Currently, only the following operators have been generalized: *,/,+,-,=,*=,/=,+=,-= - TODO: generalize all other binary operators (comparisons,pow,etc.) - TODO: handle "scalar op array" operators (currently only * is handled) - TODO: move the handling of the "void" scalar type to ScalarBinaryOpTraits	2016-06-06 15:11:41 +02:00
Gael Guennebaud	37197b602b	Remove debugging code.	2016-05-26 11:53:10 +02:00
Gael Guennebaud	27f0434233	Introduce internal's UIntPtr and IntPtr types for pointer to integer conversions. This fixes "conversion from pointer to same-sized integral type" warnings by ICC. Ideally, we would use the std::[u]intptr_t types all the time, but since they are C99/C++11 only, let's be safe.	2016-05-26 10:52:12 +02:00
Gael Guennebaud	e68e165a23	bug #256 : enable vectorization with unaligned loads/stores. This concerns all architectures and all sizes. This new behavior can be disabled by defining EIGEN_UNALIGNED_VECTORIZE=0	2016-05-24 21:54:03 +02:00
Gael Guennebaud	64bb7576eb	Clean propagation of Dest/Src alignments.	2016-05-24 17:12:12 +02:00
Christoph Hertzberg	33ca7e3c8d	bug #1207 : Add and fix logical-op warnings	2016-05-11 19:36:34 +02:00
Gael Guennebaud	b1bd53aa6b	Fix performance regression: with AVX, unaligned stores were emitted instead of aligned ones for fixed size assignment.	2016-05-01 23:25:06 +02:00
Gael Guennebaud	06447e0a39	Improve half-packet vectorization logic to distinguish linear versus inner traversal modes.	2016-04-13 18:15:49 +02:00
Gael Guennebaud	675e0a2224	Fix static/inline keywords order.	2016-04-11 15:06:20 +02:00
Benoit Steiner	bff8cbad06	Removed executable bit from header files	2016-03-23 16:14:23 -07:00
Gael Guennebaud	a4c76f8d34	Improve inlining	2016-02-08 11:33:02 +01:00
Gael Guennebaud	d142165942	bug #667 : declare several critical functions as FORCE_INLINE to make ICC happier.	2016-01-31 16:34:10 +01:00
Gael Guennebaud	8b9dc9f0df	bug #1144 : fix regression in x=y+A*x (aliasing), and move evaluator_traits::AssumeAliasing to evaluator_assume_aliasing.	2016-01-09 08:30:38 +01:00
Gael Guennebaud	30b5c4cd14	Remove useless "explicit", and fix inline/static order.	2015-12-11 10:59:39 +01:00
Gael Guennebaud	8531304858	Simplify cost computations based on HugeCost being smaller than unrolling limit	2015-10-28 13:39:02 +01:00
Gael Guennebaud	77ff3386b7	Refactoring of the cost model: - Dynamic is now an invalid value - introduce a HugeCost constant to be used for runtime-cost values or arbitrarily huge cost - add sanity checks for cost values: must be >=0 and not too large This change provides several benefits: - it fixes shortcomings in some cost computations where the Dynamic case was not properly handled. - it simplifies cost computation logic, and should avoid future similar shortcomings. - it allows to distinguish between different levels of dynamic/huge/infinite cost - it should enable further simplifications in the computation of costs (save compilation time)	2015-10-28 11:42:14 +01:00
Gael Guennebaud	12f50a4697	Fix assign vectorization logic with respect to fixed outer-stride	2015-10-27 11:04:19 +01:00
Gael Guennebaud	0fc8954282	Improve readability of EIGEN_DEBUG_ASSIGN mode.	2015-10-27 10:38:49 +01:00
Gael Guennebaud	5cc7251188	Some cleaning in evaluators	2015-10-08 15:22:04 +02:00
Gael Guennebaud	aba1eda71e	Help clang to inline some functions, thus fixing some regressions	2015-10-07 15:44:12 +02:00
Gael Guennebaud	41cc1f9033	Remove debugging prod() and lazyprod() function, plus some cleaning in noalias assignment	2015-10-07 15:41:22 +02:00
Gael Guennebaud	26cde4db3c	Define Permutation*<>::Scalar to 'void', re-enable scalar type compatibility check in assignment while relaxing this test for void types.	2015-10-06 17:18:06 +02:00
Gael Guennebaud	fb51bab272	Some cleaning	2015-10-06 17:14:56 +02:00
Gael Guennebaud	92b9f0e102	Cleaning pass on evaluators: remove the useless and error prone evaluator<>::type indirection.	2015-09-02 21:38:40 +02:00


117 Commits