eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-06-04 18:54:00 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	4fa40367e9	* Big change in Block and Map: - added a MapBase base xpr on top of which Map and the specialization of Block are implemented - MapBase forces both aligned loads (and aligned stores, see below) in expressions such as "x.block(...) += other_expr" * Significant vectorization improvement: - added a AlignedBit flag meaning the first coeff/packet is aligned, this allows to not generate extra code to deal with the first unaligned part - removed all unaligned stores when no unrolling - removed unaligned loads in Sum when the input as the DirectAccessBit flag * Some code simplification in CacheFriendly product * Some minor documentation improvements	2008-08-09 18:41:24 +00:00
Benoit Jacob	c94be35bc8	introduce copyCoeff and copyPacket methods in MatrixBase, used by Assign, in preparation for new Swap impl reusing Assign code. remove last remnant of old Inverse class in Transform.	2008-08-05 18:00:23 +00:00
Gael Guennebaud	842c4f8bfa	Several compilation fixes for MSVC and NVCC, basically: - added explicit enum to int conversion where needed - if a function is not defined as declared and the return type is "tricky" then the type must be typedefined somewhere. A "tricky return type" can be: * a template class with a default parameter which depends on another template parameter * a nested template class, or type of a nested template class	2008-07-29 16:33:07 +00:00
Gael Guennebaud	b7bd1b3446	Add a very efficient evaluation path for both col-major matrix * vector and vector * row-major products. Currently, it is enabled only is the matrix has DirectAccessBit flag and the product is "large enough". Added the respective unit tests in test/product/cpp.	2008-07-12 12:12:02 +00:00
Benoit Jacob	2b53fd4d53	some performance fixes in Assign.h reported by Gael. Some doc update in Cwise.	2008-07-10 16:15:55 +00:00
Benoit Jacob	a9d319d44f	* do the ActualPacketAccesBit change as discussed on list * add comment in Product.h about CanVectorizeInner * fix typo in test/product.cpp	2008-07-04 12:43:55 +00:00
Gael Guennebaud	027818d739	* added innerSize / outerSize functions to MatrixBase * added complete implementation of sparse matrix product (with a little glue in Eigen/Core) * added an exhaustive bench of sparse products including GMM++ and MTL4 => Eigen outperforms in all transposed/density configurations !	2008-06-28 23:07:14 +00:00
Benoit Jacob	e27b2b95cf	* rework Map, allow vectorization * rework PacketMath and DummyPacketMath, make these actual template specializations instead of just overriding by non-template inline functions * introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix * remove Matrix::map() methods, use Map constructors instead.	2008-06-27 01:22:35 +00:00
Benoit Jacob	25ba9f377c	* add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned) * introduce packet(int), make use of it in linear vectorized paths --> completely fixes the slowdown noticed in benchVecAdd. * generalize coeff(int) to linear-access xprs * clarify the access flag bits * rework api dox in Coeffs.h and util/Constants.h * improve certain expressions's flags, allowing more vectorization * fix bug in Block: start(int) and end(int) returned dyndyn size fix bug in Block: just because the Eval type has packet access doesn't imply the block xpr should have it too.	2008-06-26 16:06:41 +00:00
Gael Guennebaud	ac9aa47bbc	optimize linear vectorization both in Assign and Sum (optimal amortized perf)	2008-06-23 15:50:28 +00:00
Benoit Jacob	dc9206cec5	split sum away from redux and vectorize it. (could come back to redux after it has been vectorized, and could serve as a starting point for that) also make the abs2 functor vectorizable (for real types).	2008-06-23 10:32:48 +00:00
Benoit Jacob	8a967fb17c	* implement slice vectorization. Because it uses unaligned packet access, it is not certain that it will bring a performance improvement: benchmarking needed. * improve logic choosing slice vectorization. * fix typo in SSE packet math, causing crash in unaligned case. * fix bug in Product, causing crash in unaligned case. * add TEST_SSE3 CMake option.	2008-06-22 15:02:05 +00:00
Gael Guennebaud	e735692e37	move "enum" back to "const int" int ei_assign_impl: in fact, casting enums to int is enough to get compile time constants with ICC.	2008-06-20 07:10:50 +00:00
Gael Guennebaud	fb4a151982	* more cleaning in Product * make Matrix2f (and similar) vectorized using linear path * fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4 (cannot get Transform compiles with gcc 3.3/3.4, see the FIXME)	2008-06-19 23:00:51 +00:00
Benoit Jacob	bb1f4e44f1	* Block: row and column expressions in the inner direction now have the Like1D flag. * Big renaming: packetCoeff ---> packet VectorizableBit ---> PacketAccessBit Like1DArrayBit ---> LinearAccessBit	2008-06-16 14:54:31 +00:00
Benoit Jacob	9857764ae7	aaargh.	2008-06-16 11:20:29 +00:00
Benoit Jacob	478bfaf228	fix bug in computation of unrolling limit: div instead of mul	2008-06-16 11:18:59 +00:00
Benoit Jacob	c905b31b42	* Big rework of Assign.h: Much better organization Fix a few bugs Add the ability to unroll only the inner loop Add an unrolled path to the Like1D vectorization. Not well tested. ** Add placeholder for sliced vectorization. Unimplemented. * Rework of corrected_flags: improve rules determining vectorizability for vectors, the storage-order is indifferent, so we tweak it to allow vectorization of row-vectors. * fix compilation in benchmark, and a warning in Transpose.	2008-06-16 10:49:44 +00:00
Gael Guennebaud	6998037930	* move some compile time "if" to their respective unroller (assign and dot) * fix a couple of compilation issues when unrolling is disabled * reduce default unrolling limit to a more reasonable value	2008-06-07 01:07:48 +00:00
Gael Guennebaud	48262b9734	added a static assertion mechanism (see notes in Core/util/StaticAssert.h for details)	2008-06-04 11:16:11 +00:00
Gael Guennebaud	f5e599e489	* replace compile-time-if by meta-selector in Assign.h as it speed up compilation. * fix minor typo introduced in the previous commit	2008-05-31 14:42:07 +00:00
Gael Guennebaud	c1559d3079	* updated the assignement operator macro so that overloads in MatrixBase work * removed product_selector and cleaned Product.h a bit * cleaned Assign.h a bit	2008-05-28 22:56:19 +00:00
Gael Guennebaud	8711e26c8a	* change Flagged to take into account NestByValue only * bugfix in Assign and cache friendly product (weird that worked before) * improved argument evaluation in Product	2008-05-28 22:11:47 +00:00
Benoit Jacob	953efdbfe7	- introduce Part and Extract classes, splitting and extending the former Triangular class - full meta-unrolling in Part - move inverseProduct() to MatrixBase - compilation fix in ProductWIP: introduce a meta-selector to only do direct access on types that support it. - phase out the old Product, remove the WIP_DIRTY stuff. - misc renaming and fixes	2008-05-27 05:47:30 +00:00
Gael Guennebaud	4317fad869	* Added several cast to int of the enums (needed for some compilers) * Fix a mistake in CwiseNullary. * Added a CoreDeclarions header that declares only the forward declarations and related basic stuffs.	2008-05-12 18:09:30 +00:00
Benoit Jacob	678f18fce4	put inline keywords everywhere appropriate. So we don't need anymore to pass -finline-limit=1000 to gcc to get good performance. By the way some cleanup.	2008-05-12 17:34:46 +00:00
Benoit Jacob	3562b01105	* Give Konstantinos a copyright line * Fix compilation of Inverse.h with vectorisation * Introduce EIGEN_GNUC_AT_LEAST(x,y) macro doing future-proof (e.g. gcc v5.0) check * Only use ProductWIP if vectorisation is enabled * rename EIGEN_ALWAYS_INLINE -> EIGEN_INLINE with fall-back to inline keyword * some cleanup/indentation	2008-05-12 08:12:40 +00:00
Gael Guennebaud	46fa4c713f	* Started support for unaligned vectorization. * Introduce a new highly optimized matrix-matrix product for large matrices. The code is still highly experimental and it is activated only if you define EIGEN_WIP_PRODUCT at compile time. Currently the third dimension of the product must be a factor of the packet size (x4 for floats) and the right handed side matrix must be column major. Moreover, currently c = ab; actually computes c += ab !! Therefore, the code is provided for experimentation purpose only ! These limitations will be fixed soon or later to become the default product implementation.	2008-05-05 10:23:29 +00:00
Benoit Jacob	890a8de962	Make products always eval into expressions. Improves performance in benchmark. Still not as fasts as explicit eval(), strangely.	2008-05-02 08:53:23 +00:00
Gael Guennebaud	02f1615d2a	Enable vectorization of product with dynamic matrices, extended cache optimal product to work in any row/column major situations, and a few bugfixes (forgot to add the Cholesky header, vectorization of CwiseBinary)	2008-05-01 13:53:05 +00:00
Gael Guennebaud	1ec2d21ca5	Fixed a couple of issues introduced in previous commits. Added a test for Triangular.	2008-04-26 20:28:27 +00:00
Gael Guennebaud	b4c974d059	Added triangular assignement, e.g.: m.upper() = a+b; only updates the upper triangular part of m. Note that: m = (a+b).upper(); updates all coefficients of m (but half of the additions will be skiped) Updated back/forward substitution to better use Eigen's capability.	2008-04-26 19:20:26 +00:00
Gael Guennebaud	6f2c72fb53	Various fixes in: - vector to vector assign - PartialRedux - Vectorization criteria of Product - returned type of normalized - SSE integer mul	2008-04-25 23:10:37 +00:00
Gael Guennebaud	a451835bce	Make the explicit vectorization much more flexible: - support dynamic sizes - support arbitrary matrix size when the matrix can be seen as a 1D array (except for fixed size matrices where the size in Bytes must be a factor of 16, this is to allow compact storage of a vector of matrices) Note that the explict vectorization is still experimental and far to be completely tested.	2008-04-25 15:46:18 +00:00
Benoit Jacob	6ae037dfb5	give up on OpenMP... for now	2008-04-18 07:57:46 +00:00
Benoit Jacob	ea3ccb1e8c	* Start of the LU module, with matrix inversion already there and fully optimized. * Even if LargeBit is set, only parallelize for large enough objects (controlled by EIGEN_PARALLELIZATION_TRESHOLD).	2008-04-14 08:20:24 +00:00
Benoit Jacob	dcebc46cdc	- cleaner use of OpenMP (no code duplication anymore) using a macro and _Pragma. - use OpenMP also in cacheOptimalProduct and in the vectorized paths as well - kill the vector assignment unroller. implement in operator= the logic for assigning a row-vector in a col-vector. - CMakeLists support for building tests/examples with -fopenmp and/or -msse2 - updates in bench/, especially replace identity() by ones() which prevents underflows from perturbing bench results.	2008-04-11 14:28:42 +00:00
Benoit Jacob	7bee90a62a	Merge Gael's experimental OpenMP parallelization support into Assign.h.	2008-04-11 08:18:47 +00:00
Benoit Jacob	9d8876ce82	* rename XprCopy -> Nested * rename OperatorEquals -> Assign * move Util.h and FwDecl.h to a util/ subdir	2008-04-10 09:01:28 +00:00

1 2

89 Commits