eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-21 04:14:26 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	440664cd5d	temporary fix of the previous commit	2008-08-24 15:27:05 +00:00
Gael Guennebaud	ba100998bf	* split Meta.h to Meta.h (generic meta programming) and XprHelper.h (relates to eigen mechanism) * added a meta.cpp unit test * EIGEN_TUNE_FOR_L2_CACHE_SIZE now represents L2 block size in Bytes (whence the ei_meta_sqrt...) * added a CustomizeEigen.dox page * added a TOC to QuickStartGuide.dox	2008-08-24 15:15:32 +00:00
Gael Guennebaud	f2f48b6560	* remove LargeBit and related stuff * replaced the Flags template parameter of Matrix by StorageOrder and move it back to the 4th position such that we don't have to worry about the two Max* template parameters * extended EIGEN_USING_MATRIX_TYPEDEFS with the ei_* math functions	2008-08-23 17:11:44 +00:00
Gael Guennebaud	a6d387a359	Various compilation fixes for MSVC 9. All tests compile but some still fail at runtime in ei_aligned_free() (even without vectorization).	2008-08-19 11:06:40 +00:00
Gael Guennebaud	4fa40367e9	* Big change in Block and Map: - added a MapBase base xpr on top of which Map and the specialization of Block are implemented - MapBase forces both aligned loads (and aligned stores, see below) in expressions such as "x.block(...) += other_expr" * Significant vectorization improvement: - added an AlignedBit flag meaning the first coeff/packet is aligned, this allows to not generate extra code to deal with the first unaligned part - removed all unaligned stores when no unrolling - removed unaligned loads in Sum when the input has the DirectAccessBit flag * Some code simplification in CacheFriendly product * Some minor documentation improvements	2008-08-09 18:41:24 +00:00
Gael Guennebaud	842c4f8bfa	Several compilation fixes for MSVC and NVCC, basically: - added explicit enum to int conversion where needed - if a function is not defined as declared and the return type is "tricky" then the type must be typedefined somewhere. A "tricky return type" can be: * a template class with a default parameter which depends on another template parameter * a nested template class, or type of a nested template class	2008-07-29 16:33:07 +00:00
Benoit Jacob	f5791eeb70	the big Array/Cwise rework as discussed on the mailing list. The new API can be seen in Eigen/src/Core/Cwise.h.	2008-07-08 00:49:10 +00:00
Gael Guennebaud	027818d739	* added innerSize / outerSize functions to MatrixBase * added complete implementation of sparse matrix product (with a little glue in Eigen/Core) * added an exhaustive bench of sparse products including GMM++ and MTL4 => Eigen outperforms in all transposed/density configurations !	2008-06-28 23:07:14 +00:00
Benoit Jacob	e27b2b95cf	* rework Map, allow vectorization * rework PacketMath and DummyPacketMath, make these actual template specializations instead of just overriding by non-template inline functions * introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix * remove Matrix::map() methods, use Map constructors instead.	2008-06-27 01:22:35 +00:00
Gael Guennebaud	e5d301dc96	various work on the Sparse module: * added some glue to Eigen/Core (SparseBit, ei_eval, Matrix) * add two new sparse matrix types: HashMatrix: based on std::map (for random writes) LinkedVectorMatrix: array of linked vectors (for outer coherent writes, e.g. to transpose a matrix) * add a SparseSetter class to easily set/update any kind of matrices, e.g.: { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix); for (...) wrapper->coeffRef(rand(),rand()) = rand(); } * automatic shallow copy for RValue * and a lot of mess ! plus: * remove the remaining ArrayBit related stuff * don't use alloca in product for very large memory allocation	2008-06-26 23:22:26 +00:00
Benoit Jacob	25ba9f377c	* add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned) * introduce packet(int), make use of it in linear vectorized paths --> completely fixes the slowdown noticed in benchVecAdd. * generalize coeff(int) to linear-access xprs * clarify the access flag bits * rework api dox in Coeffs.h and util/Constants.h * improve certain expressions's flags, allowing more vectorization * fix bug in Block: start(int) and end(int) returned dyndyn size fix bug in Block: just because the Eval type has packet access doesn't imply the block xpr should have it too.	2008-06-26 16:06:41 +00:00
Gael Guennebaud	fb4a151982	* more cleaning in Product * make Matrix2f (and similar) vectorized using linear path * fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4 (cannot get Transform to compile with gcc 3.3/3.4, see the FIXME)	2008-06-19 23:00:51 +00:00
Gael Guennebaud	82c3cea1d5	* refactoring of Product: * use ProductReturnType<>::Type to get the correct Product xpr type * Product is no longer instantiated for xpr types which are evaluated * vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix) * some cleaning * removed ArrayBase	2008-06-19 17:33:57 +00:00
Gael Guennebaud	5dbfed1902	fix two bugs discovered by the previous commit.	2008-06-16 16:39:58 +00:00
Benoit Jacob	bb1f4e44f1	* Block: row and column expressions in the inner direction now have the Like1D flag. * Big renaming: packetCoeff ---> packet VectorizableBit ---> PacketAccessBit Like1DArrayBit ---> LinearAccessBit	2008-06-16 14:54:31 +00:00
Benoit Jacob	c905b31b42	* Big rework of Assign.h: Much better organization Fix a few bugs Add the ability to unroll only the inner loop Add an unrolled path to the Like1D vectorization. Not well tested. ** Add placeholder for sliced vectorization. Unimplemented. * Rework of corrected_flags: improve rules determining vectorizability for vectors, the storage-order is indifferent, so we tweak it to allow vectorization of row-vectors. * fix compilation in benchmark, and a warning in Transpose.	2008-06-16 10:49:44 +00:00
Benoit Jacob	c90c77051f	* make the _Flags template parameter of Matrix default to the corrected flags. This ensures that unless explicitly messed up otherwise, a Matrix type is equal to its own Eval type. This seriously reduces the number of types instantiated. Measured +13% compile speed, -7% binary size. * Improve doc of Matrix template parameters.	2008-06-13 07:53:45 +00:00
Gael Guennebaud	48262b9734	added a static assertion mechanism (see notes in Core/util/StaticAssert.h for details)	2008-06-04 11:16:11 +00:00
Gael Guennebaud	fcf4457b78	added optimized matrix times diagonal matrix product via Diagonal flag shortcut.	2008-05-31 21:35:11 +00:00
Gael Guennebaud	e2ac5d244e	Added ArrayBit to get the ability to manipulate a Matrix like a simple scalar. In particular this flag changes the behavior of operator* to a coeff wise product.	2008-05-29 22:33:07 +00:00
Benoit Jacob	486fdb26a1	many small fixes and documentation improvements, this should be alpha5.	2008-05-29 03:12:30 +00:00
Benoit Jacob	f54760c889	hehe, the complicated nesting scheme in Flagged in the previous commit was a sign that we were doing something wrong. In fact, having NestByValue as a special case of Flagged was wrong, and the previous commit, while not buggy, was inefficient because then when the resulting NestByValue xpr was nested -- hence copied -- the original xpr which was already nested by value was copied again; hence instead of 1 copy we got 3 copies. The solution was to resuscitate the old Temporary.h (renamed NestByValue.h) as it was the right approach.	2008-05-28 05:14:16 +00:00
Benoit Jacob	953efdbfe7	- introduce Part and Extract classes, splitting and extending the former Triangular class - full meta-unrolling in Part - move inverseProduct() to MatrixBase - compilation fix in ProductWIP: introduce a meta-selector to only do direct access on types that support it. - phase out the old Product, remove the WIP_DIRTY stuff. - misc renaming and fixes	2008-05-27 05:47:30 +00:00
Gael Guennebaud	94e1629a1b	* improved product performance: - fallback to normal product for small dynamic matrices - overloaded "c += (a * b).lazy()" to avoid the expensive and useless temporary and setZero() in such very common cases. * fix a couple of issues with the flags	2008-05-22 14:51:25 +00:00
Gael Guennebaud	c6789a279c	Fix compilation issues with MSVC and NVCC. Added a few typedefs of complex return types in MatrixBase (needed by MSVC)	2008-05-15 09:40:11 +00:00
Benoit Jacob	5da60897ab	Introduce generic Flagged xpr, remove already Lazy.h and Temporary.h Rename DefaultLostFlagMask --> HereditaryBits	2008-05-14 08:20:15 +00:00
Gael Guennebaud	4317fad869	* Added several casts to int of the enums (needed for some compilers) * Fix a mistake in CwiseNullary. * Added a CoreDeclarations header that contains only the forward declarations and related basic stuff.	2008-05-12 18:09:30 +00:00
Gael Guennebaud	64c49de7ba	* split PacketMath.h to SSE and Altivec specific files * improved the flexibility of the new product implementation, now all sizes seem to be properly handled.	2008-05-05 17:19:47 +00:00
Gael Guennebaud	a451835bce	Make the explicit vectorization much more flexible: - support dynamic sizes - support arbitrary matrix size when the matrix can be seen as a 1D array (except for fixed size matrices where the size in Bytes must be a factor of 16, this is to allow compact storage of a vector of matrices) Note that the explicit vectorization is still experimental and far from being completely tested.	2008-04-25 15:46:18 +00:00
Gael Guennebaud	9385793f71	Fix a couple of issues with the vectorization. In particular, default ei_p* functions are provided to handle unsupported types seamlessly. Added a generic null-ary expression with null-ary functors. They replace Zero, Ones, Identity and Random.	2008-04-24 18:35:39 +00:00
Benoit Jacob	acfd6f3bda	- add _packetCoeff() to Inverse, allowing vectorization. - let Inverse take template parameter MatrixType instead of ExpressionType, in order to reduce executable code size when taking inverses of xpr's. - introduce ei_corrected_matrix_flags : the flags template parameter to the Matrix class is only a suggestion. This is also useful in ei_eval.	2008-04-16 07:18:27 +00:00
Benoit Jacob	9789c04467	when evaluating an xpr, the result can now be vectorizable even if the xpr itself wasn't vectorizable.	2008-04-14 08:55:12 +00:00
Benoit Jacob	ea3ccb1e8c	* Start of the LU module, with matrix inversion already there and fully optimized. * Even if LargeBit is set, only parallelize for large enough objects (controlled by EIGEN_PARALLELIZATION_TRESHOLD).	2008-04-14 08:20:24 +00:00
Benoit Jacob	ca448d2537	split those files in util/ some more renaming	2008-04-10 09:41:13 +00:00

34 Commits