Benoit Steiner
16047c8d4a
Pulled in the latest changes from the Eigen trunk
2014-08-13 22:25:29 -07:00
Gael Guennebaud
b47ef1431f
Fix many long to int implicit conversions
2014-07-08 16:47:11 +02:00
Chen-Pang He
b9ee880f07
chmod -x Eigen/src/Core/GenericPacketMath.h
2014-07-07 21:28:00 +08:00
Roger Martin
eb49100de9
Add component-wise atan() function (see bug #80 ).
2014-06-19 14:55:14 +01:00
Benoit Steiner
29aebf96e6
Created the pblend packet primitive and implemented it using SSE and AVX instructions.
2014-06-06 20:18:44 -07:00
Gael Guennebaud
3d8d0f6269
Enable vectorization of pack_rhs with a column-major RHS.
...
Rename and generalize Kernel<*> to PacketBlock<*,N>.
2014-04-25 10:56:18 +02:00
Gael Guennebaud
d5a795f673
New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speeup on Haswell.
...
This changeset also introduce new vector functions: ploadquad and predux4.
2014-04-16 17:05:11 +02:00
Gael Guennebaud
10aa14592a
Add a mechanism to recursively access to half-size packet types
2014-03-28 10:18:04 +01:00
Benoit Steiner
8a94cb3edd
Implemented the SSE version of the gather and scatter packet primitives.
2014-03-27 18:29:01 -07:00
Benoit Steiner
ee86679096
Introduced pscatter/pgather packet primitives. They will be used to optimize the loop peeling code of the block-panel matrix multiplication kernel.
2014-03-27 16:03:03 -07:00
Benoit Steiner
a419cea4a0
Created the ptranspose packet primitive that can transpose an array of N packets, where N is the number of words in each packet. This primitive will be used to complete the vectorization of the gemm_pack_lhs and gemm_pack_rhs functions.
...
Implemented the primitive using SSE instructions.
2014-03-26 19:03:07 -07:00
Gael Guennebaud
b286a1e75c
add pbroadcast2/4 generic intrinsics
2014-03-26 16:46:36 +01:00
Gael Guennebaud
01fd880424
Revert previous change and introduce a new workaround regarding gcc generating a shufps instruction instead of the more efficient pshufd instruction.
...
The trick consists in introducing a new pload1 function to be used in low level product kernels for which bug #203 does not apply.
Indeed, it turned out that using inline assembly prevents gcc of doing a good job at instructtion reordering.
2014-03-20 16:03:46 +01:00
Gael Guennebaud
2f593ee67c
merge with main branch
2013-07-17 13:21:35 +02:00
Gael Guennebaud
155fa0ca83
Add missing namespace prefix in pconj
2013-07-03 11:36:12 +02:00
Gael Guennebaud
64054ee396
Add nvcc support for normalize, initializers, and fuzzy comparisons
2013-06-05 15:38:33 +02:00
Gael Guennebaud
9cd2d14005
merge with default branch
2013-04-19 11:21:39 +02:00
Gael Guennebaud
9a4caf2b0f
Extend internal doc of ploaddup and palign
2013-04-17 09:17:34 +02:00
Gael Guennebaud
12439e1249
Port SelfCwiseBinaryOp and Dot.h to nvcc, fix portability issue with std::min/max
2013-04-05 16:35:49 +02:00
Gael Guennebaud
a76fbbf397
Fix bug #314 :
...
- remove most of the metaprogramming kung fu in MathFunctions.h (only keep functions that differs from the std)
- remove the overloads for array expression that were in the std namespace
2012-11-06 15:25:50 +01:00
Benoit Jacob
69124cfca2
Automatic relicensing to MPL2 using Keirs script. Manual fixup follows.
2012-07-13 14:42:47 -04:00
Jitse Niesen
3c412183b2
Get rid of include directives inside namespace blocks (bug #339 ).
2012-04-15 11:06:28 +01:00
Gael Guennebaud
9c86ee2695
fix static inline versus inline static issues (the former is the correct order)
2012-01-31 12:58:52 +01:00
Gael Guennebaud
22bff949c8
protect calls to min and max with parentheses to make Eigen compatible with default windows.h
...
(transplanted from 49b6e9143e1d74441924c0c313536e263e12a55c
)
2011-07-21 11:19:36 +02:00
Gael Guennebaud
87ac09daa8
Simplify the use of custom scalar types, the rule is to never directly call a standard math function using std:: but rather put a using std::foo before and simply call foo:
...
using std::max;
max(a,b);
2011-05-25 08:41:45 +02:00
Jitse Niesen
a96c849c20
Document enums in Constants.h (bug #248 ).
...
To get the links to work, I also had to document the Eigen namespace.
Unfortunately, this means that the word Eigen is linked whenever it appears
in the docs.
2011-05-03 17:08:14 +01:00
Gael Guennebaud
c8e1b679fa
re-enable fast pset1-pstore by introducing a new higher level pstore1 function
2011-03-02 10:55:44 +01:00
Gael Guennebaud
6e01780541
fix a couple of issues with pcplxflip
2011-02-23 17:51:40 +03:00
Gael Guennebaud
aea630a98a
factorize implementation of standard real unary math functions, and add acos, asin
2011-02-17 17:37:11 +01:00
Jason Newton
d028262e06
add tan function in Array world
2011-02-03 14:34:40 +01:00
Benoit Jacob
4716040703
bug #86 : use internal:: namespace instead of ei_ prefix
2010-10-25 10:15:22 -04:00
Gael Guennebaud
cd0e5dca9b
wip: extend the gebp kernel to optimize complex and mixed products
2010-07-19 08:50:59 +02:00
Gael Guennebaud
ff96c94043
mixing types in product step 2:
...
* pload* and pset1 are now templated on the packet type
* gemv routines are now embeded into a structure with
a consistent API with respect to gemm
* some configurations of vector * matrix and matrix * matrix works fine,
some need more work...
2010-07-11 15:48:30 +02:00
Gael Guennebaud
300a226ffa
scalars fitting in a single packet requires more work, step 1
...
* add a, Alignable trait
* update LinearVectorization assignment
2010-07-08 14:27:47 +02:00
Gael Guennebaud
b0896382a3
s/IsVectorized/Vectorizable
2010-07-07 11:10:46 +02:00
Gael Guennebaud
bfa606d16f
* add a IsVectorized mechanism (instead of packet-size>1...)
...
* vectorize complex<double>
2010-07-06 23:36:00 +02:00
Gael Guennebaud
c69a226192
* extend the Has* packet traits and makes all functor use it
...
* extend the packing routines to support conjugation
2010-07-05 23:27:54 +02:00
Gael Guennebaud
28e64b0da3
email change
2010-06-24 23:21:58 +02:00
Benoit Jacob
f0a6d56f07
fix linking errors with multiply defined functions
2010-06-18 09:01:34 -04:00
Benoit Jacob
134ca4acb3
packet math functions:
...
- take const Packet& args like the other packet funcs
- SSE specializations: make them be actual template specializations
2010-06-15 08:29:21 -04:00
Konstantinos Margaritis
9337f371d2
(proper commit this time)
...
replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function.
Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h.
Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch().
NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
2010-04-24 00:58:44 +03:00
Konstantinos Margaritis
5acf46bd12
Backed out changeset 6972c140f737874d88da0e225c7c27b4563a4518
2010-04-24 00:57:10 +03:00
oem
6972c140f7
replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function.
...
Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h.
Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch().
NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
2010-04-24 00:44:14 +03:00
Hauke Heibel
4365a48748
Added an ei_linspaced_op to create linearly spaced vectors.
...
Added setLinSpaced/LinSpaced functionality to DenseBase.
Improved vectorized assignment - overcomes MSVC optimization issues.
CwiseNullaryOp is now requiring functors to offer 1D and 2D operators.
Adapted existing functors to the new CwiseNullaryOp requirements.
Added ei_plset to create packages as [a, a+1, ..., a+size].
Added more nullaray unit tests.
2010-01-26 19:42:17 +01:00
Gael Guennebaud
eb8f450071
Hey, finally the copyCoeff stuff is not only used to implement swap anymore :)
...
Add an internal pseudo expression allowing to optimize operators like +=, *= using
the copyCoeff stuff.
This allows to easily enforce aligned load for the destination matrix everywhere.
2009-11-20 15:39:38 +01:00
Gael Guennebaud
1a1b2e9f27
finally directly calling the low-level products is faster
2009-07-10 10:41:26 +02:00
Benoit Jacob
6347b1db5b
remove sentence "Eigen itself is part of the KDE project."
...
it never made very precise sense. but now does it still make any?
2009-05-22 20:25:33 +02:00
Gael Guennebaud
adf5104bae
oops, bad copy paste in ei_psqrt, thanks to Jitse Niesen
2009-04-02 06:29:05 +00:00
Gael Guennebaud
e9f6167485
make special ei_p functions static to avoid linking issues (they are too
...
complex to be inlined)
2009-03-27 14:55:46 +00:00
Gael Guennebaud
49fc1e3e84
add vectorization of sqrt for float
2009-03-27 14:41:46 +00:00