Till Hoffmann
dd5d390daf
Added zeta function.
2016-04-01 13:32:29 +01:00
Eugene Brevdo
73220d2bb0
Resolve bad merge.
2016-03-08 17:28:21 -08:00
Benoit Steiner
f535378995
Added support for vectorized type casting of int to char.
2016-02-03 18:58:29 -08:00
Benoit Steiner
d3f533b395
Fixed compilation warning
2016-01-28 20:09:45 -08:00
Eugene Brevdo
f7362772e3
Add digamma for CPU + CUDA. Includes tests.
2015-12-24 21:15:38 -08:00
Benoit Steiner
e535450573
Cleanup
2015-12-08 14:06:39 -08:00
Benoit Steiner
73b68d4370
Fixed a couple of typos
Cleaned up the code a bit.
2015-12-07 16:38:48 -08:00
Eugene Brevdo
fa4f933c0f
Add special functions to Eigen: lgamma, erf, erfc.
Includes CUDA support and unit tests.
2015-12-07 15:24:49 -08:00
Mark Borgerding
7ddcf97da7
added scalar_sign_op (both real,complex)
2015-11-24 17:15:07 -05:00
Gael Guennebaud
6245591349
Fix prototype of plset and generalize linspace functor.
2015-08-07 19:27:59 +02:00
Gael Guennebaud
ce57dbd937
Let unpacket_traits<> expose the required alignment and make use of it everywhere
2015-08-07 10:44:01 +02:00
Gael Guennebaud
1f5024332e
First part of a big refactoring of alignment control to enable the handling of arbitrarily aligned buffers. It includes:
- AlignedBit flag is deprecated. Alignment is now specified by the evaluator through the 'Alignment' enum, e.g., evaluator<Xpr>::Alignment. Its value is in Bytes.
- Add several enums to specify alignment: Aligned8, Aligned16, Aligned32, Aligned64, Aligned128. AlignedMax corresponds to EIGEN_MAX_ALIGN_BYTES. Such enums are used to define the above Alignment value, and as the 'Options' template parameter of Map<> and Ref<>.
- The Aligned enum is now deprecated. It is now an alias for Aligned16.
- Currently, traits<Matrix<>>, traits<Array<>>, traits<Ref<>>, traits<Map<>>, and traits<Block<>> also expose the Alignment enum.
2015-08-06 15:31:07 +02:00
Benoit Steiner
513e357b48
Added support for prefetching on cuda devices
2015-07-17 15:35:16 -07:00
Gael Guennebaud
25a98be948
bug #80: merge with d_hood branch on adding more coefficient-wise unary array functors
2015-06-10 15:52:05 +02:00
Deanna Hood
f52b78491c
Remove packet isNaN, isInf, isFinite
2015-03-17 09:26:24 +10:00
Deanna Hood
fef4e071d7
Rename isinf to isInf
2015-03-17 05:58:47 +10:00
Deanna Hood
46cf9cda32
Add isfinite array support as isFinite
2015-03-17 04:33:12 +10:00
Deanna Hood
fb68b149cb
Rename isnan to isNaN
2015-03-17 02:04:42 +10:00
Deanna Hood
f89fcefa79
Add hyperbolic trigonometric functions from std array support
2015-03-11 13:13:30 +10:00
Deanna Hood
a5e49976f5
Add log10 array support
2015-03-11 08:56:42 +10:00
Deanna Hood
31fdd67756
Additional unary coeff-wise functors (e.g., isnan, round, arg)
2015-03-11 06:39:23 +10:00
Benoit Steiner
fb53384b0f
Improved the default implementation of prsqrt
2015-02-28 01:51:26 -08:00
Benoit Steiner
306fceccbe
Pulled latest updates from trunk
2015-02-27 13:05:26 -08:00
Benoit Jacob
6466fa63be
Reimplement the selection between rotating and non-rotating kernels
using templates instead of macros and if()'s.
That was needed to fix the build of unit tests on ARM, which I had
broken. My bad for not testing earlier.
2015-02-27 15:30:10 -05:00
Benoit Steiner
573b377110
Added support for vectorized type casting of tensors
2015-02-27 08:46:04 -08:00
Benoit Jacob
b7fc8746e0
Replace a static assert by a runtime one, fixes the build of unit tests on ARM
Also safely assert in the non-implemented path that should never be taken in practice,
and would return wrong results.
2015-02-27 10:01:59 -05:00
Benoit Steiner
f41b1f1666
Added support for fast reciprocal square root computation.
2015-02-26 09:42:41 -08:00
Benoit Jacob
9bd8a4bab5
bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path
This is substantially faster on ARM, where it's important to minimize the number of loads.
This is specific to the case where all packet types are of size 4. I made my best attempt to minimize how dirty this is... opinions welcome.
Eventually one could have a generic rotated kernel, but it would take some work to get there. Also, on sandy bridge, in my experience, it's not beneficial (even about 1% slower).
2015-02-18 15:03:35 -05:00
Gael Guennebaud
45cbb0bbb1
The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index
2015-02-16 15:05:41 +01:00
Benoit Steiner
c739102ef9
Pulled the latest changes from the trunk
2015-02-06 05:25:03 -08:00
Gael Guennebaud
ee06f78679
Introduce unified macros to identify compiler, OS, and architecture. They are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively.
2014-11-04 21:58:52 +01:00
Gael Guennebaud
fe57b2f963
bug #701: work around (min) and (max) blocking ADL by introducing numext::mini and numext::maxi internal functions and an EIGEN_NOT_A_MACRO macro.
2014-10-20 15:55:32 +02:00
Benoit Steiner
99d75235a9
Misc improvements and cleanups
2014-10-13 17:02:09 -07:00
Benoit Steiner
5cc23199be
More tests to validate the const-correctness of the tensor code.
2014-10-02 10:30:44 -07:00
Benoit Steiner
16047c8d4a
Pulled in the latest changes from the Eigen trunk
2014-08-13 22:25:29 -07:00
Gael Guennebaud
b47ef1431f
Fix many long to int implicit conversions
2014-07-08 16:47:11 +02:00
Chen-Pang He
b9ee880f07
chmod -x Eigen/src/Core/GenericPacketMath.h
2014-07-07 21:28:00 +08:00
Roger Martin
eb49100de9
Add component-wise atan() function (see bug #80).
2014-06-19 14:55:14 +01:00
Benoit Steiner
29aebf96e6
Created the pblend packet primitive and implemented it using SSE and AVX instructions.
2014-06-06 20:18:44 -07:00
Gael Guennebaud
3d8d0f6269
Enable vectorization of pack_rhs with a column-major RHS.
Rename and generalize Kernel<*> to PacketBlock<*,N>.
2014-04-25 10:56:18 +02:00
Gael Guennebaud
d5a795f673
New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speedup on Haswell.
This changeset also introduces new vector functions: ploadquad and predux4.
2014-04-16 17:05:11 +02:00
Gael Guennebaud
10aa14592a
Add a mechanism to recursively access half-size packet types
2014-03-28 10:18:04 +01:00
Benoit Steiner
8a94cb3edd
Implemented the SSE version of the gather and scatter packet primitives.
2014-03-27 18:29:01 -07:00
Benoit Steiner
ee86679096
Introduced pscatter/pgather packet primitives. They will be used to optimize the loop peeling code of the block-panel matrix multiplication kernel.
2014-03-27 16:03:03 -07:00
Benoit Steiner
a419cea4a0
Created the ptranspose packet primitive that can transpose an array of N packets, where N is the number of words in each packet. This primitive will be used to complete the vectorization of the gemm_pack_lhs and gemm_pack_rhs functions.
Implemented the primitive using SSE instructions.
2014-03-26 19:03:07 -07:00
Gael Guennebaud
b286a1e75c
add pbroadcast2/4 generic intrinsics
2014-03-26 16:46:36 +01:00
Gael Guennebaud
01fd880424
Revert previous change and introduce a new workaround regarding gcc generating a shufps instruction instead of the more efficient pshufd instruction.
The trick consists in introducing a new pload1 function to be used in low-level product kernels for which bug #203 does not apply.
Indeed, it turned out that using inline assembly prevents gcc from doing a good job at instruction reordering.
2014-03-20 16:03:46 +01:00
Gael Guennebaud
2f593ee67c
merge with main branch
2013-07-17 13:21:35 +02:00
Gael Guennebaud
155fa0ca83
Add missing namespace prefix in pconj
2013-07-03 11:36:12 +02:00
Gael Guennebaud
64054ee396
Add nvcc support for normalize, initializers, and fuzzy comparisons
2013-06-05 15:38:33 +02:00