Hauke Heibel
ee27d50633
Fixed template parameter.
2015-02-18 18:51:08 +01:00
Gael Guennebaud
73a24de424
merge
2015-02-18 15:51:00 +01:00
Gael Guennebaud
63eb0f6fe6
Clean a bit computeProductBlockingSizes (use Index type, remove CEIL macro)
2015-02-18 15:49:05 +01:00
Benoit Jacob
4a3e6c8be1
bug #958 - Allow testing specific blocking sizes
...
This is only a debugging/testing patch. It allows testing specific
product blocking sizes, typically to study the impact on performance.
Example usage:
int testk, testm, testn;
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZES
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_K testk
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_M testm
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_N testn
#include <Eigen/Core>
2015-02-18 09:43:55 -05:00
Gael Guennebaud
c7bb1e8ea8
Fix a regression when using OpenMP, and fix bug #714 : the number of threads might be lower than the number of requested ones
2015-02-18 15:19:23 +01:00
Benoit Jacob
2aa09e6b4e
Fix asm comments in 1px1 kernel
2015-03-03 13:44:00 -05:00
Benoit Jacob
eae8e27b7d
Add a benchmark-default-sizes action to benchmark-blocking-sizes.cpp
2015-03-03 11:41:21 -05:00
Marc Glisse
37a93c4263
New scoring functor to select the pivot.
...
This can be useful for non-floating-point scalars, where choosing the biggest element is generally not the best choice.
2015-03-03 17:08:28 +01:00
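The idea in 37a93c4263 of selecting the pivot through a scoring functor can be illustrated with a small standalone sketch. It does not use Eigen: `select_pivot`, `AbsScore`, and `SmallestNonZeroScore` are hypothetical names invented for this illustration, and the "smallest nonzero magnitude" rule is only one plausible alternative score for integer-like scalars, not necessarily what the commit implements.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the entry with the highest score.
// The score functor decides what "best pivot" means for the scalar type.
template <typename Scalar, typename Score>
std::size_t select_pivot(const std::vector<Scalar>& column, Score score) {
  assert(!column.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < column.size(); ++i) {
    if (score(column[i]) > score(column[best])) best = i;
  }
  return best;
}

// Conventional choice for floating-point scalars: the largest magnitude wins.
struct AbsScore {
  double operator()(double x) const { return std::fabs(x); }
};

// Illustrative alternative (an assumption): prefer the smallest nonzero
// magnitude; zero gets the lowest possible score.
struct SmallestNonZeroScore {
  double operator()(long x) const {
    return x == 0 ? 0.0 : 1.0 / std::fabs(static_cast<double>(x));
  }
};
```

With `AbsScore`, the column {1, -5, 3} selects index 1; with `SmallestNonZeroScore`, the column {6, 0, -2, 9} selects index 2.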
Benoit Jacob
ccc1277a42
must also disable complex<double> when disabling double vectorization
2015-03-03 10:17:05 -05:00
Benoit Jacob
f839099512
Work around an ICE in Clang 3.5 in the iOS toolchain with double NEON intrinsics.
2015-03-03 09:35:22 -05:00
Benoit Jacob
1ec0f4fadf
HalfPacket also needed to be disabled for double, on ARMv8.
2015-03-02 16:08:54 -05:00
Gael Guennebaud
9aee1e300a
Increase unit-test L1 cache size to ensure we are doing at least 2 peeled loops within the product kernel.
2015-02-27 22:55:12 +01:00
Gael Guennebaud
b10cd3afd2
Re-enable detection of min/max parentheses protection, and re-enable mpreal_support unit test.
2015-02-27 22:38:00 +01:00
Gael Guennebaud
548b781380
Fix bug #945 : workaround MSVC warning
2015-02-18 12:53:49 +01:00
Gael Guennebaud
6f4adc9e94
Add missing install directives for arch/CUDA
2015-02-18 11:40:06 +01:00
Gael Guennebaud
eb563049f7
Remove some dead stores.
2015-02-18 11:26:48 +01:00
Gael Guennebaud
20cac72b82
Packet must be passed by const reference and not by value to avoid alignment issue.
2015-02-17 22:58:32 +01:00
Gael Guennebaud
159fb181c2
Disable __m128* wrappers when compiling with AVX and -fabi-version=4
2015-02-17 16:27:20 +01:00
Gael Guennebaud
91ab2489dd
Fix compilation with GCC/AVX (workaround __m128 and __m256 being the same type with default ABI)
2015-02-17 16:08:07 +01:00
Gael Guennebaud
8768ff3c31
Add PermutationMatrix::determinant method.
2015-02-16 19:08:25 +01:00
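For context on 8768ff3c31: the determinant of a permutation matrix is the sign of the permutation, +1 or -1. The sketch below computes it by cycle decomposition without using Eigen. `permutation_determinant` is a hypothetical free function, and the convention that `indices[i]` is the image of `i` is an assumption of this sketch; it is not a description of how the Eigen method is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns +1 or -1 for a permutation given as indices[i] = image of i.
// Each cycle of length L contributes (-1)^(L-1), so only even-length
// cycles flip the sign. The input is assumed to be a valid permutation.
int permutation_determinant(const std::vector<std::size_t>& indices) {
  std::vector<bool> visited(indices.size(), false);
  int sign = 1;
  for (std::size_t start = 0; start < indices.size(); ++start) {
    if (visited[start]) continue;
    std::size_t length = 0;
    // Walk the cycle containing `start`, marking each element once.
    for (std::size_t k = start; !visited[k]; k = indices[k]) {
      assert(indices[k] < indices.size());
      visited[k] = true;
      ++length;
    }
    if (length % 2 == 0) sign = -sign;
  }
  return sign;
}
```

The walk visits every element once, so the cost is linear in the size of the permutation.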
Martin Drozdik
64b29e06b9
bug #956 : Fixed bug in move constructors of DenseStorage which caused "moved-from" objects to be in an invalid state.
2015-02-16 18:18:46 +09:00
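The class of bug fixed in 64b29e06b9 can be illustrated with a minimal owning buffer. This is a sketch only: `DynamicStorage` is a hypothetical class written for this illustration, not Eigen's DenseStorage. The point is that a move constructor must reset both the pointer and the size of the source, so that the moved-from object remains a valid, empty object.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer, loosely modelled on a dynamically sized storage.
class DynamicStorage {
 public:
  explicit DynamicStorage(std::size_t size)
      : data_(size ? new double[size]() : nullptr), size_(size) {}
  ~DynamicStorage() { delete[] data_; }

  DynamicStorage(const DynamicStorage&) = delete;
  DynamicStorage& operator=(const DynamicStorage&) = delete;

  // Steal the buffer, then reset the source so that its pointer and size
  // stay consistent (an empty storage) instead of dangling.
  DynamicStorage(DynamicStorage&& other) noexcept
      : data_(other.data_), size_(other.size_) {
    other.data_ = nullptr;
    other.size_ = 0;
  }

  // Swapping keeps both objects valid; the old buffer is released when
  // `other` is destroyed.
  DynamicStorage& operator=(DynamicStorage&& other) noexcept {
    std::swap(data_, other.data_);
    std::swap(size_, other.size_);
    return *this;
  }

  double& operator[](std::size_t i) {
    assert(i < size_);
    return data_[i];
  }
  const double* data() const { return data_; }
  std::size_t size() const { return size_; }

 private:
  double* data_;
  std::size_t size_;
};
```

After `DynamicStorage b(std::move(a));`, `a.size()` is 0 and `a.data()` is null, so destroying or reassigning `a` is safe.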
Gael Guennebaud
98604576d1
Merged in chtz/eigen-indexconversion (pull request PR-92)
...
bug #877 , bug #572 : Get rid of Index conversion warnings, summary of changes:
- Introduce a global typedef Eigen::Index making Eigen::DenseIndex and AnyExpr<>::Index deprecated (default is std::ptrdiff_t).
- Eigen::Index is used throughout the API to represent indices, offsets, and sizes.
- Classes storing an array of indices use the type StorageIndex to store them. This is a template parameter of the class. Default is int.
- Methods that *explicitly* set or return an element of such an array take or return a StorageIndex type. In all other cases, the Index type is used.
2015-02-16 15:29:00 +01:00
Gael Guennebaud
45cbb0bbb1
The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index
2015-02-16 15:05:41 +01:00
Gael Guennebaud
cc641aabb7
Remove deprecated usage of expr::Index.
2015-02-16 14:46:51 +01:00
Gael Guennebaud
aa6c516ec1
Fix many long to int conversion warnings:
...
- fix usage of Index (API) versus StorageIndex (when multiple indexes are stored)
- use StorageIndex(val) when the input has already been checked
- use internal::convert_index<StorageIndex>(val) when val is potentially unsafe (directly comes from user input)
2015-02-16 13:19:05 +01:00
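The distinction drawn in aa6c516ec1 between a plain cast for already-checked values and a checked conversion for user input can be sketched without Eigen. `fits_storage_index` and `checked_index_cast` are hypothetical names for this illustration; the sketch assumes a signed StorageIndex no wider than Index, and it does not reproduce the actual behaviour of internal::convert_index.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Wide API-level index type, as in the Index refactoring (std::ptrdiff_t).
using Index = std::ptrdiff_t;

// True if `value` can be stored in the narrower StorageIndex without loss.
template <typename StorageIndex>
bool fits_storage_index(Index value) {
  static_assert(std::numeric_limits<StorageIndex>::is_signed &&
                    sizeof(StorageIndex) <= sizeof(Index),
                "sketch assumes a signed StorageIndex no wider than Index");
  return value >= static_cast<Index>(std::numeric_limits<StorageIndex>::min()) &&
         value <= static_cast<Index>(std::numeric_limits<StorageIndex>::max());
}

// Checked narrowing for values that may come straight from user input.
template <typename StorageIndex>
StorageIndex checked_index_cast(Index value) {
  assert(fits_storage_index<StorageIndex>(value) &&
         "index does not fit in StorageIndex");
  return static_cast<StorageIndex>(value);
}
```

For example, `fits_storage_index<short>(100000)` is false, while `checked_index_cast<int>(42)` returns 42. Values already known to be in range can use a plain `StorageIndex(val)` cast, as the commit message says.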
Benoit Steiner
e2cfddf75f
Pulled latest updates from trunk
2015-02-13 16:21:59 -08:00
Benoit Steiner
0927801a84
Optimized version of the sin(), exp(), log() and sqrt() function for AVX
2015-02-13 16:07:08 -08:00
Benoit Jacob
e972b55ec4
bug #953 - Fix prefetches in 3px4 product kernel
...
This gives a 10% speedup on Nexus 4 and on Nexus 5.
2015-02-13 14:52:36 -05:00
Gael Guennebaud
fc202bab39
Index refactoring: StorageIndex must be used for storage only (and locally when it makes sense). In all other cases use the global Index type.
2015-02-13 18:57:41 +01:00
Gael Guennebaud
fe51319980
Merge Index-refactoring branch with default, fix PastixSupport, remove some useless typedefs
2015-02-13 10:03:53 +01:00
Gael Guennebaud
0918c51e60
merge Tensor module within Eigen/unsupported and update gemv BLAS wrapper
2015-02-12 21:48:41 +01:00
Gael Guennebaud
409547a0c8
update EIGEN_FAST_MATH documentation
2015-02-12 21:04:31 +01:00
Benoit Steiner
f669f5656a
Marked a few functions as EIGEN_DEVICE_FUNC to enable the use of tensors in cuda kernels.
2015-02-10 14:29:47 -08:00
Gael Guennebaud
029d236ceb
merge
2015-02-10 23:12:47 +01:00
Gael Guennebaud
fe25f3b8e3
FMA has been wrongly disabled
2015-02-10 23:11:35 +01:00
Benoit Steiner
cc5d7ff523
Added vectorized implementation of the exponential function for ARM/NEON
2015-02-10 14:02:38 -08:00
Gael Guennebaud
d4ec48575e
Make Block<SparseMatrix> inherit SparseCompressedBase in the case of inner panels, and fix valuePtr() and innerIndexPtr()
2015-02-09 11:14:36 +01:00
Gael Guennebaud
7838fda82c
Add a SparseCompressedBase class providing (un)compressed accessors (like data()/*Stride() for dense matrices),
...
and a CompressedAccessBit flag (similar to DirectAccessBit for dense matrices).
2015-02-07 22:00:46 +01:00
Benoit Steiner
01f7918788
Pulled latest fixes
2015-02-06 05:30:20 -08:00
Gael Guennebaud
b50ffaddf2
merge
2015-02-06 14:27:12 +01:00
Gael Guennebaud
74e460b995
Fix symmetric product
2015-02-06 14:26:24 +01:00
Benoit Steiner
c739102ef9
Pulled the latest changes from the trunk
2015-02-06 05:25:03 -08:00
Benoit Steiner
dcb2a8b184
Added the EIGEN_HAS_CONSTEXPR define
...
Gate the tensor index list code based on the value of EIGEN_HAS_CONSTEXPR
2015-02-06 02:51:59 -08:00
Benoit Jacob
5ef95fabee
bug #936 , patch 3/3: Properly detect FMA support on ARM (requires VFPv4)
...
and use it instead of MLA when available, because it's both more accurate,
and faster.
2015-01-30 17:45:03 -05:00
Benoit Jacob
0f21613698
bug #936 , patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with EIGEN_HAS_SINGLE_INSTRUCTION_MADD
2015-01-30 17:44:26 -05:00
Benoit Jacob
340b8afb14
bug #936 , patch 1.5/3: rename _FUSED_ macros to _SINGLE_INSTRUCTION_,
...
because this is what they are about. "Fused" means "no intermediate rounding
between the mul and the add, only one rounding at the end". Instead,
what we are concerned about here is whether a temporary register is needed,
i.e. whether the MUL and ADD are separate instructions.
Concretely, on ARM NEON, a single-instruction mul-add is always available: VMLA.
But a true fused mul-add is only available on VFPv4: VFMA.
2015-01-31 14:15:57 -05:00
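The rounding distinction described in 340b8afb14 can be observed in portable C++ with std::fma. In this sketch, `unfused_mul_add` and `fused_mul_add` are hypothetical helper names. The `volatile` temporary is an assumption made here to stop the compiler from contracting the two operations into one fused instruction. The sketch says nothing about which instruction (for example VMLA or VFMA) a given target emits.

```cpp
#include <cassert>  // used by the usage checks described below
#include <cmath>

// Multiply, round to double, then add: two roundings in total.
// The volatile store forces the product to be rounded before the addition.
double unfused_mul_add(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// True fused multiply-add: a * b + c is computed with a single rounding.
double fused_mul_add(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With a = 1 + 2^-30, b = 1 - 2^-30 and c = -1, the exact product is 1 - 2^-60, which rounds to 1 in double precision. The unfused version therefore returns 0, while the fused version returns exactly -2^-60.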
Benoit Jacob
9f99f61e69
bug #936 , patch 1/3: some cleanup and renaming for consistency.
2015-01-30 17:43:56 -05:00
Benoit Jacob
759bd92a85
bug #935 : Add asm comments in GEBP kernels to work around a bug
...
in both GCC and Clang on ARM/NEON, whereby they spill registers,
severely harming performance. The reason why the asm comments
make a difference is that they prevent the compiler from
reordering code across these boundaries, which has the effect
of extending the lifetime of local variables and increasing
register pressure on this register-tight code.
2015-01-30 17:27:56 -05:00
Gael Guennebaud
c6eb84aabc
Enable vectorization of transposeInPlace for PacketSize x PacketSize matrices
2015-01-26 17:09:01 +01:00
Gael Guennebaud
e1f1091fde
Add support for dense ?= diagonal
2015-01-24 10:32:49 +01:00