124 Commits

Author SHA1 Message Date
Gael Guennebaud
5679e439e0 bug #1543: fix linear indexing in generic block evaluation (this completes the fix in commit 12efc7d41b80259b996be5781bf596c249c90d3f)
2018-04-23 14:40:16 +02:00
Gael Guennebaud
12efc7d41b Fix linear indexing in generic block evaluation. 2018-02-09 16:45:49 +01:00
Gael Guennebaud
73629f8b68 Fix gcc7 warning 2018-01-09 08:59:27 +01:00
Gael Guennebaud
9c3aed9d48 Fix packet and alignment propagation logic of Block<Xpr> expressions. In particular, (A+B).col(j) lost vectorisation. 2017-12-14 14:24:33 +01:00
Benoit Steiner
09ae0e6586 Adjusted the EIGEN_DEVICE_FUNC qualifiers to make sure that:
* they're used consistently between the declaration and the definition of a function
* we avoid calling host-only methods from host device methods.
2017-03-01 11:47:47 -08:00
Benoit Steiner
c1d87ec110 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-03-01 10:08:50 -08:00
Gael Guennebaud
296d24be4d bug #1381: fix sparse.diagonal() used as an rvalue.
The problem was that if "sparse" is not const, then sparse.diagonal() must have the
LValueBit flag, meaning that sparse.diagonal().coeff(i) must return a const reference,
const Scalar&. However, sparse::coeff() cannot return a reference for a non-existing
zero coefficient. The trick is to return a reference to a local member of
evaluator<SparseMatrix>.
2017-01-25 17:39:01 +01:00
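The trick described in this commit can be illustrated outside of Eigen. The following is a minimal sketch, not Eigen's implementation: `TinySparseDiagonal` and its members are hypothetical names, and a `std::map` stands in for the sparse storage. It only shows how `coeff()` can return a const reference for an entry that is not stored, by referring to a zero-valued member owned by the object.

```cpp
#include <cstddef>
#include <map>

// Hypothetical stand-in for a sparse container (not an Eigen class): only
// non-zero diagonal entries are stored, keyed by their index.
class TinySparseDiagonal {
public:
    void set(std::size_t i, double value) { m_values[i] = value; }

    // coeff() has to return a const reference even for entries that are not
    // stored. For those it returns a reference to a zero-valued member, which
    // lives as long as the object itself, so the reference never dangles.
    const double& coeff(std::size_t i) const {
        std::map<std::size_t, double>::const_iterator it = m_values.find(i);
        return it != m_values.end() ? it->second : m_zero;
    }

private:
    std::map<std::size_t, double> m_values;
    double m_zero = 0.0;  // shared target for every missing coefficient
};
```

All missing coefficients alias the same member, which is harmless here because the reference is const.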
Gael Guennebaud
11f55b2979 Optimize storage layout of Cwise* and PlainObjectBase evaluator to remove the functor or outer-stride if they are empty.
For instance, sizeof("(A-B).cwiseAbs2()") with A,B Vector4f is now 16 bytes, instead of 48 before this optimization.
In theory, evaluators should be completely optimized away by the compiler, but this might help in some cases.
2016-12-20 15:55:40 +01:00
Gael Guennebaud
5271474b15 Remove common "noncopyable" base class from evaluator_base to get a chance to get EBO (Empty Base Optimization)
Note: we should probably get rid of this class and define a macro instead.
2016-12-20 15:51:30 +01:00
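The size effect that the two evaluator storage commits above rely on can be reproduced with plain C++. This is a sketch under stated assumptions: `EmptyOp`, `EvaluatorWithMember` and `EvaluatorWithBase` are hypothetical types, not Eigen's evaluators. It contrasts storing a stateless functor as a data member with inheriting from it, which lets the compiler apply the empty base optimization.

```cpp
#include <cstddef>

// Hypothetical stateless functor, standing in for a coefficient-wise op.
struct EmptyOp {
    float operator()(float a) const { return a * a; }
};

// Functor stored as a data member: it occupies at least one byte, and
// padding rounds the total size up to the alignment of the pointer.
struct EvaluatorWithMember {
    const float* data;
    EmptyOp op;
};

// Functor used as a base class: the empty base takes no storage, so the
// object is no larger than the pointer it holds.
struct EvaluatorWithBase : private EmptyOp {
    const float* data;

    float coeff(std::size_t i) const {
        // Call the functor through the base-class subobject.
        return static_cast<const EmptyOp&>(*this)(data[i]);
    }
};
```

Comparing `sizeof(EvaluatorWithMember)` with `sizeof(EvaluatorWithBase)` shows the saving; the exact byte counts depend on the platform's pointer size.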
Gael Guennebaud
ca6a2a5248 Fix warning with ICC 2016-10-26 14:13:05 +02:00
Gael Guennebaud
9d6d0dff8f bug #1317: fix performance regression with some Block expressions and clang by helping it to remove dead code.
The trick is to get rid of the nested expression in the evaluator by copying only the required information (here, the strides).
2016-10-01 15:37:00 +02:00
Gael Guennebaud
447f269561 Disable previous workaround. 2016-09-06 15:49:02 +02:00
Gael Guennebaud
b046a3f87d Workaround MSVC instantiation failure of has_*ary_operator at the level of traits<Ref>::match so that the has_*ary_operator are really properly instantiated throughout the compilation unit. 2016-09-06 15:47:04 +02:00
Gael Guennebaud
19a95b3309 Fix shadowing wrt Eigen::Index 2016-09-05 17:19:47 +02:00
Gael Guennebaud
e13071dd13 Workaround a weird msvc 2012 compilation error. 2016-09-05 15:50:41 +02:00
Gael Guennebaud
d123717e21 Fix for msvc 2012 and older 2016-09-05 15:26:56 +02:00
Benoit Steiner
5a6be66cef Turned the Index type used by the nullary wrapper into a template parameter. 2016-09-02 14:10:29 -07:00
Gael Guennebaud
f9f32e9e2d Fix compilation with nvcc 2016-09-01 13:06:14 +02:00
Gael Guennebaud
218c37beb4 bug #1286: automatically detect the available prototypes of functors passed to CwiseNullaryExpr such that functors only have to implement the operators that matter among:
- operator()()
- operator()(i)
- operator()(i,j)
Linear access is also automatically detected based on the availability of operator()(i,j).
2016-08-31 15:45:25 +02:00
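Detecting which call operators a functor provides can be sketched with C++11 expression SFINAE. The code below is an illustration only and is not how Eigen implements its has_*ary_operator helpers: `is_callable_with`, `functor_prototypes`, the three example functors, and the `Index` typedef are all hypothetical names introduced here.

```cpp
#include <type_traits>
#include <utility>

typedef long Index;  // assumption: a signed index type, for illustration

// True if a const F can be called with arguments of types Args...
template <typename F, typename... Args>
class is_callable_with {
    // Chosen only when the call expression is well-formed.
    template <typename G>
    static auto test(int)
        -> decltype((void)std::declval<const G&>()(std::declval<Args>()...),
                    std::true_type());
    // Fallback when the call expression does not compile.
    template <typename>
    static std::false_type test(...);

public:
    static const bool value = decltype(test<F>(0))::value;
};

// Summary of the three prototypes listed in the commit message.
template <typename F>
struct functor_prototypes {
    static const bool has_nullary = is_callable_with<F>::value;
    static const bool has_unary = is_callable_with<F, Index>::value;
    static const bool has_binary = is_callable_with<F, Index, Index>::value;
};

// Example functors, each implementing a single prototype.
struct ConstantFunctor {
    double operator()() const { return 1.0; }
};
struct RampFunctor {
    double operator()(Index i) const { return 0.5 * i; }
};
struct GridFunctor {
    double operator()(Index i, Index j) const { return double(i + 10 * j); }
};
```

An expression evaluator could then branch on these flags at compile time to call whichever operator the functor actually defines.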
Eugene Brevdo
39baff850c Add TernaryFunctors and the betainc SpecialFunction.
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.

Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.

Added unit tests in array.cpp and cxx11_tensor_cuda.cu


Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc().  Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe.  and more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
Gael Guennebaud
27f0434233 Introduce internal's UIntPtr and IntPtr types for pointer to integer conversions.
This fixes "conversion from pointer to same-sized integral type" warnings by ICC.
Ideally, we would use the std::[u]intptr_t types all the time, but since they are C99/C++11 only,
let's be safe.
2016-05-26 10:52:12 +02:00
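A typical place where a pointer is converted to an integer is an alignment check. The sketch below assumes a C++11 compiler and therefore uses `std::uintptr_t` directly, whereas the commit introduces Eigen-internal types to stay portable to older compilers. `UIntPtrSketch` and `is_aligned` are hypothetical names, not Eigen's.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: std::uintptr_t is available (C++11). This typedef only plays
// the role of an unsigned integer type wide enough to hold a pointer.
typedef std::uintptr_t UIntPtrSketch;

// Returns true if `ptr` is aligned on `alignment` bytes.
// `alignment` is assumed to be a power of two.
inline bool is_aligned(const void* ptr, std::size_t alignment) {
    return (reinterpret_cast<UIntPtrSketch>(ptr) & (alignment - 1)) == 0;
}
```

Casting through a pointer-sized unsigned type avoids both truncation and the "same-sized integral type" warning class mentioned in the commit message.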
Gael Guennebaud
78390e4189 Block<> should not disable vectorization based on inner-size, this is the responsibility of the assignment logic. 2016-05-24 17:14:01 +02:00
Gael Guennebaud
6a2916df80 DiagonalWrapper is a vector, so it must expose the LinearAccessBit flag. 2016-05-19 13:06:21 +02:00
Gael Guennebaud
b1bd53aa6b Fix performance regression: with AVX, unaligned stores were emitted instead of aligned ones for fixed size assignment. 2016-05-01 23:25:06 +02:00
Gael Guennebaud
e9bea614ec Fix shortcoming in fixed-value deduction of startRow/startCol 2016-02-29 10:31:27 +01:00
Gael Guennebaud
d142165942 bug #667: declare several critical functions as FORCE_INLINE to make ICC happier.
2016-01-31 16:34:10 +01:00
Gael Guennebaud
df15fbc452 bug #1158: PartialReduxExpr is a vector expression, and it thus must expose the LinearAccessBit flag 2016-01-28 13:16:30 +01:00
Gael Guennebaud
8b9dc9f0df bug #1144: fix regression in x=y+A*x (aliasing), and move evaluator_traits::AssumeAliasing to evaluator_assume_aliasing. 2016-01-09 08:30:38 +01:00
Gael Guennebaud
df6f54ff63 Fix storage order of PartialRedux 2015-12-10 22:24:58 +01:00
Gael Guennebaud
0bb12fa614 Add LU::transpose().solve() and LU::adjoint().solve() API. 2015-12-01 14:38:47 +01:00
Gael Guennebaud
91a7059459 bug #1009, part 1/2: make sure vector expressions expose LinearAccessBit flag. 2015-11-27 10:06:07 +01:00
Gael Guennebaud
77ff3386b7 Refactoring of the cost model:
- Dynamic is now an invalid value
- introduce a HugeCost constant to be used for runtime-cost values or arbitrarily huge costs
- add sanity checks for cost values: must be >=0 and not too large
This change provides several benefits:
- it fixes shortcomings in some cost computations where the Dynamic case was not properly handled.
- it simplifies the cost computation logic, and should avoid similar shortcomings in the future.
- it makes it possible to distinguish between different levels of dynamic/huge/infinite cost
- it should enable further simplifications in the computation of costs (saving compilation time)
2015-10-28 11:42:14 +01:00
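The idea behind this refactoring can be sketched as a compile-time cost with range checks. Eigen's Dynamic constant is -1, so letting it leak into cost arithmetic produces meaningless sums; a large positive sentinel avoids that. The names and values below (`kHugeCost`, `checked_cost`, `binary_cost`, the upper bound) are hypothetical and chosen for illustration only.

```cpp
// Hypothetical sentinel meaning "only known at runtime, or very expensive".
const int kHugeCost = 10000;

// Wraps a cost value and rejects invalid ones at compile time.
template <int Cost>
struct checked_cost {
    static_assert(Cost >= 0, "cost must be non-negative");
    static_assert(Cost <= kHugeCost * kHugeCost, "cost is too large");
    static const int value = Cost;
};

// Cost of a coefficient-wise binary expression: reading both operands
// plus applying the operation once.
template <int LhsCost, int RhsCost, int OpCost>
struct binary_cost {
    static const int value = checked_cost<LhsCost + RhsCost + OpCost>::value;
};
```

Because the sentinel is positive, a sum involving it stays large and ordered, so comparisons such as "is this cheaper than evaluating into a temporary?" keep working without a special case for the dynamic-size situation.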
Gael Guennebaud
8c66b6bc61 Simplify evaluator::Flags for Map<> 2015-10-27 11:06:42 +01:00
Gael Guennebaud
dd934ad057 Re-enable vectorization of LinSpaced, plus some cleaning 2015-10-08 17:27:01 +02:00
Gael Guennebaud
f6f6f50272 Clean evaluator<EvalToTemp> 2015-10-08 16:34:33 +02:00
Gael Guennebaud
aa6b1aebf3 Properly implement PartialReduxExpr on top of evaluators, and fix multiple evaluation of nested expression 2015-10-08 15:57:05 +02:00
Gael Guennebaud
5cc7251188 Some cleaning in evaluators 2015-10-08 15:22:04 +02:00
Gael Guennebaud
941a99ac1a Add a few missing EIGEN_DEVICE_FUNC declarations 2015-09-03 14:14:54 +02:00
Gael Guennebaud
aa768add0b Since there is no reason for evaluators to be nested by reference, let's remove the evaluator<>::nestedType indirection. 2015-09-02 22:10:39 +02:00
Gael Guennebaud
f8976fdbe0 Make evaluators non-copyable. This guarantees that evaluators storing temporaries do not introduce unwanted copy overhead. 2015-09-02 21:39:49 +02:00
Gael Guennebaud
92b9f0e102 Cleaning pass on evaluators: remove the useless and error-prone evaluator<>::type indirection. 2015-09-02 21:38:40 +02:00
Gael Guennebaud
65bfa5fce7 Allow the use of arbitrary packet types during evaluation.
This is implemented by adding a PacketType template parameter to packet and writePacket members of evaluator<>.
2015-08-07 12:01:39 +02:00
Gael Guennebaud
ce57dbd937 Let unpacket_traits<> expose the required alignment and make use of it everywhere 2015-08-07 10:44:01 +02:00
Gael Guennebaud
1f5024332e First part of a big refactoring of alignment control to enable the handling of arbitrarily aligned buffers. It includes:
- AlignedBit flag is deprecated. Alignment is now specified by the evaluator through the 'Alignment' enum, e.g., evaluator<Xpr>::Alignment. Its value is in bytes.
- Add several enums to specify alignment: Aligned8, Aligned16, Aligned32, Aligned64, Aligned128. AlignedMax corresponds to EIGEN_MAX_ALIGN_BYTES. Such enums are used to define the above Alignment value, and as the 'Options' template parameter of Map<> and Ref<>.
- The Aligned enum is now deprecated. It is now an alias for Aligned16.
- Currently, traits<Matrix<>>, traits<Array<>>, traits<Ref<>>, traits<Map<>>, and traits<Block<>> also expose the Alignment enum.
2015-08-06 15:31:07 +02:00
Gael Guennebaud
175ed636ea bug #973: update macro-level control of alignment by introducing user-controllable EIGEN_MAX_ALIGN_BYTES and EIGEN_MAX_STATIC_ALIGN_BYTES macros. This changeset also removes EIGEN_ALIGN (replaced by EIGEN_MAX_ALIGN_BYTES>0), EIGEN_ALIGN_STATICALLY (replaced by EIGEN_MAX_STATIC_ALIGN_BYTES>0), EIGEN_USER_ALIGN*, EIGEN_ALIGN_DEFAULT (replaced by EIGEN_ALIGN_MAX). 2015-07-29 10:22:25 +02:00
Gael Guennebaud
3f6aa4cd5d Remove useless specializations of evaluator_traits 2015-06-19 14:18:29 +02:00
Gael Guennebaud
cbe3a1a83e Add missing accessors for 1D index based access to Replicate<> expressions. 2015-06-08 15:39:09 +02:00
Benoit Jacob
1dd3d89818 Fix an unused-var warning 2015-03-15 18:07:19 -04:00
Gael Guennebaud
1330f8bbd1 bug #973, improve AVX support by enabling vectorization of Vector4i-like types, and enforcing alignment of Vector4f/Vector2d-like types to preserve compatibility with SSE and future Eigen versions that will vectorize them with AVX enabled. 2015-03-13 21:15:50 +01:00
Gael Guennebaud
fe51319980 Merge Index-refactoring branch with default, fix PastixSupport, remove some useless typedefs 2015-02-13 10:03:53 +01:00