Gael Guennebaud
0fe6b7d687
Make nestByValue works again (broken since 3.3) and add unit tests.
2019-01-17 18:27:25 +01:00
Gael Guennebaud
4b7cf7ff82
Extend reshaped unit tests and remove useless const_cast
2019-01-17 17:35:32 +01:00
Gael Guennebaud
b57c9787b1
Cleanup useless const_cast and add missing broadcast assignment tests
2019-01-17 16:55:42 +01:00
Patrick Peltzer
15e53d5d93
PR 567: makes all dense solvers inherit SoverBase (LU,Cholesky,QR,SVD).
...
This changeset also includes:
* add HouseholderSequence::conjugateIf
* define int as the StorageIndex type for all dense solvers
* dedicated unit tests, including assertion checking
* _check_solve_assertion(): this method can be implemented in derived solver classes to implement custom checks
* CompleteOrthogonalDecompositions: add applyZOnTheLeftInPlace, fix scalar type in applyZAdjointOnTheLeftInPlace(), add missing assertions
* Cholesky: add missing assertions
* FullPivHouseholderQR: Corrected Scalar type in _solve_impl()
* BDCSVD: Unambiguous return type for ternary operator
* SVDBase: Corrected Scalar type in _solve_impl()
2019-01-17 01:17:39 +01:00
Gael Guennebaud
7f32109c11
Add conjugateIf<bool> members to DesneBase, TriangularView, SelfadjointView, and make PartialPivLU use it.
2019-01-17 11:33:43 +01:00
Gael Guennebaud
562985bac4
bug #1646 : fix false aliasing detection for A.row(0) = A.col(0);
...
This changeset completely disable the detection for vectors for which are current mechanism cannot detect any positive aliasing anyway.
2019-01-17 00:14:27 +01:00
Rasmus Munk Larsen
7401e2541d
Fix compilation error for logical packet ops with older compilers.
2019-01-16 14:43:33 -08:00
Gael Guennebaud
0f028f61cb
GEBP: fix swapped kernel mode with AVX512 and complex scalars
2019-01-16 22:26:38 +01:00
Gael Guennebaud
e118ce86fd
GEBP: cleanup logic to choose between a 4 packets of 1 packet
2019-01-16 21:47:42 +01:00
Gael Guennebaud
70e133333d
bug #1661 : fix regression in GEBP and AVX512
2019-01-16 21:22:20 +01:00
Gael Guennebaud
502f717980
bug #1646 : disable aliasing detection for empty and 1x1 expression
2019-01-16 14:33:45 +01:00
Gael Guennebaud
0b466b6933
bug #1633 : use proper type for madd temporaries, factorize RhsPacketx4.
2019-01-16 13:50:13 +01:00
Renjie Liu
dbfcceabf5
Bug: 1633: refactor gebp kernel and optimize for neon
2019-01-16 12:51:36 +08:00
Gael Guennebaud
2c2c114995
Silent maybe-uninitialized warnings by gcc
2019-01-15 16:53:15 +01:00
Gael Guennebaud
6ec6bf0b0d
Enable visitor on empty matrices (the visitor is left unchanged), and protect min/maxCoeff(Index*,Index*) on empty matrices by an assertion (+ doc & unit tests)
2019-01-15 15:21:14 +01:00
Gael Guennebaud
027e44ed24
bug #1592 : makes partial min/max reductions trigger an assertion on inputs with a zero reduction length (+doc and tests)
2019-01-15 15:13:24 +01:00
Gael Guennebaud
f8bc5cb39e
Fix detection of vector-at-time: use Rows/Cols instead of MaxRow/MaxCols.
...
This fix VectorXd(n).middleCol(0,0).outerSize() which was equal to 1.
2019-01-15 15:09:49 +01:00
Gael Guennebaud
6cf7afa3d9
Typo
2019-01-15 11:04:37 +01:00
Rasmus Larsen
7b3aab0936
Merged in rmlarsen/eigen (pull request PR-570)
...
Add support for inverse hyperbolic functions. Fix cost of division.
2019-01-14 21:31:33 +00:00
Gael Guennebaud
250dcd1fdb
bug #1652 : fix position of EIGEN_ALIGN16 attributes in Neon and Altivec
2019-01-14 21:45:56 +01:00
Rasmus Larsen
5a59452aae
Merged eigen/eigen into default
2019-01-14 10:23:23 -08:00
Gael Guennebaud
3c9e6d206d
AVX512: fix pgather/pscatter for Packet4cd and unaligned pointers
2019-01-14 17:57:28 +01:00
Gael Guennebaud
61b6eb05fe
AVX512 (r)sqrt(double) was mistakenly disabled with clang and others
2019-01-14 17:28:47 +01:00
Gael Guennebaud
ccddeaad90
fix warning
2019-01-14 16:51:16 +01:00
Gael Guennebaud
4356a55a61
PR 571: Implements an accurate argument reduction algorithm for huge inputs of sin/cos and call it instead of falling back to std::sin/std::cos.
...
This makes both the small and huge argument cases faster because:
- for small inputs this removes the last pselect
- for large inputs only the reduction part follows a scalar path,
the rest use the same SIMD path as the small-argument case.
2019-01-14 13:54:01 +01:00
Rasmus Munk Larsen
1c6e6e2c3f
Merge.
2019-01-11 17:47:11 -08:00
Rasmus Munk Larsen
28ba1b2c32
Add support for inverse hyperbolic functions.
...
Fix cost of division.
2019-01-11 17:45:37 -08:00
Rasmus Munk Larsen
89c4001d6f
Fix warnings in ptrue for complex and half types.
2019-01-11 14:10:57 -08:00
Rasmus Munk Larsen
a49d01edba
Fix warnings in ptrue for complex and half types.
2019-01-11 13:18:17 -08:00
Rasmus Munk Larsen
df29511ac0
Fix merge.
2019-01-11 10:36:36 -08:00
Rasmus Munk Larsen
9396ace46b
Merge.
2019-01-11 10:28:52 -08:00
Rasmus Larsen
74882471d0
Merged eigen/eigen into default
2019-01-11 10:20:55 -08:00
Gael Guennebaud
9005f0111f
Replace compiler's alignas/alignof extension by respective c++11 keywords when available. This also fix a compilation issue with gcc-4.7.
2019-01-11 17:10:54 +01:00
Mark D Ryan
3c9add6598
Remove reinterpret_cast from AVX512 complex implementation
...
The reinterpret_casts used in ptranspose(PacketBlock<Packet8cf,4>&)
ptranspose(PacketBlock<Packet8cf,8>&) don't appear to be working
correctly. They're used to convert the kernel parameters to
PacketBlock<Packet8d,T>& so that the complex number versions of
ptranspose can be written using the existing double implementations.
Unfortunately, they don't seem to work and are responsible for 9 unit
test failures in the AVX512 build of tensorflow master. This commit
fixes the issue by manually initialising PacketBlock<Packet8d,T>
variables with the contents of the kernel parameter before calling
the double version of ptranspose, and then copying the resulting
values back into the kernel parameter before returning.
2019-01-11 14:02:09 +01:00
Rasmus Munk Larsen
fcfced13ed
Rename pones -> ptrue. Use _CMP_TRUE_UQ where appropriate.
2019-01-09 17:20:33 -08:00
Rasmus Munk Larsen
e15bb785ad
Collapsed revision
...
* Add packet up "pones". Write pnot(a) as pxor(pones(a), a).
* Collapsed revision
* Simplify a bit.
* Undo useless diffs.
* Fix typo.
2019-01-09 16:34:23 -08:00
Rasmus Munk Larsen
f6ba6071c5
Fix typo.
2019-01-09 16:34:23 -08:00
Rasmus Munk Larsen
8f04442526
Collapsed revision
...
* Collapsed revision
* Add packet up "pones". Write pnot(a) as pxor(pones(a), a).
* Collapsed revision
* Simplify a bit.
* Undo useless diffs.
* Fix typo.
2019-01-09 16:34:23 -08:00
Rasmus Munk Larsen
e00521b514
Undo useless diffs.
2019-01-09 16:32:53 -08:00
Rasmus Munk Larsen
f2767112c8
Simplify a bit.
2019-01-09 16:29:18 -08:00
Rasmus Munk Larsen
cb955df9a6
Add packet up "pones". Write pnot(a) as pxor(pones(a), a).
2019-01-09 16:17:08 -08:00
Rasmus Larsen
cb3c059fa4
Merged eigen/eigen into default
2019-01-09 15:04:17 -08:00
Gael Guennebaud
d812f411c3
bug #1654 : fix compilation with cuda and no c++11
2019-01-09 18:00:05 +01:00
Gael Guennebaud
3492a1ca74
fix plog(+inf) with AVX512
2019-01-09 16:53:37 +01:00
Gael Guennebaud
47810cf5b7
Add dedicated implementations of predux_any for AVX512, NEON, and Altivec/VSE
2019-01-09 16:40:42 +01:00
Gael Guennebaud
3f14e0d19e
fix warning
2019-01-09 15:45:21 +01:00
Gael Guennebaud
aeec68f77b
Add missing pcmp_lt and others for AVX512
2019-01-09 15:36:41 +01:00
Gael Guennebaud
e6b217b8dd
bug #1652 : implements a much more accurate version of vectorized sin/cos. This new version achieve same speed for SSE/AVX, and is slightly faster with FMA. Guarantees are as follows:
...
- no FMA: 1ULP up to 3pi, 2ULP up to sin(25966) and cos(18838), fallback to std::sin/cos for larger inputs
- FMA: 1ULP up to sin(117435.992) and cos(71476.0625), fallback to std::sin/cos for larger inputs
2019-01-09 15:25:17 +01:00
Rasmus Munk Larsen
055f0b73db
Add support for pcmp_eq and pnot, including for complex types.
2019-01-07 16:53:36 -08:00
Eugene Zhulenev
190d053e41
Explicitly set fill character when printing aligned data to ostream
2019-01-03 14:55:28 -08:00