Guoqiang QI
821702e771
Fix the #issue1997 and #issue1991 bug triggered by unsupported a[index] (type a: __i28d) ops with the MSVC compiler
2020-09-21 15:49:00 +00:00
Rasmus Munk Larsen
c4b99f78c7
Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being available on 32 bit x86.
...
If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic.
2020-09-18 18:13:20 -07:00
guoqiangqi
9aad16b443
Fix undefined reference to pset1frombits bug on different platforms
2020-09-19 00:53:21 +00:00
David Tellenbach
c4aa8e0db2
Rename variable to avoid shadowing of a previously declared one
2020-09-18 22:53:15 +02:00
Rasmus Munk Larsen
e55182ac09
Get rid of initialization logic for blueNorm by making the computed constants static const or constexpr.
...
Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1.
2020-09-18 17:38:58 +00:00
Rasmus Munk Larsen
14022f5eb5
Fix more mildly embarrassing typos in ARM intrinsics in PacketMath.h.
...
'vmvnq_u64' does not exist for some reason.
2020-09-18 04:14:13 +00:00
Rasmus Munk Larsen
a5b226920f
Fix typo in PacketMath.h
2020-09-18 01:22:23 +00:00
Rasmus Munk Larsen
3af744b023
Add missing packet op pcmp_lt_or_nan for Packet2d on ARM.
2020-09-18 01:07:01 +00:00
Rasmus Munk Larsen
31a6b88ff3
Disable double version of compute_inverse_size4 on Inverse_NEON.h if Packet2d is not supported.
2020-09-17 23:51:06 +00:00
Brad King
880fa43b2b
Add support for CastXML on ARM aarch64
...
CastXML simulates the preprocessors of other compilers, but actually
parses the translation unit with an internal Clang compiler.
Use the same `vld1q_u64` workaround that we do for Clang.
Fixes : #1979
2020-09-16 13:40:23 -04:00
daravi
6f0f6f792e
Fix compiler error due to c++20 operator== generation rules
2020-09-16 02:06:53 +00:00
Benoit Jacob
cc0c38ace8
Remove old Clang compiler bug work-arounds. The two LLVM bugs referenced in the comments here have long been fixed. The workarounds were now detrimental because (1) they prevented using fused mul-add on Clang/ARM32 and (2) the unnecessary 'volatile' in 'asm volatile' prevented legitimate reordering by the compiler.
2020-09-15 20:54:14 -04:00
Tim Shen
bb56a62582
Make bfloat16(float(-nan)) produce -nan, not nan.
2020-09-15 13:24:23 -07:00
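A minimal sketch of the behavior described above, assuming IEEE-754 `float` and a bfloat16 stored as the upper 16 bits of that `float`. The function name `bf16_bits_keep_nan_sign` is hypothetical and is not Eigen's API; non-NaN inputs are simply truncated here to keep the example short, which is a simplification rather than the library's rounding.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not Eigen's API): returns a 16-bit bfloat16 pattern.
// NaN inputs map to a quiet NaN that keeps the sign bit of the input.
inline std::uint16_t bf16_bits_keep_nan_sign(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));  // well-defined way to read the bits
  const std::uint16_t high = static_cast<std::uint16_t>(u >> 16);
  if (std::isnan(f)) {
    // Keep only the sign bit, then force a quiet-NaN exponent/mantissa.
    return static_cast<std::uint16_t>((high & 0x8000u) | 0x7FC0u);
  }
  // Simplification for this sketch: truncate instead of rounding.
  return high;
}
```

With this sketch, a NaN whose sign bit is set yields the pattern `0xFFC0`, while a NaN with a clear sign bit yields `0x7FC0`.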
Guoqiang QI
3012e755e9
Add plog support for Packet2d on NEON
2020-09-15 17:10:35 +00:00
Rasmus Munk Larsen
e4fb0ddf78
Add EIGEN_UNUSED_VARIABLE to unused variable in Memory.h
2020-09-15 01:18:55 +00:00
Pedro Caldeira
65e400896b
Fix bfloat16 round on gcc 4.8
2020-09-14 10:43:59 -03:00
Rasmus Munk Larsen
5636f80d11
Fix issue #1968 . Don't discard return value from "new" in C++17.
2020-09-13 17:38:45 +00:00
Guoqiang QI
7c5d48f313
Unify the SSE pldexp_double API
2020-09-12 10:56:55 +00:00
Rasmus Munk Larsen
71e08c702b
Make blueNorm threadsafe if C++11 atomics are available.
2020-09-12 01:23:29 +00:00
Niels Dekker
5328c9be43
Fix half_impl::float_to_half_rtne(float) warning: '<<' causes overflow
...
Fixed Visual Studio 2019 Code Analysis (C++ Core Guidelines) warning
C26450 from inside `half_impl::float_to_half_rtne(float)`:
> Arithmetic overflow: '<<' operation causes overflow at compile time.
2020-09-10 16:22:28 +02:00
Pedro Caldeira
35d149e34c
Add missing functions for Packet8bf in Altivec architecture.
...
Including new tests for bfloat16 Packets.
Fix prsqrt on GenericPacketMath.
2020-09-08 09:22:11 -05:00
Guoqiang QI
85428a3440
Add Neon psqrt<Packet2d> and pexp<Packet2d>
2020-09-08 09:04:03 +00:00
Alexander Neumann
5272106826
remove semi triggering -Wextra-semi-stmt
2020-09-07 11:42:30 +02:00
Stephen Zheng
5f25bcf7d6
Add Inverse_NEON.h
...
Implemented fast size-4 matrix inverse (mimicking Inverse_SSE.h) using NEON intrinsics.
```
Benchmark         Time       CPU   Time Old   Time New   CPU Old   CPU New
--------------------------------------------------------------------------
BM_float       -0.1285   -0.1275        568        495       572       499
BM_double      -0.2265   -0.2254        638        494       641       496
```
2020-09-04 10:55:47 +00:00
Everton Constantino
6fe88a3c9d
MatrixProduct enhancements:
...
- Changes to Altivec/MatrixProduct
  Adapting code to gcc 10.
  Generic code style and performance enhancements.
  Adding PanelMode support.
  Adding stride/offset support.
  Enabling float64, std::complex<float> and std::complex<double>.
  Fixing lack of symm_pack.
  Enabling mixedtypes.
- Adding std::complex tests to blasutil.
- Adding an implementation of storePacketBlock when Incr != 1.
2020-09-02 18:21:36 -03:00
Everton Constantino
6568856275
Changing u/int8_t to un/signed char because clang does not understand
...
it.
Implementing pcmp_eq to Packet8 and Packet16.
2020-09-02 17:02:15 -03:00
Gael Guennebaud
27e6648074
fix #1901 : warning in Mode==(Upper|Lower)
2020-09-02 15:43:58 +02:00
Chip Kerchner
e5886457c8
Change Packet8s and Packet8us to use vector commands on Power for pmadd, pmul and psub.
2020-08-28 19:27:32 +00:00
Gael Guennebaud
25424d91f6
Fix #1974 : assertion when reserving an empty sparse matrix
2020-08-26 12:32:20 +02:00
Guoqiang QI
8bb0febaf9
Add psqrt support for Packet2f/Packet4f on NEON
2020-08-21 03:17:15 +00:00
Georg Jäger
1b1082334b
adding attributes to constructors to support hip-clang on ROCm 3.5
2020-08-20 16:48:11 +02:00
Deven Desai
603e213d13
Fixing a CUDA / P100 regression introduced by PR 181
...
PR 181 ( https://gitlab.com/libeigen/eigen/-/merge_requests/181 ) adds the `__launch_bounds__(1024)` attribute to GPU kernels that did not have that attribute explicitly specified.
That PR seems to cause regressions on the CUDA platform. This PR/commit restricts the changes from PR 181 so that they apply to HIP only.
2020-08-20 00:29:57 +00:00
Rasmus Munk Larsen
d10b27fe37
Add missing inline keyword in Quaternion.h.
2020-08-14 17:51:04 +00:00
David Tellenbach
c6820a6316
Replace the call to int64_t in the blasutil test by explicit types
...
Some platforms define int64_t to be long long even for C++03. If this is
the case we miss the definition of internal::make_unsigned for this
type. If we just define the template we get duplicated definitions
errors for platforms defining int64_t as signed long for C++03.
We need to find a way to distinguish both cases at compile-time.
2020-08-14 17:24:37 +02:00
David Tellenbach
8ba1b0f41a
bfloat16 packetmath for Arm Neon backend
2020-08-13 15:48:40 +00:00
Pedro Caldeira
704798d1df
Add support for Bfloat16 to use vector instructions on Altivec
...
architecture
2020-08-10 13:22:01 -05:00
Zachary Garrett
21122498ec
Temporarily turn off the NEON implementation of pfloor as it does not work for large values.
...
The NEON implementation mimics the SSE implementation, but did not mention the caveat that, due to the use of signed integer conversions, not all values representable in the original floating-point type are supported.
2020-08-04 16:28:23 +00:00
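The caveat above can be illustrated with a scalar sketch, assuming the vectorized floor works by converting to a 32-bit signed integer and back. The names `fits_int32` and `floor_via_int32` are hypothetical; this is not the NEON or SSE code, only an analogue showing why large magnitudes need a separate path.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical range check: true only if x can be converted to int32_t
// without overflow. NaN fails both comparisons and is rejected as well.
inline bool fits_int32(float x) {
  return x >= -2147483648.0f && x < 2147483648.0f;
}

// Hypothetical scalar analogue of a floor built on integer conversion.
inline float floor_via_int32(float x) {
  if (!fits_int32(x)) {
    // Outside the int32 range the conversion would be undefined, so this
    // sketch falls back to the standard library.
    return std::floor(x);
  }
  // The conversion truncates toward zero ...
  const float t = static_cast<float>(static_cast<std::int32_t>(x));
  // ... so negative non-integers need one extra step down.
  return (t > x) ? t - 1.0f : t;
}
```

Without the range check, an input such as `3.0e10f` would overflow the integer conversion, which is the kind of failure for large values that the commit message describes.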
David Tellenbach
5e484fa11d
Fix StlDeque for GCC 10
...
StlDeque extends std::deque by accessing some of its internal members.
Since GCC 10 these are not accessible anymore.
2020-07-29 12:31:13 +00:00
Teng Lu
3ec4f0b641
Fix undefined BF16 union behavior in AVX512.
2020-07-29 02:20:21 +00:00
David Tellenbach
99da2e1a8d
Fix clang-tidy warnings in generic bfloat16 implementation
...
See !172 for related discussions.
2020-07-27 16:00:24 +02:00
David Tellenbach
c1ffe452fc
Fix bfloat16 casts
...
If we have explicit conversion operators available (C++11) we define
explicit casts from bfloat16 to other types. If not (C++03), we don't
define conversion operators but rely on implicit conversion chains from
bfloat16 over float to other types.
2020-07-23 20:55:06 +00:00
Rasmus Munk Larsen
1b84f21e32
Revert change that made conversion from bfloat16 to {float, double} implicit.
...
Add roundtrip tests for casting between bfloat16 and complex types.
2020-07-22 18:09:00 -07:00
David Tellenbach
38b91f256b
Fix cast of bfloat16 to std::complex<T>
...
This fixes https://gitlab.com/libeigen/eigen/-/issues/1951
2020-07-22 19:00:17 +00:00
Rasmus Munk Larsen
bed7fbe854
Make sure we take the little-endian path if __BYTE_ORDER__ is not defined.
2020-07-22 18:54:38 +00:00
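One way to express the fallback described above is sketched below. The macro name `SKETCH_BIG_ENDIAN` and the function `runtime_is_big_endian` are hypothetical and are not Eigen's names; the sketch assumes that a compiler which does not define `__BYTE_ORDER__` targets a little-endian platform.

```cpp
#include <cstdint>
#include <cstring>

// Select the big-endian path only when the compiler positively reports it.
// If __BYTE_ORDER__ is missing, the #else branch (little-endian) is taken.
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define SKETCH_BIG_ENDIAN 1
#else
#define SKETCH_BIG_ENDIAN 0
#endif

// Runtime cross-check: inspect the first byte in memory of the value 1.
inline bool runtime_is_big_endian() {
  const std::uint16_t v = 1;
  unsigned char first;
  std::memcpy(&first, &v, 1);
  return first == 0;  // big-endian stores the most significant byte first
}
```

The design choice here is that the default is little-endian, so a missing macro can never silently select the big-endian code path.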
Niels Dekker
0e1a33a461
Faster conversion from integer types to bfloat16
...
Specialized `bfloat16_impl::float_to_bfloat16_rtne(float)` for normal floating point numbers, infinity and zero, in order to improve the performance of `bfloat16::bfloat16(const T&)` for integer argument types.
A reduction of more than 20% of the runtime duration of conversion from int to bfloat16 was observed, using Visual C++ 2019 on Windows 10.
2020-07-22 19:25:49 +02:00
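The idea of a specialized path can be sketched as follows, assuming IEEE-754 `float`. Integer sources can never produce NaN, so the NaN check can be skipped and only the round-to-nearest-even step remains. The function names are hypothetical, and this is an illustration under those assumptions rather than the code from the commit.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical fast path: valid only when f is not NaN
// (normal numbers, zero, and infinity).
inline std::uint16_t bf16_bits_rtne_no_nan(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  // Round to nearest, ties to even: add 0x7FFF plus the lowest kept bit,
  // then drop the lower 16 bits.
  const std::uint32_t lsb = (u >> 16) & 1u;
  u += 0x7FFFu + lsb;
  return static_cast<std::uint16_t>(u >> 16);
}

// Hypothetical integer entry point: an int converted to float is never NaN,
// so the fast path is always applicable.
inline std::uint16_t int_to_bf16_bits(int v) {
  return bf16_bits_rtne_no_nan(static_cast<float>(v));
}
```

For example, 257 lies exactly halfway between the representable values 256 and 258, and the sketch rounds it to 256 (pattern `0x4380`), whose last kept mantissa bit is even. Likewise 259 rounds to 260 (pattern `0x4382`).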
Rasmus Munk Larsen
acab22c205
Avoid division by zero in nonZerosEstimate() for empty blocks.
2020-07-22 01:38:30 +00:00
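A guard of this kind might look like the sketch below. It assumes the estimate scales the parent's nonzero count by the ratio of block size to parent size; that formula and the name `block_nnz_estimate` are assumptions made for illustration, not taken from the commit.

```cpp
#include <cstddef>

// Hypothetical estimate of nonzeros in a block of a sparse expression.
// Assumed formula: parent_nnz * block_size / parent_size.
inline std::ptrdiff_t block_nnz_estimate(std::ptrdiff_t parent_nnz,
                                         std::ptrdiff_t block_size,
                                         std::ptrdiff_t parent_size) {
  // An empty parent has no nonzeros; returning early avoids dividing by zero.
  if (parent_size == 0) return 0;
  return parent_nnz * block_size / parent_size;
}
```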
Rasmus Munk Larsen
0aeaf5f451
Make numext::as_uint a device function.
2020-07-22 00:33:41 +00:00
Alexander Turkin
60faa9f897
user-defined copy operations removed in favor of compiler-generated ones
2020-07-20 14:59:35 +03:00
Niels Dekker
b11f817bcf
Avoid undefined behavior by union type punning in float_to_bfloat16_rtne
...
Use `numext::as_uint`, instead of union based type punning, to avoid undefined behavior.
See also C++ Core Guidelines: "Don't use a union for type punning"
https://github.com/isocpp/CppCoreGuidelines/blob/v0.8/CppCoreGuidelines.md#c183-dont-use-a-union-for-type-punning
`numext::as_uint` was suggested by David Tellenbach
2020-07-14 19:55:20 +02:00
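For readers unfamiliar with the issue, the sketch below shows a well-defined alternative to union-based type punning using `std::memcpy`. The name `as_uint_sketch` is hypothetical; it stands in for a helper in the spirit of `numext::as_uint`, and no claim is made that Eigen implements its helper this way.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a bit-cast helper. Reading a union member other
// than the one last written is undefined behavior in C++, whereas copying
// the object representation with memcpy is well defined.
inline std::uint32_t as_uint_sketch(float f) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch assumes a 32-bit float");
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  return u;
}
```

Compilers typically reduce such a fixed-size `memcpy` to a single register move, so the well-defined form usually costs nothing at run time.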
Sheng Yang
56b3e3f3f8
AVX path for BF16
2020-07-14 01:34:03 +00:00