eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-07-03 03:35:11 +08:00

Author	SHA1	Message	Date
Eugene Zhulenev	385b3ff12f	Merged latest changes from upstream/eigen	2018-08-01 11:59:04 -07:00
Benoit Steiner	17221115c9	Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447) Adding variadic version of assert which can take a parameter pack as its input.	2018-08-01 16:41:54 +00:00
Benoit Steiner	0360c36170	Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446) Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.	2018-08-01 16:13:15 +00:00
Mehdi Goli	c6a5c70712	Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h	2018-08-01 16:56:26 +01:00
Benoit Steiner	45f75f1ace	Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449) Enabling per device specialisation of packetSize.	2018-08-01 15:43:03 +00:00
Mehdi Goli	af96018b49	Using the suggested modification.	2018-08-01 16:04:44 +01:00
Mehdi Goli	b512a9536f	Enabling per device specialisation of packetsize.	2018-08-01 13:39:13 +01:00
Mehdi Goli	3a197a60e6	variadic version of assert which can take a parameter pack as its input.	2018-08-01 12:19:14 +01:00
Mehdi Goli	d7a8414848	Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.	2018-08-01 11:56:30 +01:00
Mehdi Goli	9e219bb3d3	Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO.	2018-08-01 10:47:49 +01:00
Eugene Zhulenev	83c0a16baf	Add block evaluation support to TensorOps	2018-07-31 15:56:31 -07:00
Benoit Steiner	edf46bd7a2	Merged in yuefengz/eigen (pull request PR-370) Use device's allocate function instead of internal::aligned_malloc.	2018-07-31 22:38:28 +00:00
Paul Tucker	385f7b8d0c	Change getAllocator() to allocator() in ThreadPoolDevice.	2018-07-31 13:52:18 -07:00
Mark D Ryan	6f5b126e6d	Fix tensor contraction for AVX512 machines This patch modifies the TensorContraction class to ensure that the kc_ field is always a multiple of the packet_size, if the packet_size is > 8. Without this change spatial convolutions in Tensorflow do not work properly as the code that re-arranges the input matrices can assert if kc_ is not a multiple of the packet_size. This leads to a unit test failure, //tensorflow/python/kernel_tests:conv_ops_test, on AVX512 builds of tensorflow.	2018-07-31 09:33:37 +01:00
Gael Guennebaud	678a0dcb12	Merged in ezhulenev/eigen/tiling_3 (pull request PR-438) Tiled tensor executor	2018-07-31 08:13:00 +00:00
Gael Guennebaud	679eece876	Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437.	2018-07-31 10:10:14 +02:00
Eugene Zhulenev	966c2a7bb6	Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible	2018-07-27 12:45:17 -07:00
Eugene Zhulenev	6913221c43	Add tiled evaluation support to TensorExecutor	2018-07-25 13:51:10 -07:00
Rasmus Munk Larsen	e478532625	Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.	2018-07-27 12:36:34 -07:00
Christoph Hertzberg	5e79402b4a	fix warnings for doc-eigen-prerequisites	2018-07-24 21:59:15 +02:00
Christoph Hertzberg	5f79b7f9a9	Removed several shadowing types and use global Index typedef everywhere	2018-07-25 21:47:45 +02:00
Christoph Hertzberg	44ee201337	Rename variable which shadows class name	2018-07-25 20:26:15 +02:00
Gustavo Lima Chaves	705f66a9ca	Account for missing change on commit "Remove SimpleThreadPool and..." "... always use {NonBlocking}ThreadPool". It seems the non-blocking implementation was made the default/only one, but a reference to the old name was left unmodified. Fix that.	2018-07-23 16:29:09 -07:00
Eugene Zhulenev	d55efa6f0f	TensorBlockIO	2018-07-23 15:50:55 -07:00
Eugene Zhulenev	34a75c3c5c	Initial support of TensorBlock	2018-07-20 17:37:20 -07:00
Gustavo Lima Chaves	02eaaacbc5	Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded block Builds configured without the -DEIGEN_TEST_CXX11=ON flag would fail right away without this, as this test seems to rely on those language features. The skip under compilation with MSVC was kept.	2018-07-20 16:08:40 -07:00
Paul Tucker	d4afccde5a	Add test coverage for ThreadPoolDevice optional allocator.	2018-07-19 17:43:44 -07:00
Eugene Zhulenev	c58b874727	PR430: Convert count to the reducer type in MeanReducer Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails. cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this) ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}' return pdiv(vaccum, pset1<Packet>(packetCount_)); Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade. static_cast<T>(packetCount_) - breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising.	2018-07-19 17:37:03 -07:00
Paul Tucker	4e9848fa86	Actually add optional Allocator* arg to ThreadPoolDevice().	2018-07-16 17:53:36 -07:00
Paul Tucker	b3e7c9132d	Add optional Allocator argument to ThreadPoolDevice constructor. When supplied, this allocator will be used in place of internal::aligned_malloc. This permits e.g. use of a NUMA-node specific allocator where the thread-pool is also restricted to a single NUMA-node.	2018-07-16 17:26:05 -07:00
Gael Guennebaud	add5757488	Simplify handling of non-split tests and include split_test_helper.h instead of re-generating it. This also allows us to modify it without breaking existing build folders.	2018-07-16 18:55:40 +02:00
Gael Guennebaud	901c7d31f0	Fix usage of EIGEN_SPLIT_LARGE_TESTS=ON: some unit tests, such as indexed_view have to be split unconditionally.	2018-07-16 18:35:05 +02:00
Rasmus Munk Larsen	3a9cf4e290	Get rid of alias for m_broadcast.	2018-07-13 16:24:48 -07:00
Rasmus Munk Larsen	4222550e17	Optimize the case where broadcasting is a no-op.	2018-07-13 16:12:38 -07:00
Gael Guennebaud	1920129d71	Remove clang warning	2018-07-13 16:05:35 +02:00
Gael Guennebaud	06eb24cf4d	Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda.	2018-07-13 16:04:27 +02:00
David Hyde	d908afe35f	bug #1558 : fix a corner case in MINRES when both v_new and w_new vanish.	2018-07-08 22:06:38 -07:00
Eugene Zhulenev	6e654f3379	Reduce number of allocations in TensorContractionThreadPool.	2018-07-16 14:26:39 -07:00
Gael Guennebaud	7ccb623746	bug #1569 : fix Tensor<half>::mean() on AVX with respective unit test.	2018-07-19 13:15:40 +02:00
Eugene Zhulenev	e3c2d61739	Assert that no output kernel is defined for GPU contraction	2018-07-18 14:34:22 -07:00
Eugene Zhulenev	79d4129cce	Specify default output kernel for TensorContractionOp	2018-07-18 14:21:01 -07:00
Gael Guennebaud	44ea5f7623	Add unit test for -Tensor<complex> on GPU	2018-07-12 17:19:38 +02:00
Thales Sabino	9a6a43319f	Fix cxx11_tensor_fft not building on Windows. The type used in Eigen::DSizes needs to be at least 8 bytes long. Internally Tensor tries to convert this to an __int64 on Windows and this fails to build. On Linux, long and long long are both 8 byte integer types. * * * Changing from "long long" to "std::int64_t".	2018-07-12 11:20:59 +01:00
Gael Guennebaud	b347eb0b1c	Fix doc	2018-07-12 11:56:18 +02:00
Yuefeng Zhou	1eff6cf8a7	Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.	2018-02-20 16:50:05 -08:00
Gael Guennebaud	6cd6551b26	Add deprecated header files for TensorFlow	2018-07-12 10:50:53 +02:00
Deven Desai	876f392c39	Updates corresponding to the latest round of PR feedback. The major changes are: 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h 2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h The above three changes effectively enable the Eigen "Packet" layer for the HIP platform 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic") 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places The change has been tested on the HIP and CUDA platforms.	2018-07-11 10:39:54 -04:00
Deven Desai	471cfe5ff7	renaming CUDA* to GPU* for some header files	2018-07-11 09:22:04 -04:00
Deven Desai	38807a2575	merging updates from upstream	2018-07-11 09:17:33 -04:00
Gael Guennebaud	6190aa5632	bug #1567 : add optimized path for tensor broadcasting and 'Channel First' shape	2018-07-09 11:23:16 +02:00


3122 Commits