eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-10-16 18:11:29 +08:00

Author	SHA1	Message	Date
Mehdi Goli	d7a8414848	Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.	2018-08-01 11:56:30 +01:00
Mehdi Goli	9e219bb3d3	Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO.	2018-08-01 10:47:49 +01:00
Eugene Zhulenev	83c0a16baf	Add block evaluation support to TensorOps	2018-07-31 15:56:31 -07:00
Benoit Steiner	edf46bd7a2	Merged in yuefengz/eigen (pull request PR-370) Use device's allocate function instead of internal::aligned_malloc.	2018-07-31 22:38:28 +00:00
Paul Tucker	385f7b8d0c	Change getAllocator() to allocator() in ThreadPoolDevice.	2018-07-31 13:52:18 -07:00
Mark D Ryan	6f5b126e6d	Fix tensor contraction for AVX512 machines This patch modifies the TensorContraction class to ensure that the kc_ field is always a multiple of the packet_size, if the packet_size is > 8. Without this change spatial convolutions in Tensorflow do not work properly as the code that re-arranges the input matrices can assert if kc_ is not a multiple of the packet_size. This leads to a unit test failure, //tensorflow/python/kernel_tests:conv_ops_test, on AVX512 builds of tensorflow.	2018-07-31 09:33:37 +01:00
Gael Guennebaud	678a0dcb12	Merged in ezhulenev/eigen/tiling_3 (pull request PR-438) Tiled tensor executor	2018-07-31 08:13:00 +00:00
Gael Guennebaud	679eece876	Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437.	2018-07-31 10:10:14 +02:00
Eugene Zhulenev	966c2a7bb6	Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible	2018-07-27 12:45:17 -07:00
Eugene Zhulenev	6913221c43	Add tiled evaluation support to TensorExecutor	2018-07-25 13:51:10 -07:00
Rasmus Munk Larsen	e478532625	Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.	2018-07-27 12:36:34 -07:00
Christoph Hertzberg	5f79b7f9a9	Removed several shadowing types and use global Index typedef everywhere	2018-07-25 21:47:45 +02:00
Christoph Hertzberg	44ee201337	Rename variable which shadows class name	2018-07-25 20:26:15 +02:00
Eugene Zhulenev	d55efa6f0f	TensorBlockIO	2018-07-23 15:50:55 -07:00
Eugene Zhulenev	34a75c3c5c	Initial support of TensorBlock	2018-07-20 17:37:20 -07:00
Paul Tucker	d4afccde5a	Add test coverage for ThreadPoolDevice optional allocator.	2018-07-19 17:43:44 -07:00
Eugene Zhulenev	c58b874727	PR430: Convert count to the reducer type in MeanReducer Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails. cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this) ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}' return pdiv(vaccum, pset1<Packet>(packetCount_)); Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade. static_cast<T>(packetCount_) - breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising.	2018-07-19 17:37:03 -07:00
Paul Tucker	4e9848fa86	Actually add optional Allocator* arg to ThreadPoolDevice().	2018-07-16 17:53:36 -07:00
Paul Tucker	b3e7c9132d	Add optional Allocator argument to ThreadPoolDevice constructor. When supplied, this allocator will be used in place of internal::aligned_malloc. This permits e.g. use of a NUMA-node specific allocator where the thread-pool is also restricted to a single NUMA-node.	2018-07-16 17:26:05 -07:00
Rasmus Munk Larsen	3a9cf4e290	Get rid of alias for m_broadcast.	2018-07-13 16:24:48 -07:00
Rasmus Munk Larsen	4222550e17	Optimize the case where broadcasting is a no-op.	2018-07-13 16:12:38 -07:00
Gael Guennebaud	1920129d71	Remove clang warning	2018-07-13 16:05:35 +02:00
Gael Guennebaud	06eb24cf4d	Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda.	2018-07-13 16:04:27 +02:00
David Hyde	d908afe35f	bug #1558 : fix a corner case in MINRES when both v_new and w_new vanish.	2018-07-08 22:06:38 -07:00
Eugene Zhulenev	6e654f3379	Reduce number of allocations in TensorContractionThreadPool.	2018-07-16 14:26:39 -07:00
Gael Guennebaud	7ccb623746	bug #1569 : fix Tensor<half>::mean() on AVX with respective unit test.	2018-07-19 13:15:40 +02:00
Eugene Zhulenev	e3c2d61739	Assert that no output kernel is defined for GPU contraction	2018-07-18 14:34:22 -07:00
Eugene Zhulenev	79d4129cce	Specify default output kernel for TensorContractionOp	2018-07-18 14:21:01 -07:00
Gael Guennebaud	b347eb0b1c	Fix doc	2018-07-12 11:56:18 +02:00
Yuefeng Zhou	1eff6cf8a7	Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.	2018-02-20 16:50:05 -08:00
Gael Guennebaud	6cd6551b26	Add deprecated header files for TensorFlow	2018-07-12 10:50:53 +02:00
Deven Desai	876f392c39	Updates corresponding to the latest round of PR feedback The major changes are 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h 2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h The above three changes effectively enable the Eigen "Packet" layer for the HIP platform 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic") 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places The change has been tested on the HIP and CUDA platforms.	2018-07-11 10:39:54 -04:00
Deven Desai	471cfe5ff7	renaming CUDA* to GPU* for some header files	2018-07-11 09:22:04 -04:00
Deven Desai	38807a2575	merging updates from upstream	2018-07-11 09:17:33 -04:00
Gael Guennebaud	6190aa5632	bug #1567 : add optimized path for tensor broadcasting and 'Channel First' shape	2018-07-09 11:23:16 +02:00
Deven Desai	1bb6fa99a3	merging the CUDA and HIP implementation for the Tensor directory and the unit tests	2018-06-20 16:44:58 -04:00
Deven Desai	cfdabbcc8f	removing the *Hip files from the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories	2018-06-20 12:57:02 -04:00
Deven Desai	7e41c8f1a9	renaming Cuda files to Gpu in the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories	2018-06-20 12:52:30 -04:00
Deven Desai	b6cc0961b1	updates based on PR feedback There are two major changes (and a few minor ones which are not listed here...see PR discussion for details) 1. Eigen::half implementations for HIP and CUDA have been merged. This means that - `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h` - `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h` - `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h` After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install. 2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate. - `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)` - `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)` - `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`	2018-06-14 10:21:54 -04:00
Deven Desai	d1d22ef0f4	syncing this fork with upstream	2018-06-13 12:09:52 -04:00
Benoit Steiner	d3a380af4d	Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403) Derivative of the incomplete Gamma function and the sample of a Gamma random variable Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>	2018-06-11 17:57:47 +00:00
Jonathan Liu	b7689bded9	Use std::complex constructor instead of assignment from scalar Fixes GCC conversion to non-scalar type requested compile error when using boost::multiprecision::cpp_dec_float_50 as scalar type.	2018-06-28 00:32:37 +10:00
Rasmus Munk Larsen	5418154a45	Fix oversharding bug in parallelFor.	2018-06-20 17:51:48 -07:00
Gael Guennebaud	7933267c67	fix prototype	2018-06-08 09:56:01 +02:00
Michael Figurnov	30fa3d0454	Merge from eigen/eigen	2018-06-07 17:57:56 +01:00
Michael Figurnov	6c71c7d360	Merge from eigen/eigen.	2018-06-07 15:54:18 +01:00
Gael Guennebaud	37348d03ae	Fix int versus Index	2018-06-07 15:56:43 +02:00
Michael Figurnov	aa813d417b	Fix compilation of special functions without C99 math. The commit with Bessel functions i0e and i1e placed the ifdef/endif incorrectly, causing i0e/i1e to be undefined when EIGEN_HAS_C99_MATH=0. These functions do not actually require C99 math, so now they are always available.	2018-06-07 14:35:07 +01:00
Gael Guennebaud	b3fd93207b	Fix typos found using codespell	2018-06-07 14:43:02 +02:00
Michael Figurnov	5172a32849	Updated the stopping criteria in igammac_cf_impl. Previously, when computing the derivative, it used a relative error threshold. Now it uses an absolute error threshold. The behavior for computing the value is unchanged. This makes more sense since we do not expect the derivative to often be close to zero. This change makes the derivatives about 30% faster across the board. The error for the igamma_der_a is almost unchanged, while for gamma_sample_der_alpha it is a bit worse for float32 and unchanged for float64.	2018-06-07 12:03:58 +01:00

2320 Commits