eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-06-29 17:55:13 +08:00

Author	SHA1	Message	Date
Eugene Zhulenev	8c2f30c790	Speedup Tensor ThreadPool RunQueue::Empty()	2019-02-13 10:20:53 -08:00
Eugene Zhulenev	21eb97d3e0	Add PacketConv implementation for non-vectorizable src expressions	2019-02-08 15:47:25 -08:00
Eugene Zhulenev	1e36166ed1	Optimize TensorConversion evaluator: do not convert same type	2019-02-08 15:13:24 -08:00
Steven Peters	953ca5ba2f	Spline.h: fix spelling "spang" -> "span"	2019-02-08 06:23:24 +00:00
Eugene Zhulenev	59998117bb	Don't do parallel_pack if we can use thread_local memory in tensor contractions	2019-02-07 09:21:25 -08:00
Eugene Zhulenev	8491127082	Do not reduce parallelism too much in contractions with small number of threads	2019-02-04 12:59:33 -08:00
Eugene Zhulenev	eb21bab769	Parallelize tensor contraction only by sharding dimension and use 'thread-local' memory for packing	2019-02-04 10:43:16 -08:00
Gael Guennebaud	d586686924	Workaround lack of support for arbitrary packet-type in Tensor by manually loading half/quarter packets in tensor contraction mapper.	2019-01-30 16:48:01 +01:00
Christoph Hertzberg	a7779a9b42	Hide some annoying unused variable warnings in g++8.1	2019-01-29 16:48:21 +01:00
Christoph Hertzberg	c9825b967e	Renaming even more `I` identifiers	2019-01-26 13:22:13 +01:00
Christoph Hertzberg	934b8a1304	Avoid `I` as an identifier, since it may clash with the C-header complex.h	2019-01-25 14:54:39 +01:00
Eugene Zhulenev	1e6d15b55b	Fix shorten-64-to-32 warning in TensorContractionThreadPool	2019-01-11 11:41:53 -08:00
Eugene Zhulenev	0abe03764c	Fix shorten-64-to-32 warning in TensorContractionThreadPool	2019-01-10 10:27:55 -08:00
Gael Guennebaud	d812f411c3	bug #1654 : fix compilation with cuda and no c++11	2019-01-09 18:00:05 +01:00
Eugene Zhulenev	e70ffef967	Optimize evalShardedByInnerDim	2019-01-08 16:26:31 -08:00
Rasmus Munk Larsen	dd6d65898a	Fix shorten-64-to-32 warning. Use regular memcpy if num_threads==0.	2018-12-12 14:45:31 -08:00
Gael Guennebaud	cf697272e1	Remove debug code.	2018-12-09 23:05:46 +01:00
Gael Guennebaud	450dc97c6b	Various fixes in polynomial solver and its unit tests: - cleanup noise in imaginary part of real roots - take into account the magnitude of the derivative to check roots. - use <= instead of < at appropriate places	2018-12-09 22:54:39 +01:00
Rasmus Munk Larsen	8a02883d58	Merged in markdryan/eigen/avx512-contraction-2 (pull request PR-554) Fix tensor contraction on AVX512 builds Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>	2018-12-05 18:19:32 +00:00
Mark D Ryan	36f8f6d0be	Fix evalShardedByInnerDim for AVX512 builds evalShardedByInnerDim ensures that the values it passes for start_k and end_k to evalGemmPartialWithoutOutputKernel are multiples of 8 as the kernel does not work correctly when the values of k are not multiples of the packet_size. While this precaution works for AVX builds, it is insufficient for AVX512 builds where the maximum packet size is 16. The result is slightly incorrect float32 contractions on AVX512 builds. This commit fixes the problem by ensuring that k is always a multiple of the packet_size if the packet_size is > 8.	2018-12-05 12:29:03 +01:00
Christoph Hertzberg	0ec8afde57	Fixed most conversion warnings in MatrixFunctions module	2018-11-20 16:23:28 +01:00
Rasmus Munk Larsen	72928a2c8a	Merged in rmlarsen/eigen2 (pull request PR-543) Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth. Approved-by: Eugene Zhulenev <ezhulenev@google.com>	2018-11-13 17:10:30 +00:00
Rasmus Munk Larsen	cda479d626	Remove accidental changes.	2018-11-12 18:34:04 -08:00
Rasmus Munk Larsen	719d9aee65	Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.	2018-11-12 17:46:02 -08:00
Rasmus Munk Larsen	93f9988a7e	A few small fixes to a) prevent throwing in ctors and dtors of the threading code, and b) supporting matrix exponential on platforms with 113 bits of mantissa for long doubles.	2018-11-09 14:15:32 -08:00
Christoph Hertzberg	449ff74672	Fix most Doxygen warnings. Also add links to stable documentation from unsupported modules (by using the corresponding Doxytags file). Manually grafted from d107a371c61b764c73fd1570b1f3ed1c6400dd7e	2018-10-19 21:10:28 +02:00
Rasmus Munk Larsen	dda68f56ec	Fix GPU build due to gpu_assert not always being defined.	2018-10-18 16:29:29 -07:00
Eugene Zhulenev	9e96e91936	Move from rvalue arguments in ThreadPool enqueue* methods	2018-10-16 16:48:32 -07:00
Eugene Zhulenev	217d839816	Reduce thread scheduling overhead in parallelFor	2018-10-16 14:53:06 -07:00
Rasmus Munk Larsen	d52763bb4f	Merged in ezhulenev/eigen-02 (pull request PR-528) [TensorBlockIO] Check if it's allowed to squeeze inner dimensions Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>	2018-10-16 15:39:40 +00:00
Eugene Zhulenev	900c7c61bb	Check if it's allowed to squeeze inner dimensions in TensorBlockIO	2018-10-15 16:52:33 -07:00
Gael Guennebaud	f0fb95135d	Iterative solvers: unify and fix handling of multiple rhs. m_info was not properly computed and the logic was repeated in several places.	2018-10-15 23:47:46 +02:00
Gael Guennebaud	2747b98cfc	DGMRES: fix null rhs, fix restart, fix m_isDeflInitialized for multiple solve	2018-10-15 23:46:00 +02:00
Christoph Hertzberg	3f2c8b7ff0	Fix a lot of Doxygen warnings in Tensor module	2018-10-09 20:22:47 +02:00
Rasmus Munk Larsen	d16634c4d4	Fix out-of bounds access in TensorArgMax.h.	2018-10-08 16:41:36 -07:00
Gael Guennebaud	64b1a15318	Workaround stupid warning	2018-10-08 12:01:18 +02:00
Christoph Hertzberg	b92c71235d	Move struct outside of method for C++03 compatibility.	2018-10-02 18:59:10 +02:00
Christoph Hertzberg	051f9c1aff	Make code compile in C++03 mode again	2018-10-02 18:36:30 +02:00
Christoph Hertzberg	b786ce8c72	Fix conversion warning ... again	2018-10-02 18:35:25 +02:00
Christoph Hertzberg	564ca71e39	Merged in deven-amd/eigen/HIP_fixes (pull request PR-518) PR with HIP specific fixes (for the eigen nightly regression failures in HIP mode)	2018-10-01 16:51:04 +00:00
Deven Desai	94898488a6	This commit contains the following (HIP specific) updates: - unsupported/Eigen/CXX11/src/Tensor/TensorReductionGpu.h Changing "pass-by-reference" argument to be "pass-by-value" instead (in a __global__ function decl). "pass-by-reference" arguments to __global__ functions are unwise, and will be explicitly flagged as errors by the newer versions of HIP. - Eigen/src/Core/util/Memory.h - unsupported/Eigen/CXX11/src/Tensor/TensorContraction.h Changes introduced in recent commits break the HIP compile. Adding EIGEN_DEVICE_FUNC attribute to some functions and calling ::malloc/free instead of the corresponding std:: versions to get the HIP compile working again - unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h Change introduced in a recent commit breaks the HIP compile (link stage errors out due to failure to inline a function). Disabling the recently introduced code (only for HIP compile), to get the eigen nightly testing going again. Will submit another PR once we have the proper fix. - Eigen/src/Core/util/ConfigureVectorization.h Enabling GPU VECTOR support when HIP compiler is in use (for both the host and device compile phases)	2018-10-01 14:28:37 +00:00
Rasmus Munk Larsen	2088c0897f	Merged eigen/eigen into default	2018-09-28 16:00:46 -07:00
Rasmus Munk Larsen	31629bb964	Get rid of unused variable warning.	2018-09-28 16:00:09 -07:00
Eugene Zhulenev	bb13d5d917	Fix bug in copy optimization in Tensor slicing.	2018-09-28 14:34:42 -07:00
Rasmus Munk Larsen	104e8fa074	Fix a few warnings and rename a variable to not shadow "last".	2018-09-28 12:00:08 -07:00
Rasmus Munk Larsen	7c1b47840a	Merged in ezhulenev/eigen-01 (pull request PR-514) Add tests for evalShardedByInnerDim contraction + fix bugs	2018-09-28 18:37:54 +00:00
Eugene Zhulenev	524c81f3fa	Add tests for evalShardedByInnerDim contraction + fix bugs	2018-09-28 11:24:08 -07:00
Christoph Hertzberg	86ba50be39	Fix integer conversion warnings	2018-09-28 19:33:39 +02:00
Eugene Zhulenev	e95696acb3	Optimize TensorBlockCopyOp	2018-09-27 14:49:26 -07:00
Eugene Zhulenev	9f33e71e9d	Revert code lost in merge	2018-09-27 12:08:17 -07:00

2108 Commits