eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-10-18 11:01:28 +08:00

Author	SHA1	Message	Date
Deven Desai	471cfe5ff7	renaming CUDA* to GPU* for some header files	2018-07-11 09:22:04 -04:00
Deven Desai	ba972fb6b4	moving Half headers from CUDA dir to GPU dir, removing the HIP versions	2018-06-13 12:26:18 -04:00
Michael Figurnov	4bd158fa37	Derivative of the incomplete Gamma function and the sample of a Gamma random variable. In addition to igamma(a, x), this code implements: * igamma_der_a(a, x) = d igamma(a, x) / da -- derivative of igamma with respect to the parameter * gamma_sample_der_alpha(alpha, sample) -- reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.	2018-06-06 18:49:26 +01:00
Michael Figurnov	f216854453	Exponentially scaled modified Bessel functions of order zero and one. The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x). The code is ported from Cephes and tested against SciPy.	2018-05-31 15:34:53 +01:00
Gael Guennebaud	e43ca0320d	bug #1520 : workaround some -Wfloat-equal warnings by calling std::equal_to	2018-04-11 15:24:13 +02:00
Daniel Trebbien	0c57be407d	Move up the specialization of std::numeric_limits. This fixes a compilation error seen when building TensorFlow on macOS: https://github.com/tensorflow/tensorflow/issues/17067	2018-02-18 15:35:45 -08:00
nluehr	aefd5fd5c4	Replace __float2half_rn with __float2half. The latter provides a consistent definition for CUDA 8.0 and 9.0.	2017-11-28 10:15:46 -08:00
nluehr	dd6de618c3	Fix incorrect integer cast in predux<half2>(). Bug corrupts results on Maxwell and earlier GPU architectures.	2017-11-21 10:47:00 -08:00
Henry Schreiner	9bb26eb8f1	Restore `__device__`	2017-10-21 00:50:38 +00:00
Henry Schreiner	4245475d22	Fixing missing inlines on device functions for newer CUDA cards	2017-10-20 03:20:13 +00:00
Gael Guennebaud	9c353dd145	Add C++11 max_digits10 for half.	2017-09-06 10:22:47 +02:00
Benoit Steiner	a4089991eb	Added support for CUDA 9.0.	2017-08-31 02:49:39 +00:00
Gael Guennebaud	21633e585b	bug #1462 : remove all occurrences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER	2017-08-24 11:06:47 +02:00
Gael Guennebaud	bbd97b4095	Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases	2017-07-17 01:02:51 +02:00
Benoit Steiner	3baef62b9a	Added missing __device__ qualifier	2017-06-13 12:56:55 -07:00
Benoit Steiner	449936828c	Added missing __device__ qualifier	2017-06-13 12:54:57 -07:00
Gael Guennebaud	26f552c18d	fix compilation of Half in C++98 (issue introduced in previous commit)	2017-06-09 13:36:58 +02:00
Gael Guennebaud	d588822779	Add missing std::numeric_limits specialization for half, and complete NumTraits<half>	2017-06-09 11:51:53 +02:00
Abhijit Kundu	9bc0a35731	Fixed nested angle brackets >> issue when compiling with cuda 8	2017-04-27 03:09:03 -04:00
Gael Guennebaud	e958c2baac	remove UTF8 symbols	2017-03-07 10:47:40 +01:00
Benoit Steiner	7b61944669	Made most of the packet math primitives usable within CUDA kernel when compiling with clang	2017-02-28 17:05:28 -08:00
Benoit Steiner	34d9fce93b	Avoid unnecessary float to double conversions.	2017-02-27 16:33:33 -08:00
Srinivas Vasudevan	f7d7c33a28	Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).	2016-12-05 12:19:01 -08:00
Srinivas Vasudevan	09ee7f0c80	Fix small nit where I changed name of plog1p to pexpm1.	2016-12-02 15:30:12 -08:00
Srinivas Vasudevan	218764ee1f	Added support for expm1 in Eigen.	2016-12-02 14:13:01 -08:00
Rasmus Munk Larsen	a0329f64fb	Add a default constructor for the "fake" __half class when not using the __half class provided by CUDA.	2016-11-29 13:18:09 -08:00
Benoit Steiner	dff9a049c4	Optimized the computation of exp, sqrt, ceil and floor for fp16 on Pascal GPUs	2016-11-16 09:01:51 -08:00
Benoit Steiner	c80587c92b	Merged eigen/eigen into default	2016-11-03 03:55:11 -07:00
Benoit Steiner	7a0e96b80d	Gate the code that refers to cuda fp16 primitives more thoroughly	2016-11-01 12:08:09 -07:00
Benoit Steiner	38b6048e14	Deleted redundant implementation of predux	2016-10-12 14:37:56 -07:00
Benoit Steiner	78d2926508	Merged eigen/eigen into default	2016-10-12 13:46:29 -07:00
Benoit Steiner	2e2f48e30e	Take advantage of AVX512 instructions whenever possible to speedup the processing of 16 bit floats.	2016-10-12 13:45:39 -07:00
Benoit Steiner	d485d12c51	Added missing AVX intrinsics for fp16: in particular, implemented predux which is required by the matrix-vector code.	2016-10-06 10:41:03 -07:00
Benoit Steiner	698ff69450	Properly characterize the CUDA packet primitives for fp16 as device only	2016-10-04 16:53:30 -07:00
Benoit Steiner	409e887d78	Added support for constant std::complex numbers on GPU	2016-10-03 11:06:24 -07:00
Benoit Steiner	26f9907542	Added missing typedefs	2016-09-20 12:58:03 -07:00
RJ Ryan	b2c6dc48d9	Add CUDA-specific std::complex<T> specializations for scalar_sum_op, scalar_difference_op, scalar_product_op, and scalar_quotient_op.	2016-09-20 07:18:20 -07:00
Gael Guennebaud	8f4b4ad5fb	use ::hlog if available.	2016-08-29 11:05:32 +02:00
Gael Guennebaud	35a8e94577	bug #1167 : simplify installation of header files using cmake's install(DIRECTORY ...) command.	2016-08-29 10:59:37 +02:00
Gael Guennebaud	d937a420a2	Fix compilation with MSVC by using our portable numext::log1p implementation.	2016-08-22 15:44:21 +02:00
Igor Babuschkin	59bacfe520	Fix compilation on CUDA 8 by removing call to h2log1p	2016-08-15 23:38:05 +01:00
Igor Babuschkin	aee693ac52	Add log1p support for CUDA and half floats	2016-08-08 20:24:59 +01:00
Benoit Steiner	fe778427f2	Fixed the constructors of the new half_base class.	2016-08-04 18:32:26 -07:00
Benoit Steiner	9506343349	Fixed the isnan, isfinite and isinf operations on GPU	2016-08-04 17:25:53 -07:00
Gael Guennebaud	17b9a55d98	Move Eigen::half_impl::half to Eigen::half while preserving the free functions to the Eigen::half_impl namespace together with ADL	2016-08-04 00:00:43 +02:00
Benoit Steiner	02fe89f5ef	half implementation has been moved to half_impl namespace	2016-07-29 15:09:34 -07:00
Christoph Hertzberg	c5b893f434	bug #1266 : half implementation has been moved to half_impl namespace	2016-07-29 18:36:08 +02:00
Gael Guennebaud	395c835f4b	Fix CUDA compilation	2016-07-22 15:30:24 +02:00
Gael Guennebaud	47afc9a365	More cleaning in half: - put its definition and functions in its own half_impl namespace such that the free function does not pollute the Eigen namespace while still making them visible for half through ADL. - expose Eigen::half through a using statement - move operator<< from std to half_float namespace	2016-07-22 14:33:28 +02:00
Gael Guennebaud	0f350a8b7e	Fix CUDA compilation	2016-07-21 18:47:07 +02:00

176 Commits