Mirror of https://gitlab.com/libeigen/eigen.git (synced 2025-05-24 05:27:35 +08:00)

This seems to be the recommended approach for type punning in CUDA. See, for example:

- https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union
- https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/ (the latter puns a double to an `int2`)

The issue is that for CUDA the `memcpy` is not elided, and it ends up being an expensive operation. We already have similar `reinterpret_cast`s across the Eigen codebase for GPU (as does TensorFlow).
Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
For more information go to http://eigen.tuxfamily.org/.
For pull requests, bug reports, and feature requests, go to https://gitlab.com/libeigen/eigen.
Languages: C++ 85.1%, Fortran 8.5%, C 2.8%, CMake 1.9%, CUDA 1.2%, Other 0.4%