Antonio Sanchez 070d303d56 Add CUDA complex sqrt.
This adds support for scalar `sqrt` of complex numbers (`std::complex<T>`) on
device, as requested by the TensorFlow folks.

Technically `std::complex` is not supported by NVCC on device
(though it is by clang), so the default `sqrt(std::complex<T>)` function only
works on the host. Here we create an overload to add back the
functionality.
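
As an illustration only, a device-callable complex square root can be sketched as below. This is not the code added by this commit: the names `SKETCH_DEVICE_FUNC` and `sketch_complex_sqrt` are hypothetical, and the formula is the usual numerically stable one built from `hypot`. It assumes the `std::` math functions used here resolve in device code, and that `real()`, `imag()` and the `std::complex` constructor, being constexpr, may be called from device code once relaxed constexpr is enabled.

```cpp
#include <cmath>
#include <complex>

// Hypothetical macro: mark the function host+device only when compiling as CUDA.
#if defined(__CUDACC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

// Sketch of sqrt(x + iy) using only real-valued math functions.
// Assumption: finite inputs; inf/NaN handling is deliberately left out.
template <typename T>
SKETCH_DEVICE_FUNC std::complex<T> sketch_complex_sqrt(const std::complex<T>& z) {
  // real()/imag() are constexpr members, hence the relaxed-constexpr requirement.
  const T x = z.real();
  const T y = z.imag();

  // sqrt(0) is 0; returning early also avoids 0/0 below. Keep the sign of y.
  if (x == T(0) && y == T(0)) {
    return std::complex<T>(T(0), y);
  }

  // w = sqrt((|x| + |z|) / 2), the larger-magnitude component of the result.
  const T w = std::sqrt(T(0.5) * (std::abs(x) + std::hypot(x, y)));

  if (x >= T(0)) {
    // Right half-plane: real part is w, imaginary part is y / (2w).
    return std::complex<T>(w, y / (T(2) * w));
  }
  // Left half-plane: imaginary part carries w, with the sign of y.
  return std::complex<T>(std::abs(y) / (T(2) * w), std::copysign(w, y));
}
```

For example, `sketch_complex_sqrt(std::complex<double>(3.0, 4.0))` takes the right half-plane branch with `w = 2` and yields `2 + 1i`.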

Also modified the CMake file to add the `--relaxed-constexpr` (or
equivalent) flag for NVCC, which allows calling constexpr functions from
device functions, and added support for specifying the compute architecture
for NVCC (this was already available for clang).
2020-12-22 23:25:23 -08:00