Antonio Sanchez | f47472603b | Add missing header for GPU tests. | 2023-01-09 11:21:13 -08:00
Charles Schlosser | 81172cbdcb | Overhaul Sparse Core | 2023-01-07 22:09:42 +00:00
Robin Miquel | 9255181891 | Modified spbenchsolver help message because it could be misunderstood | 2023-01-07 21:35:46 +00:00
Chip Kerchner | d20fe21ae4 | Improve performance for Power10 MMA bfloat16 GEMM | 2023-01-06 23:08:37 +00:00
Ryan Senanayake | fe7f527787 | Fix guard macros for emulated FP16 operators on GPU | 2023-01-06 22:02:51 +00:00
Rasmus Munk Larsen | b8422c99cd | Update file jacobisvd.cpp | 2023-01-06 21:14:17 +00:00
Antonio Sánchez | 262194f12c | Fix a bunch of minor build and test issues. | 2023-01-06 16:37:26 +00:00
Antonio Sánchez | 3564668908 | Fix overalign check. | 2023-01-05 17:10:48 +00:00
Charles Schlosser | f3929ac7ed | Fix EIGEN_HAS_CXX17_OVERALIGN for icc | 2023-01-03 17:30:10 +00:00
LAI Bruce | 1b33a6374b | Fixes git add . doesn't include scripts/buildtests.in | 2023-01-03 17:06:36 +00:00
Charles Schlosser | a8bab0d8ae | Patch SparseLU | 2022-12-31 04:52:36 +00:00
Antonio Sánchez | 910f6f65d0 | Adjust thresholds for bfloat16 product tests that are currently failing. | 2022-12-28 19:32:25 +00:00
Arthur | 311cc0f9cc | Enable NEON pcmp, plset, and complex psqrt | 2022-12-22 05:38:34 +00:00
Antonio Sánchez | dbf7ae6f9b | Fix up C++ version detection macros and cmake tests. | 2022-12-20 18:06:03 +00:00
Antonio Sánchez | bb6675caf7 | Fix incorrect NEON native fp16 multiplication. | 2022-12-19 20:46:44 +00:00
Rasmus Munk Larsen | dd85d26946 | Revert "Avoid mixing types in CompressedStorage.h" | 2022-12-19 20:09:37 +00:00
Arthur Feeney | c4fb6af24b | Enable NEON pabs for unsigned int types | 2022-12-19 17:07:36 +00:00
Rasmus Munk Larsen | 400bc5cd5b | Add sparse_basic_1 to smoke tests. | 2022-12-16 22:03:33 +00:00
Rasmus Munk Larsen | 04e4f0bb24 | Add missing colon in SparseMatrix.h. | 2022-12-16 21:50:00 +00:00
Rasmus Munk Larsen | 3d8a8def8a | Avoid mixing types in CompressedStorage.h | 2022-12-16 20:11:02 +00:00
Charles Schlosser | 4bb2446796 | Add operators to CompressedStorageIterator | 2022-12-16 16:48:50 +00:00
Rasmus Munk Larsen | e1aee4ab39 | Update test of numext::signbit. | 2022-12-15 19:58:59 +00:00
Rasmus Munk Larsen | 3717854a21 | Use numext::signbit instead of std::signbit, which is not defined for bfloat16. | 2022-12-15 18:41:46 +00:00
Alexander Richardson | 37de432907 | Avoid using std::raise() for divide by zero | 2022-12-14 20:06:16 +00:00
Alexander Richardson | 62de593c40 | Allow std::initializer_list constructors in constexpr expressions | 2022-12-14 17:05:37 +00:00
Charles Schlosser | 6d3e3678b4 | optimize equalspace packetop | 2022-12-13 01:22:25 +00:00
Charles Schlosser | 2004831941 | add EqualSpaced / setEqualSpaced | 2022-12-13 00:54:57 +00:00
Melven Roehrig-Zoellner | 273f803846 | Add BDCSVD_LAPACKE binding | 2022-12-09 18:50:12 +00:00
Antonio Sánchez | 03c9b4738c | Enable direct access for NestByValue. | 2022-12-07 18:21:45 +00:00
Chip Kerchner | b59f18b4f7 | Increase L2 and L3 cache size for Power10. | 2022-12-07 18:20:33 +00:00
Antonio Sánchez | c614b2bbd3 | Fix index type for sparse index sorting. | 2022-12-06 00:02:31 +00:00
Charles Schlosser | 44fe539150 | add sparse sort inner vectors function | 2022-12-01 19:28:56 +00:00
Lianhuang Li | d194167149 | Fix the bug using neon instruction fmla for data type half | 2022-12-01 17:28:57 +00:00
Pedro Caldeira | 31ab62d347 | Add support for Power10 (AltiVec) MMA instructions for bfloat16. | 2022-11-30 23:33:37 +00:00
Antonio Sánchez | dcb042a87d | Fix serialization for non-compressed matrices. | 2022-11-30 18:16:47 +00:00
Antonio Sánchez | 2260e11eb0 | Fix reshape strides when input has non-zero inner stride. | 2022-11-29 19:39:29 +00:00
Alexandre Hoffmann | 23524ab6fc | Changing BiCGSTAB parameters initialization so that it works with custom types | 2022-11-29 19:37:46 +00:00
Antonio Sánchez | ab2b26fbc2 | Fix sparseLU solver when destination has a non-unit stride. | 2022-11-29 19:37:03 +00:00
Antonio Sánchez | 551eebc8ca | Add synchronize method to all devices. | 2022-11-29 19:35:02 +00:00
Charles Schlosser | b7551bff92 | Fix a bunch of annoying compiler warnings in tests | 2022-11-21 20:07:19 +00:00
Antonio Sánchez | e7b1ad0315 | Add serialization for sparse matrix and sparse vector. | 2022-11-21 19:43:07 +00:00
Charles Schlosser | 044f3f6234 | Fix bug in handmade_aligned_realloc | 2022-11-18 22:35:31 +00:00
Chris | 6728683938 | Small cleanup of IDRS.h | 2022-11-16 13:51:23 +00:00
Charles Schlosser | 02805bd56c | Fix AVX2 psignbit | 2022-11-16 13:43:11 +00:00
Chip Kerchner | 399ce1ed63 | Fix duplicate execution code for Power 8 Altivec in pstore_partial. | 2022-11-16 13:41:42 +00:00
Gabriele Buondonno | 6431dfdb50 | Cross product for vectors of size 2. Fixes #1037 | 2022-11-15 22:39:42 +00:00
Antonio Sánchez | 8588d8c74b | Correct pnegate for floating-point zero. | 2022-11-15 18:07:23 +00:00
Antonio Sanchez | 5eacb9e117 | Put brackets around unsigned type names. | 2022-11-15 09:09:45 -08:00
Antonio Sánchez | 37e40dca85 | Fix ambiguity in PPC for vec_splats call. | 2022-11-14 18:58:16 +00:00
Antonio Sánchez | 7dc6db75d4 | Fix typo in CholmodSupport | 2022-11-08 23:49:56 +00:00