* normalize left Jacobi rotations to avoid having to swap rows
* set precision to 2*machine_epsilon instead of machine_epsilon: we lose 1 bit of precision
but gain between 10% and 100% in speed, and reduce the risk of someday hitting a bad matrix
for which it is impossible to approach machine precision
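
To illustrate the role of the precision threshold, here is a minimal sketch of a one-sided (Hestenes) Jacobi iteration in pure Python. It is not the code touched by this change, and it uses a different Jacobi variant with no separate left rotations. The function name `jacobi_singular_values` and the parameters `tol` and `max_sweeps` are hypothetical and chosen only for this example. The default `tol` is 2*machine_epsilon, as in the second item above.

```python
import math
import sys

EPS = sys.float_info.epsilon  # machine epsilon for double precision


def jacobi_singular_values(a, tol=2 * EPS, max_sweeps=60):
    """One-sided Jacobi sketch; returns (singular values, sweeps used).

    `tol` and `max_sweeps` are illustrative parameters, not taken from
    the change described above.
    """
    m, n = len(a), len(a[0])
    # Store the matrix as a list of columns: each rotation mixes two columns.
    cols = [[a[i][j] for i in range(m)] for j in range(n)]
    for sweep in range(1, max_sweeps + 1):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(x * x for x in cols[p])
                beta = sum(x * x for x in cols[q])
                gamma = sum(x * y for x, y in zip(cols[p], cols[q]))
                # Treat the pair as orthogonal once its cosine is below tol.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation that makes columns p and q orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for i in range(m):
                    x, y = cols[p][i], cols[q][i]
                    cols[p][i] = c * x - s * y
                    cols[q][i] = s * x + c * y
        if not rotated:
            break  # a full sweep without rotations: converged to within tol
    sigma = sorted((math.sqrt(sum(x * x for x in col)) for col in cols), reverse=True)
    return sigma, sweep
```

Calling `jacobi_singular_values(matrix, tol=EPS)` and `jacobi_singular_values(matrix, tol=2 * EPS)` on the same input and comparing the returned sweep counts gives a rough feel for the trade-off. The speed figures quoted above come from the original change, not from this sketch.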