[Maxima] Fast conversion to bfloat?
toy.raymond at gmail.com
Wed Dec 9 11:14:17 CST 2009
R Fateman wrote:
> I'm pretty sure that one can easily compute the number of extra bits
> needed to do this all in floating point.
> If the target precision (total) is N, and we need to compute 10^k to
> precision N, we would use about log(k) multiplications.
> So we would get epsilon*k maximum roundoff error, where epsilon is one
> unit in last place.
> Carry that many digits: log(k).
I've done some further tests using log2(k) extra bits of precision.
This seems to work and produces the same results as before, provided
everything is computed with the extra precision and only the final
result is rounded to the target precision. Also, it seems to work better
to compute 10^(-k) as 1/10^k. Without that, I get a 1-bit difference in
the tests I've tried. I've currently set up the code so that the fast
conversion is switchable and only applies to exponents that are larger
in magnitude than some threshold.
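
The scheme described above can be sketched roughly as follows. This is only
an illustration in Python, not the Maxima code: exact rationals stand in for
bfloats, and the names round_to_bits and pow10_fast are hypothetical. It
assumes binary powering for 10^|k|, about log2(|k|) guard bits, a single
final rounding to the target precision, and 10^(-k) formed as 1/10^k.

```python
from fractions import Fraction


def round_to_bits(x, p):
    """Round a positive Fraction x to p significant bits (ties to even)."""
    # Initial guess for the exponent e with 2^(p-1) <= x / 2^e < 2^p.
    e = x.numerator.bit_length() - x.denominator.bit_length() - p
    scaled = x / Fraction(2) ** e
    # The guess is off by at most one step; nudge it into range.
    while scaled >= 2 ** p:
        scaled /= 2
        e += 1
    while scaled < 2 ** (p - 1):
        scaled *= 2
        e -= 1
    # round() on a Fraction returns an int, rounding ties to even.
    return round(scaled) * Fraction(2) ** e


def pow10_fast(k, p, extra=None):
    """Approximate 10^k to p bits, working with guard bits throughout."""
    n = abs(k)
    if extra is None:
        # Assumption: about log2(|k|) guard bits, as suggested in the thread.
        extra = max(n, 1).bit_length()
    wp = p + extra  # working precision
    result = Fraction(1)
    base = Fraction(10)
    # Binary powering: about log2(|k|) squarings, each rounded to wp bits.
    while n:
        if n & 1:
            result = round_to_bits(result * base, wp)
        n >>= 1
        if n:
            base = round_to_bits(base * base, wp)
    if k < 0:
        # Form 10^(-k) as 1/10^k rather than powering 1/10 directly.
        result = round_to_bits(1 / result, wp)
    # Only the final result is rounded to the target precision.
    return round_to_bits(result, p)
```

To compare against a correctly rounded value, one can compute
round_to_bits(Fraction(10) ** k, p) exactly and check whether it matches
pow10_fast(k, p). Passing extra=0 gives a way to see what the guard bits
contribute.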
After a bit more testing, I'll check in this code.