## Costs of arithmetic over various rings.

In the field of Real numbers, we can evaluate a rational function (in fact, any algebraic function) to*n*bits in

*O(n log n log log n)*time. In the ring

*Z_{2^n}*, we can likewise evaluate a rational function in

*O(n log n log log n)*time. In the field

*Z_p*, where

*p*is an

*n*-bit prime, however, the best we can do is

*O(n (log n)^2 log log n)*-- slower by a factor of

*log n*. Why?

The obvious answer is that the multiplicative inverse -- the hard step
in computing a rational function -- can be computed via a Newton
iteration (also known as p-adic lifting and Hensel lifting) over the
Real numbers and *Z_{2^n}*, whereas inversion modulo a prime
requires use of a GCD algorithm in some form. Inversion modulo a large
prime, rational reconstruction (over the Real numbers or over
*Z_m*), and computing elementary functions are all problems which
"live" in the domain of continued fractions instead of polynomials, and
they pay the price of an extra factor of *log n* for that
generality.

I'm not altogether satisfied with this explanation; I get the feeling
that there must be something more subtle involved. In the case of
computing algebraic functions, note that this is easy over the Real
numbers but hard modulo a large composite -- in fact, RSA depends upon
this -- but of greater significance, even determining if a root exists
is hard. The same applies somewhat to inversion: You can determine if
an inverse exists over the Real numbers trivially (is the value
non-zero?), over *Z_{2^n}* trivially (is the value odd?), but over
*Z_m* you need to determine relative primality. That said,
determining if an inverse exists modulo a prime is also trivial, so
this is not merely a case of "solving a problem is at least as hard as
determining if a solution exists", but my intuition suggests that
there's something along those lines involved.

While this is a rather indistinct train of thought, it comes out of a
very concrete problem. I'm looking at algorithms for arithmetic of
RSA-sized values as part of my work towards a side
channel free cryptographic library, and there's a curious dilemma:
Using the FFT, arithmetic modulo *2^n + 1* is twice as fast as
arithmetic modulo *2^n*, so I'd prefer to use the first ring when
performing Montgomery multiplication; but the initial setup requires
computing an inverse over the ring, which is much faster for the second
ring. I think the best approach might be to work with one ring for
private key operations and the other for public key operations, due to
the difference in the relative importance of overhead costs.