**in a row**, so you'd think you'd spend too much time converting back and forth.

However, as it turns out, absolutely no conversions are needed! The initial random parameters are supposed to be uniform over [1, N), so there's absolutely no reason to convert them - the result would be another uniform random variable over [1, N). So we'll just generate random parameters as usual, and treat them as if they were of the form

*. Then in the algorithm you replace all multiplications with Montgomery multiplication.*

**aR mod N**I don't know exactly how the end result is still correct, but it I think just so happens to be because you are working with

*and R and N are coprime, thus not affecting the algorithm logic which checks*

**aR mod N***.*

**gcd(aR mod N, N) = 1**Either way, here's some code implementing this (sorry, only GCC on x86-64 because I need inline assembly): http://coliru.stacked-crooked.com/a/f57f11426d06acd8

This trick made the implementation 4 times faster.