Mihai, there is a way to reduce that number of bits, but the multiple becomes such big, that would be impossible to use, as you will need soooOOO many more squarings, you gain nothing. For example, instead of computing b^e (with odd e), you can compute b^(2^ord(2,e))/b. What's in parenthesis is a power of 2, so it has no additional "pops", and the resulted power (after division) is a multiple of e (by definition of ord). But what you get there is HUUGE, way beyond b^(2^p2) that you compute for PRP, haha.
