Old 2020-03-07, 00:39   #4
kuratkull
 
 
Mar 2007
Estonia

10000110₂ Posts

Yes, I also tried it on larger N's today, and the current version seems to fall further and further behind LLR64 as N grows. I'm not sure why, because during development I saw it beat LLR64 in some cases. The weird thing is that the core loop basically just calls the square function in GWnum, and that call takes 99.99% of the time in the loop. I am looking into it with the goal of at least matching the speed of LLR64, which should be doable because the core loop of LLR64 is exactly the same.
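For anyone who wants to see why one squaring call dominates, here is a minimal sketch of this kind of loop. It is not RPT's or LLR64's code: it uses plain Python integers instead of GWnum, it only covers the k = 1 case (the classic Lucas-Lehmer test), and the function name lucas_lehmer is made up for illustration.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime, assuming p is an odd prime.

    Illustrative only: real testers do the squaring with FFT-based
    multiplication (e.g. GWnum), not with Python's built-in integers.
    """
    n = (1 << p) - 1          # the number under test, N = 2^p - 1
    s = 4                     # standard starting value for k = 1
    for _ in range(p - 2):    # p - 2 iterations in total
        # One big modular squaring per iteration; the "- 2" and the
        # loop bookkeeping are negligible next to it.
        s = (s * s - 2) % n
    return s == 0


if __name__ == "__main__":
    for p in (7, 11, 13):
        print(p, lucas_lehmer(p))
```

Each pass through the loop is one squaring of a number as large as N, so the cost per iteration is set almost entirely by how fast the squaring routine is.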

Though you might be surprised to see that LLR64 is also slower on smaller N's in threaded mode (I mean the same smallish N, for example 1*2^859433-1, is slower multithreaded than single-threaded, on both RPT and LLR64). I just discovered this today while benchmarking the slowness issue. This is probably because small N's spend too much CPU time on context switching. Larger N's should benefit from multithreading, while small N's don't.


PS! N = k*2^n-1 (by the size of N I mean the number of digits in that number)
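To put a number on "small" and "large", the digit count of N = k*2^n-1 can be estimated with logarithms, without building the number itself. This is just a sketch; the function name decimal_digits is made up, and it ignores the trailing -1, which only changes the answer when k*2^n is an exact power of ten.

```python
import math


def decimal_digits(k: int, n: int) -> int:
    """Approximate number of decimal digits in k*2^n - 1.

    Assumption: k*2^n is not an exact power of ten, so subtracting 1
    does not reduce the digit count. Uses floating-point logs, which
    is accurate enough for exponents in the millions.
    """
    return math.floor(math.log10(k) + n * math.log10(2)) + 1


if __name__ == "__main__":
    print(decimal_digits(1, 859433))
```

For the 1*2^859433-1 example above this gives 258716 digits.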

Last fiddled with by kuratkull on 2020-03-07 at 00:54