![]() |
[quote=TheJudger;228207]Which variant do you prefer?[/quote]
The latter. Windows console windows are 80 chars wide by default, and the current style lines are slightly longer (at least for me). |
I like the new style.
|
[QUOTE=TheJudger;228094]
Find attached mfaktc 0.11. :smile: [/QUOTE] Oliver - I took a look at your sieving code and there might be a small change that will help improve performance. I notice the following: within your sieve_init_class you are calculating ii and jj within the loop. [CODE] ii=(2ULL * (exp%p) * (k_start%p))%p; jj=(2ULL * (exp%p) * (NUM_CLASSES%p))%p; [/CODE] It does not appear that these values will change once they have been calculated. If you calculate them just before the loop you should save time with the repeated calculations - of which some of them are mods. Just an observation - do with it what you will. Grant. |
Hi Grant,
[QUOTE=gjmccrac;228263]Oliver - I took a look at your sieving code and there might be a small change that will help improve performance. I notice the following: within your sieve_init_class you are calculating ii and jj within the loop. [CODE] ii=(2ULL * (exp%p) * (k_start%p))%p; jj=(2ULL * (exp%p) * (NUM_CLASSES%p))%p; [/CODE] It does not appear that these values will change once they have been calculated. If you calculate them just before the loop you should save time with the repeated calculations - of which some of them are mods. Just an observation - do with it what you will. Grant.[/QUOTE] The p changes on every iteration of the loop (first line within the loop): [CODE]p=primes[i];[/CODE] I didn't spent much time on optimizing sieve_init_class(), it is only called once per class and doesn't take much time. The performance-critical function is sieve_candidates() which might (due to lack of documentation :rolleyes:) no so easy to understand. Oliver |
[QUOTE=TheJudger;228265]
The p changes on every iteration of the loop (first line within the loop): [CODE]p=primes[i];[/CODE] [/QUOTE] I missed that :blush: |
There seem to be an awful lot of modulo operations in here:[code] ii=(2ULL * (exp%p) * (k_start%p))%p;
jj=(2ULL * (exp%p) * (NUM_CLASSES%p))%p;[/code]Would it be quicker to store 2ULL * (exp%p) in another variable? Do you really need to take those values modulo p in every step? Surely doing it just once at the end would eat fewer cycles. Assuming of course that 2ULL * exp * (k_start|NUM_CLASSES) will fit in whatever data type you're using to hold it.[code] t=2ULL * exp; ii=(t * k_start)%p; jj=(t * NUM_CLASSES)%p;[/code]Or does the compiler take care of stuff like this? Edit: If only p changes in the loop, then you could actually do this instead:[code] ii=2ULL * exp * k_start; jj=2ULL * exp * NUM_CLASSES; while(whocares){ ii_loop=ii%p; jj_loop=jj%p; }[/code] |
Hi,
[QUOTE=lavalamp;228330]Do you really need to take those values modulo p in every step? Surely doing it just once at the end would eat fewer cycles. Assuming of course that 2ULL * exp * (k_start|NUM_CLASSES) will fit in whatever data type you're using to hold it.[/QUOTE] thats the point, it doesn't fit. exp goes up to the biggest prime below 2^32 and k_start is allready a 64bit integer which goes up to ~2^63. I could save the modulo on NUM_CLASSES... I won't spent much time optimizing on this function, if you make it 10 times faster you won't recognize it on a regular run. Oliver P.S. If you can tell me how to speedup sieve_candidates() I would be very happy! |
[QUOTE=TheJudger;228342]Hi,
thats the point, it doesn't fit. exp goes up to the biggest prime below 2^32 and k_start is allready a 64bit integer which goes up to ~2^63. I could save the modulo on NUM_CLASSES... I won't spent much time optimizing on this function, if you make it 10 times faster you won't recognize it on a regular run. Oliver P.S. If you can tell me how to speedup sieve_candidates() I would be very happy![/QUOTE] I have an idea... :smile: Did you already check how I implemented the sieve on Factor5? Luigi |
[QUOTE=TheJudger;228342]P.S. If you can tell me how to speedup sieve_candidates() I would be very happy![/QUOTE]I've downloaded the code, but not yet dug into it.
Even when I do, don't hold your breath waiting for any improvements I might bring to it. |
Hi Luigi,
[QUOTE=ET_;228349]I have an idea... :smile: Did you already check how I implemented the sieve on Factor5? Luigi[/QUOTE] please give me a hint, I've taken a look on your sieve code. You split the factor candidates into classes (mod 60 which is 2 * 2 * 3 * 5), I do the same except that I'm using classes mod 420 or 4620 (2 * 2 * 3 * 5 * 7 (* 11)). This removes multiples of 2, 3, 5 (, 7(, 11)) and candidates which are not +/- 1 mod 8 totally out of the sieve. You're using segments for sieving, your segment size is 272272 (16 * 7 * 11 * 13 * 17), I'm doing the same, my segments are N * 11 * 13 * 17 * 19 (for mod 420 classes) or N * 13 * 17 * 19 * 23 (for mod 4620 classes) where N is choosen in that way that the segment size fits into to L1-cache of the CPU (SIEVE_SIZE_LIMIT in params.h). This allows presieved segments (presieve up to 17 in your implementation and 19 or 23 in my code). Differences: you need a whole 32bit integers per factor candidate while I use a single bit per candidate in the sieve. [B] Unless I've missed something[/B] our both sieve implementations are basically the same. Oliver |
[QUOTE=TheJudger;228711]Hi Luigi,
[B] Unless I've missed something[/B] our both sieve implementations are basically the same. Oliver[/QUOTE] No way, yours is more memory-effective! :smile: My question was about "sieving" itself. I didn't have the time to look in deep at your code. The whole process of sieving after the initialization consists in clearing bits, and testing those that were not cleared. Am I correct? Luigi |
| All times are UTC. The time now is 22:55. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.