sr2sieve 1.7.6
This version adds the optimisation from 1.7.5 to non-asm and PPC64 builds. It also has a tweak to the 32-bit SSE2 build that improves the Pentium 4 times a little more.
Note that the changes in 1.7.5 might cause a slowdown if there are fewer than 4 sequences in the sieve, and 8-16 sequences might be needed before the gains become noticeable. |
Using the 17k SoB-PSP .dat I get a marginal improvement, 2-4 % perhaps, on my dualcore Opteron using 1.7.5-linux-x86-64 @ p=1.36T
|
Overall performance increases on Core2 and Core/Pentium M architectures regardless of OS used. The only performance hit I've noticed was on A64 platform, where 1.6.18 is 12% faster than 1.7.6 on 12k sieve (each k<300) @ p>20T. Sieve is running on 32-bit WinXP Pro.
|
[QUOTE=Cruelty;123352]The only performance hit I've noticed was on A64 platform, where 1.6.18 is 12% faster than 1.7.6 on 12k sieve (each k<300) @ p>20T. Sieve is running on 32-bit WinXP Pro.[/QUOTE]
Can you check that this is not due to other processes running in the background? Version 1.7.x doesn't have CPU-time reporting yet (due to complications with multithreading), so the times it reports will be lower if something else is using the CPU. You can check by running version 1.6.x with the -e switch for comparison. If that is not the problem and you have time, then it could also be useful to see whether the slowdown occurs for the non-SSE2 code: run both versions with the --no-sse2 switch. However, I don't really want to spend a lot of time optimising the 32-bit code for 64-bit machines, so there might not be an improvement for this case anytime soon. |
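The difference between elapsed time and CPU time described above can be illustrated with a small sketch. This is not how sr2sieve measures anything; `busy_work` and `measure` are hypothetical names, and the loop is only a stand-in for a CPU-bound sieve. On an idle machine the two figures should be close; when other processes compete for the CPU, the elapsed time grows while the CPU time stays roughly the same.

```python
import time

def busy_work(iterations):
    """Hypothetical stand-in for a sieve loop: pure CPU-bound arithmetic."""
    total = 0
    for i in range(iterations):
        total += (i * i) % 7
    return total

def measure(func, *args):
    """Return (result, wall_seconds, cpu_seconds) for one call of func."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time charged to this process only
    result = func(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return result, wall_used, cpu_used

result, wall, cpu = measure(busy_work, 200_000)
print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```

A rate computed from elapsed time is therefore only comparable between runs if the machine was otherwise idle in both.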
[quote=geoff;123452]...
However, I don't really want to spend a lot of time optimising the 32-bit code for 64-bit machines, so there might not be an improvement for this case anytime soon.[/quote] Are you saying that the 64-bit code actually will run on 32-bit machines? If so, then would that mean I could run 64-bit sieving, and thus benefit from the huge speedup on my Core 2 Duo, which is 64-bit compatible, but is running 32-bit Ubuntu? |
[QUOTE=Anonymous;123471]Are you saying that the 64-bit code actually will run on 32-bit machines? If so, then would that mean I could run 64-bit sieving, and thus benefit from the huge speedup on my Core 2 Duo, which is 64-bit compatible, but is running 32-bit Ubuntu?[/QUOTE]
No, I am saying that while 32-bit code can run on 64-bit machines, it would be much better for the user to install a 64-bit OS and run the 64-bit code. Optimising the SSE2 code for one machine type often makes it slower on others, and it is better that the 32-bit code be optimised for those machines that don't have the option of running a 64-bit OS. |
[url=http://www.geocities.com/g_w_reynolds/sr5sieve/tests/powmod-bench.zip]Here[/url] are Linux and Windows executables to benchmark the four different 32-bit x86 powmod functions used in sr2sieve 1.7.6.
I am hoping to get some idea of the relative speed of the functions on different SSE2-capable machines. It would be a help if anyone with access to Athlon 64, Pentium M, Core 1, or late model (post-Prescott) Pentium 4 could run the program on an otherwise idle machine and report the output.

For bit level i-j the figures indicate average clock cycles per bit to compute b^n (mod p) with 2^i < n,p < 2^j.

Here are results for my machines:

Pentium 4 (Northwood C)
[code]
 Bit   scalar  scalar  vector  vector
level   i386    sse2    i386    sse2
-----  ------  ------  ------  ------
58-59   190.4   128.1    89.8    37.6
59-60   190.3   127.8    90.0    37.6
60-61   191.0   127.8    90.3    37.6
61-62   193.5   127.2    94.1    37.4
[/code]
Core 2 Duo (Conroe E6750)
[code]
 Bit   scalar  scalar  vector  vector
level   i386    sse2    i386    sse2
-----  ------  ------  ------  ------
58-59    60.1    57.2    29.8    20.8
59-60    60.3    57.2    29.8    20.8
60-61    60.6    57.1    30.5    20.8
61-62    61.9    57.1    32.3    20.7
[/code] |
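For readers unfamiliar with the operation benchmarked above, here is a minimal sketch of computing b^n (mod p) by left-to-right binary exponentiation. It is only an illustration of the general technique, not the sr2sieve code, which uses hand-tuned x86 routines. The names `powmod` and `estimated_cycles` are hypothetical, and the cycle estimate assumes that "cycles per bit" refers to bits of the exponent n.

```python
def powmod(b, n, p):
    """Left-to-right binary exponentiation: return b**n mod p for n >= 0."""
    result = 1
    b %= p
    for bit in bin(n)[2:]:                 # exponent bits, most significant first
        result = (result * result) % p     # one squaring per exponent bit
        if bit == "1":
            result = (result * b) % p      # extra multiply when the bit is set
    return result % p

def estimated_cycles(cycles_per_bit, n):
    """Rough total cycles for one powmod, assuming the figure is per exponent bit."""
    return cycles_per_bit * n.bit_length()

p = (1 << 61) - 1          # a 61-bit prime modulus, chosen only for the example
n = (1 << 60) + 12345      # a 61-bit exponent
b = 5
print(powmod(b, n, p) == pow(b, n, p))   # cross-check against Python's built-in
```

The loop performs one modular squaring per exponent bit plus a multiply for each set bit, which is why a per-bit cycle count is a natural way to compare implementations.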
[quote=geoff;123609]No, I am saying that while 32-bit code can run on 64-bit machines, it would be much better for the user to install a 64-bit OS and run the 64-bit code. Optimising the SSE2 code for one machine type often makes it slower on others, and it is better that the 32-bit code be optimised for those machines that don't have the option of running a 64-bit OS.[/quote]
Maybe I phrased my question wrong. Actually, what I was wondering is, can I run 64-bit sr(x)sieve on a 32-bit OS as long as I have a 64-bit-compatible CPU? :smile: |
[QUOTE=Anonymous;123617]Maybe I phrased my question wrong. Actually, what I was wondering is, can I run 64-bit sr(x)sieve on a 32-bit OS as long as I have a 64-bit-compatible CPU? :smile:[/QUOTE]No : you need to have a 64 bits capable CPU to run a 64 bits OS. And you need a 64 bits OS to run 64 bits programs.
Jacob |
[quote=S485122;123622]No : you need to have a 64 bits capable CPU to run a 64 bits OS. And you need a 64 bits OS to run 64 bits programs.
Jacob[/quote] I have a 64-bit capable CPU (Core 2 Duo) but a 32-bit OS, so I guess that means I'd be able to run 64 bit programs, but only if I upgrade to a 64-bit OS first. Okay, I get it now. Thanks! :smile: |
Both of these are running [B][U][COLOR="Red"]64-bit Linux[/COLOR][/U][/B]. I suppose the timings are still useful even though I'm not running a 32-bit OS, since you said the code was 32-bit.
[code]
model name : Dual Core AMD Opteron(tm) Processor 165
stepping   : 2
cpu MHz    : 2250.017
cache size : 1024 KB

 Bit   scalar  scalar  vector  vector
level   i386    sse2    i386    sse2
-----  ------  ------  ------  ------
58-59    65.6    67.3    34.7    31.1
59-60    67.3    66.1    34.5    31.2
60-61    67.5    66.1    34.8    31.2
61-62    68.1    66.1    35.5    31.1

model name : AMD Athlon(tm) 64 Processor 3000+
stepping   : 0
cpu MHz    : 2430.686
cache size : 512 KB

 Bit   scalar  scalar  vector  vector
level   i386    sse2    i386    sse2
-----  ------  ------  ------  ------
58-59    67.3    66.2    34.6    31.2
59-60    67.3    66.2    34.6    31.1
60-61    67.5    66.4    34.9    31.2
61-62    68.1    66.3    35.5    31.1
[/code] |