[QUOTE=kriesel;498379]I saw that 5-6 ms/it stated in post 730.
V4.6 is what percentage faster, than what other version, v3.8? Or alternately, what is the timing per iteration for one of them, and which version is that for. (On an RX580, or what gpu model?)[/QUOTE] I posted the figures from the program output, so it should be easy to calculate the percentage. The first figure, 5-6 ms/it faster, was an early reading of the program output; 5-8 ms/it faster was read later on. I confirm, now and forever, that my hardware is a Radeon RX580. |
[QUOTE=SELROC;498408]I posted the figures from the program output, so it should be easy to calculate the percentage.
The first figure, 5-6 ms/it faster, was an early reading of the program output; 5-8 ms/it faster was read later on. I confirm, now and forever, that my hardware is a Radeon RX580.[/QUOTE]
Ok, so the timings are:
v4.0: 4.8 ms/it
v4.6: 4.4 ms/it
with FFT 5120K.
I should probably report ms/it with more decimals, but it is clear that we are splitting hairs here, and results may vary from system to system. |
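For readers who want the percentage kriesel asked about, here is a minimal sketch that derives it from the two timings given in the post above (4.8 ms/it for v4.0 and 4.4 ms/it for v4.6). The function name `speedup` is only illustrative, and the result is only as precise as the one-decimal timings it starts from.

```python
def speedup(old_ms, new_ms):
    """Return (time saved per iteration in %, throughput gain in %)."""
    time_saved_pct = (old_ms - new_ms) / old_ms * 100
    throughput_gain_pct = (old_ms / new_ms - 1) * 100
    return time_saved_pct, throughput_gain_pct


# Timings quoted in the post: v4.0 = 4.8 ms/it, v4.6 = 4.4 ms/it (FFT 5120K).
saved, gain = speedup(4.8, 4.4)
print(f"time per iteration reduced by {saved:.1f}%")
print(f"iterations per second increased by {gain:.1f}%")
```

The two percentages differ because one is relative to the old time and the other to the new time, so it helps to say which one is meant when comparing versions.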
[QUOTE=kriesel;498379]I saw that 5-6 ms/it stated in post 730.
V4.6 is what percentage faster, than what other version, v3.8? Or alternately, what is the timing per iteration for one of them, and which version is that for. (On an RX580, or what gpu model?)[/QUOTE]
Large number test:
[CODE]2018-10-22 10:56:04 3 gpuowl 4.6--mod
2018-10-22 10:56:04 3 FFT 73728K: Width 2048 (256x8), Height 2048 (256x8), Middle 9; 13.25 bits/word
2018-10-22 10:56:04 3 Note: using long carry kernels
2018-10-22 10:56:04 3 Ellesmere-36x1360-@6:0.0 Radeon RX 580 Series
2018-10-22 10:56:05 3 OpenCL compilation in 1020 ms, with "-DEXP=1000000001u -DWIDTH=2048u -DSMALL_HEIGHT=2048u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
2018-10-22 10:56:10 3 PRP M(1000000001), FFT 73728K, 13.25 bits/word, B1 0
2018-10-22 10:56:51 3 OK loaded: 0/1000000001, B1 0, blockSize 400, 0000000000000003 (expected 0000000000000003)
2018-10-22 10:56:51 3 Selected 0 P-1 trial points
2018-10-22 10:58:21 3 OK 800/1000000001 [ 0.00%], 70.93 ms/it [70.87, 70.99]; ETA 820d 22:05; b70bd0429f585b7f (check 33.15s)
2018-10-22 11:06:52 3 Stopping, please wait..
2018-10-22 11:07:25 3 OK 8000/1000000001 [ 0.00%], 71.02 ms/it [70.96, 71.70]; ETA 822d 00:15; e8cbaa94ad3015eb (check 33.45s)
2018-10-22 11:07:25 3 Starting GCD over 0 points
2018-10-22 11:07:28 3 Waiting for GCD to finish..
2018-10-22 11:07:28 3 Exiting because "stop requested"
2018-10-22 11:07:28 3 Bye[/CODE] |
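As a sanity check on the log above, the following sketch recomputes the bits/word figure and the ETA from the other numbers in the same log. It assumes that "73728K" means 73728 × 1024 words and that the ETA is simply remaining iterations times ms/it; neither is confirmed by the thread, although both reproduce the logged values to within rounding. The observation that 2 × Width × Height × Middle equals the FFT size is likewise just something the numbers happen to satisfy here.

```python
exponent = 1_000_000_001          # from "PRP M(1000000001)"
fft_words = 73728 * 1024          # assumption: "73728K" means 73728 * 1024 words
width, height, middle = 2048, 2048, 9

# Average number of exponent bits stored per FFT word.
bits_per_word = exponent / fft_words

# Observation only: the logged dimensions multiply out to the FFT size.
dims_words = 2 * width * height * middle


def eta_days(ms_per_it, iterations_done):
    """Remaining time in days, assuming a constant ms/it (an assumption)."""
    remaining = exponent - iterations_done
    return remaining * ms_per_it / 1000 / 86400


print(f"bits/word = {bits_per_word:.2f}")
print(f"dims match FFT size: {dims_words == fft_words}")
print(f"ETA at 800 iterations  = {eta_days(70.93, 800):.1f} days")
print(f"ETA at 8000 iterations = {eta_days(71.02, 8000):.1f} days")
```

Because the logged ms/it is rounded to two decimals, the recomputed ETA can differ from the logged one by an hour or so.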
[QUOTE=SELROC;498468]Large number test:
[CODE]2018-10-22 10:56:04 3 gpuowl 4.6--mod
2018-10-22 10:56:04 3 FFT 73728K: Width 2048 (256x8), Height 2048 (256x8), Middle 9; 13.25 bits/word
2018-10-22 10:56:04 3 Note: using long carry kernels
2018-10-22 10:56:04 3 Ellesmere-36x1360-@6:0.0 Radeon RX 580 Series
2018-10-22 10:56:05 3 OpenCL compilation in 1020 ms, with "-DEXP=1000000001u -DWIDTH=2048u -DSMALL_HEIGHT=2048u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
2018-10-22 10:56:10 3 PRP M(1000000001), FFT 73728K, 13.25 bits/word, B1 0
2018-10-22 10:56:51 3 OK loaded: 0/1000000001, B1 0, blockSize 400, 0000000000000003 (expected 0000000000000003)
2018-10-22 10:56:51 3 Selected 0 P-1 trial points
2018-10-22 10:58:21 3 OK 800/1000000001 [ 0.00%], 70.93 ms/it [70.87, 70.99]; ETA 820d 22:05; b70bd0429f585b7f (check 33.15s)
2018-10-22 11:06:52 3 Stopping, please wait..
2018-10-22 11:07:25 3 OK 8000/1000000001 [ 0.00%], 71.02 ms/it [70.96, 71.70]; ETA 822d 00:15; e8cbaa94ad3015eb (check 33.45s)
2018-10-22 11:07:25 3 Starting GCD over 0 points
2018-10-22 11:07:28 3 Waiting for GCD to finish..
2018-10-22 11:07:28 3 Exiting because "stop requested"
2018-10-22 11:07:28 3 Bye[/CODE][/QUOTE]
M(1000000001) does not exist, as 1000000001 = 7x11x13x19x52579 is not prime. It looks like the program does not test the primality of the exponent of the "Mersenne" number... |
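The factorization given above is easy to confirm with plain trial division, which is quick for a number of this size. This is a minimal sketch for checking the arithmetic only; the function name `factorize` is illustrative and has nothing to do with gpuowl itself.

```python
def factorize(n):
    """Return the prime factors of n in ascending order (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever is left after removing all factors up to sqrt(n) is prime.
        factors.append(n)
    return factors


print(factorize(1_000_000_001))
```

Once the four small factors are divided out, only divisors up to about 229 need to be tried against the remaining cofactor, so the loop finishes almost immediately.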
[QUOTE=ET_;498480]M(1000000001) does not exist, as 1000000001 = 7x11x13x19x52579 is not prime.
It looks like the program does not test the primality of the exponent of the "Mersenne" number...[/QUOTE] I said it is a test with an arbitrary large number, but well, you should ask Preda about that. |
[QUOTE=SELROC;498487]I said it is a test with an arbitrary large number, but well, you should ask Preda about that.[/QUOTE]
I was. :smile: Your test was legit. |
[QUOTE=ET_;498489]I was. :smile:
Your test was legit.[/QUOTE]
Ok, there are a couple of bugs that have persisted across several versions:
1. FFT selection sometimes picks an FFT size too small for the exponent.
2. GpuOwl output with -h does not show the program version. To find out which version an executable is, we have to start a computation just to see the version number. |
[QUOTE=SELROC;498493]Ok, there are a couple of bugs that have persisted across several versions:
1. FFT selection sometimes picks an FFT size too small for the exponent.
2. GpuOwl output with -h does not show the program version. To find out which version an executable is, we have to start a computation just to see the version number.[/QUOTE]
2. I'll fix -h to show the version; yes, it's a good suggestion.
1. FFT selection: to help me fix it, could you please tell me which exponents produce an FFT that is too small (or share any other data you have)? |
[QUOTE=ET_;498480]M(1000000001) does not exist, as 1000000001 = 7x11x13x19x52579 is not prime.
It looks like the program does not test the primality of the exponent of the "Mersenne" number...[/QUOTE] I could add a test for the primality of the exponent... but is that really needed? I mean, usually the user would obtain the exponent from some source, and then it's prime. Otherwise, if somebody just wants to do a test (with a random non-prime value), why not? I think there are some asserts that would trigger on an even exponent. |
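If such a check were wanted, it would be cheap compared with the PRP test itself. The sketch below is a hypothetical pre-check, not anything from gpuowl: it uses a Miller-Rabin test with the first twelve primes as bases, a base set known to be deterministic for all inputs below 2^64, which covers any exponent discussed here. The name `is_prime` is illustrative.

```python
def is_prime(n):
    """Deterministic Miller-Rabin primality test, valid for n < 2**64."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    # Quick exit on small prime factors (also handles n equal to a base).
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


for exponent in (1_000_000_001, 1_000_000_007):
    status = "prime" if is_prime(exponent) else "not prime"
    print(exponent, status)
```

A caller could use this to print a warning rather than refuse to run, which would keep the "test with an arbitrary number" use case available.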
[QUOTE=kriesel;498401]exponents 335M and below that were tried had nonzero ghz-day totals indicated, and so nonzero ghz-day/day rates computed.[/QUOTE]
I'm removing the ghz-day/day display; that zero was probably an intermediary step. |
[QUOTE=kriesel;498400][CODE]ken@condorella MINGW64 ~/gpuowl-compile/v4.6
$ g++ -std=c++17 -DREV=\"bb691cb\" -O2 -c Worktodo.cpp Result.cpp common.cpp gpuowl.cpp Gpu.cpp clwrap.cpp Task.cpp checkpoint.cpp timeutil.cpp Args.cpp GCD.cpp Primes.cpp Stats.cpp state.cpp -lOpenCL -lgmp -pthread
Gpu.cpp:19:28: error: static assertion failed: size long
 static_assert(sizeof(long) == 8, "size long");
               ~~~~~~~~~~~~~^~~~
[/CODE][/QUOTE]
OK, I think I just removed that assert (and the reason for it) in a recent commit. |
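Some background on why that assert fails under MINGW64: 64-bit Windows toolchains use the LLP64 data model, where the C type `long` is 4 bytes, while 64-bit Linux uses LP64, where it is 8 bytes. The short sketch below reports what the local platform's C ABI uses, via Python's standard `ctypes` module; it only illustrates the size difference and says nothing about how gpuowl itself was changed.

```python
import ctypes

# Size of the platform's C "long": 4 on LLP64 (64-bit Windows), 8 on LP64 (64-bit Linux).
long_bytes = ctypes.sizeof(ctypes.c_long)

# A fixed-width 64-bit integer type has the same size on every platform.
int64_bytes = ctypes.sizeof(ctypes.c_int64)

print(f"sizeof(long)    = {long_bytes}")
print(f"sizeof(int64_t) = {int64_bytes}")
```

Code that needs exactly 64 bits is usually more portable when written with a fixed-width type instead of `long`.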