Different FFT sizes
I have been running two instances at 5632K. Now I have a new set of assignments at 6144K. Running two different FFT sizes has two effects: the smaller one runs faster and the larger one runs much slower. It is more efficient to run one instance. Will this imbalance be redressed when I have equal 6144K assignments?
|
[QUOTE=paulunderwood;544616]I have been running two instances at 5632K. Now I have a new set of assignments at 6144K. Running two different FFT sizes has two effects: the smaller one runs faster and the larger one runs much slower. It is more efficient to run one instance. Will this imbalance be redressed when I have equal 6144K assignments?[/QUOTE]
One would presume so. BTW, the latest commit supports exponents up to 106.6M in the 5.5M FFT. |
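For context, the exponent ceiling for a given FFT size comes down to how many bits of the Mersenne number each FFT word can safely carry. A rough sanity-check sketch (the bits-per-word figure is derived from the 106.6M / 5.5M numbers quoted above; gpuowl's actual limits come from its own error analysis, so treat this as illustrative only):

```python
# Rough check of exponent-vs-FFT-size arithmetic.
# ASSUMPTION: this simple bits-per-word average is illustrative;
# gpuowl's real per-FFT limits are set by round-off error analysis.

def bits_per_word(exponent: int, fft_words: int) -> float:
    """Average bits of the Mersenne number stored in each FFT word."""
    return exponent / fft_words

M = 1024 * 1024
for label, words in [("5.5M", int(5.5 * M)), ("6M", 6 * M)]:
    bpw = bits_per_word(106_600_000, words)
    print(f"{label} FFT: {bpw:.2f} bits/word for a 106.6M exponent")
```

With the quoted 106.6M limit, the 5.5M FFT works out to about 18.5 bits/word, which is why slightly larger exponents get bumped up to the next FFT length.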
[QUOTE=Prime95;544617]One would presume so. BTW, the latest commit supports exponents up to 106.6M in the 5.5M FFT.[/QUOTE]
[CODE]./gpuowl
./gpuowl: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by ./gpuowl)[/CODE] I don't have GLIBCXX_3.4.26 on my Debian Buster -- is there a work-around? |
[QUOTE=paulunderwood;544620][CODE]./gpuowl
./gpuowl: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by ./gpuowl)[/CODE] I don't have GLIBCXX_3.4.26 on my Debian Buster -- is there a work-around?[/QUOTE] I have two compilers: I think I installed gcc 9 manually to build from source, and gcc 8 is the native one. Anyway, for my purposes I hard-wired g++-8 into the makefile and all is hunky-dory now. |
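For anyone hitting the same error: a binary built with a newer g++ links against libstdc++ symbols the native Debian Buster library doesn't provide. A sketch of the diagnosis and workaround (the library path is the Debian one from the error message; whether the makefile honours `CXX` is an assumption -- otherwise edit the compiler variable in the makefile directly, as above):

```shell
# List the newest GLIBCXX versions the system libstdc++ actually provides
strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep '^GLIBCXX' | sort -V | tail -n 3

# If GLIBCXX_3.4.26 is missing, rebuild with the distro's native compiler
# instead of a manually installed newer gcc (assuming the makefile uses CXX):
make CXX=g++-8
```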
[QUOTE=kriesel;544376]Quick update / recap on that[/QUOTE]
For what it's worth, this matching LL DC with shift 0 was performed in gpuowl v6.11-264 on the RX480 in the same system as was previously having frequent GEC errors on rx550s. [url]https://www.mersenne.org/report_exponent/?exp_lo=55487503&full=1[/url] |
[QUOTE=paulunderwood;544616]I have been running two instances at 5632K. Now I have a new set of assignments at 6144K. Running two different FFT sizes has two effects: the smaller one runs faster and the larger one runs much slower. It is more efficient to run one instance. Will this imbalance be redressed when I have equal 6144K assignments?[/QUOTE]
Paul, what expo(s) are you running that need 6144K? I'd be interested to see your per-job timings for the following 2-job setups:

1. Both @5632K;
2. One each @5632K and @6144K;
3. Both @6144K.

Presumably you already know the timings for the first 2 ... you could temporarily move one of your queued-up 6144K assignments to the top of the worktodo file for the current 5632K run to get both @6144K.

If the slowdown for 2 compared to 1 and 3 really is as bad as you describe, I wonder if it's something to do with context-switching on the GPU between tasks that have different memory mappings: 2 jobs at the same FFT length have different run data and e.g. DWT weights, but have the same memory profile and GPU resource usage.

[b]Edit:[/b] I tried the above three 2-job scenarios on my own Radeon7, using expos ~107M to trigger the 6M FFT length. Here are the per-iteration timings:

1. Both @5632K: 1470 us/iter for each, total throughput 1360 iter/sec;
2. One each @5632K, @6144K: 1530, 1546 us/iter resp., total throughput 1300 iter/sec;
3. Both @6144K: 1615 us/iter for each, total throughput 1238 iter/sec.

So no anomalous slowdowns for me at any of these combos, and the per-iteration timings hew very closely to what one would expect based on n*log(n) per-autosquaring scaling. |
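The n*log(n) scaling claim above is easy to sanity-check numerically. A sketch comparing the predicted 6144K/5632K cost ratio against the measured per-iteration timings (the timings are the ones quoted above; that per-iteration cost scales as n*log2(n) is the model being assumed):

```python
import math

# ASSUMED model: per-iteration cost of an FFT autosquaring ~ n * log2(n)
def cost(n_words: int) -> float:
    return n_words * math.log2(n_words)

n_small = 5632 * 1024   # 5632K FFT, in words
n_large = 6144 * 1024   # 6144K FFT, in words

predicted = cost(n_large) / cost(n_small)
measured = 1615 / 1470   # us/iter from the two both-jobs-same-size runs

print(f"predicted ratio: {predicted:.4f}")
print(f"measured  ratio: {measured:.4f}")
```

The model predicts a ratio of about 1.097 and the Radeon7 measurements give about 1.099, so the timings really do track the expected scaling closely.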
[QUOTE=ewmayer;544664]Paul, what expo(s) are you running that need 6144K? I'd be interested to see your per-job timings for the following 2-job setups:

1. Both @5632K;
2. One each @5632K and @6144K;
3. Both @6144K.

Presumably you already know the timings for the first 2 ... you could temporarily move one of your queued-up 6144K assignments to the top of the worktodo file for the current 5632K run to get both @6144K. If the slowdown for 2 compared to 1 and 3 really is as bad as you describe, I wonder if it's something to do with context-switching on the GPU between tasks that have different memory mappings: 2 jobs at the same FFT length have different run data and e.g. DWT weights, but have the same memory profile and GPU resource usage.

[b]Edit:[/b] I tried the above three 2-job scenarios on my own Radeon7, using expos ~107M to trigger the 6M FFT length. Here are the per-iteration timings:

1. Both @5632K: 1470 us/iter for each, total throughput 1360 iter/sec;
2. One each @5632K, @6144K: 1530, 1546 us/iter resp., total throughput 1300 iter/sec;
3. Both @6144K: 1615 us/iter for each, total throughput 1238 iter/sec.

So no anomalous slowdowns for me at any of these combos, and the per-iteration timings hew very closely to what one would expect based on n*log(n) per-autosquaring scaling.[/QUOTE]

1. Both @5632K ---> 1489us/it each
2. One each @5632K and @6144K ---> the latter was ~2300us/it (very slow); the former 1125us/it

At the moment (with the latest commit) they are running at ~1200us/it (103.9M) and ~1800us/it (104.9M). They were running at the average earlier, until I restarted them. It is my last 103.9M exponent. |
Was ~1200us/it (103.9M) and ~1800us/it (104.9M).
Now 1440us/it each -- both at 104.9M. |
PM1 Result not understood
It seems that primenet won't be able to understand factored PM1 results out of GPUOWL:
[CODE]{"status":"F", "exponent":"98141611", "worktype":"PM1", "B1":"750000", "B2":"15000000", "fft-length":"5767168", "factors":"["****"]", "program":{"name":"gpuowl", "version":"v6.11-258-gb92cdfd"}, "computer":"TITAN V-0", "aid":"******", "timestamp":"2020-05-06 07:29:29 UTC"}[/CODE] |
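The likely culprit is the factors field: wrapping the JSON array in an extra pair of quotes makes the whole report invalid JSON, so any standard parser on the server side would reject it. A sketch demonstrating the failure (the factor value is a placeholder, and the "fixed" form is my assumption of what was intended, not gpuowl's documented format):

```python
import json

# As reported above: the factors array is wrapped in extra quotes
bad = '{"status":"F", "exponent":"98141611", "factors":"["12345"]"}'
try:
    json.loads(bad)
except json.JSONDecodeError as e:
    print("invalid JSON:", e)

# Presumably intended: a bare JSON array of factor strings
good = '{"status":"F", "exponent":"98141611", "factors":["12345"]}'
print(json.loads(good)["factors"])
```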
[QUOTE=paulunderwood;544778]Was ~1200us/it (103.9M) and ~1800us/it (104.9M).
Now 1440us/it each -- both at 104.9M.[/QUOTE]"us"? Usually it is capitalised as "US", but that is not a unit (AFAIK). Or do you (and the preceding posters) mean µs?
Jacob |
[QUOTE=S485122;544825]"us" ? Usually it is capitalised as "US", but it is not a unit (AFAIK.) Or do you (and preceding posters) mean µs ?
Jacob[/QUOTE] Yes, I meant µs. But how do I generate a mu with the keyboard easily? nvm: I found [URL="https://www.maketecheasier.com/quickly-type-special-characters-linux/"]this on how to do it in Gnome[/URL] without having to remember and use Unicode code points. Thanks for prompting me! |
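For reference, the character in question is the micro sign at code point U+00B5, which is distinct from the Greek small letter mu at U+03BC; with the IBus input method (the default on many GNOME setups) you can type it via Ctrl+Shift+U followed by the hex code. A quick check of the two code points:

```python
# Micro sign vs. Greek small letter mu -- two different code points
micro = "\u00b5"   # µ, the SI "micro" prefix character
mu = "\u03bc"      # μ, Greek small letter mu

print(hex(ord(micro)))  # 0xb5
print(hex(ord(mu)))     # 0x3bc
print(micro == mu)      # False
```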