Testing version 3.9 -- same 84M exponent started from checkpoint -- it is slower than version 3.6.
Timing went from 3.80 to 4.16 ms/it FFT 4608K |
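For a sense of scale, the timings above can be turned into a relative slowdown and a rough total run time. This is only a back-of-the-envelope sketch: it assumes a PRP test needs about one iteration per bit of the exponent, and it treats "84M" as a round 84,000,000. The function names are made up for illustration and are not part of gpuOwl.

```python
def slowdown_percent(old_ms: float, new_ms: float) -> float:
    """Relative increase in per-iteration time, in percent."""
    return (new_ms - old_ms) / old_ms * 100.0


def estimated_hours(exponent: int, ms_per_iteration: float) -> float:
    """Rough wall-clock hours, assuming about one iteration per exponent bit."""
    return exponent * ms_per_iteration / 1000.0 / 3600.0


if __name__ == "__main__":
    exponent = 84_000_000  # assumption: "84M" taken as a round figure
    print(f"slowdown: {slowdown_percent(3.80, 4.16):.1f}%")
    print(f"3.6 estimate: {estimated_hours(exponent, 3.80):.1f} h")
    print(f"3.9 estimate: {estimated_hours(exponent, 4.16):.1f} h")
```

Under those assumptions the difference is roughly 9.5% per iteration, or about 8 hours over a full test.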
[QUOTE=SELROC;494927]Testing version 3.9 -- same 84M exponent started from checkpoint -- it is slower than version 3.6.
Timing went from 3.80 to 4.16 ms/it FFT 4608K[/QUOTE] Acknowledged. I'll fix soon and post here a note when it's done. |
Mr Preda,
Do you use OpenCL 2.0 in gpuOwl, or is it coded to version 1.2? Thanks |
[QUOTE=preda;494930]Acknowledged. I'll fix soon and post here a note when it's done.[/QUOTE]
Started new 86M exponents, with both 3.6 and 3.9 versions. The default FFT 4608K is too small: I encountered errors and retries at around 20% (it is 18.x bits/word). I have now restarted the computation with -fft +1, which is 5120K, and all seems to proceed. |
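The "18.x bits/word" figure can be reproduced with a quick calculation. This is a minimal sketch, assuming bits/word is simply the exponent divided by the FFT length and that "K" means 1024 words; `bits_per_word` is a hypothetical helper, not a gpuOwl function.

```python
def bits_per_word(exponent: int, fft_k: int) -> float:
    """Average bits stored per FFT word (assumes "K" = 1024 words)."""
    return exponent / (fft_k * 1024)


if __name__ == "__main__":
    exponent = 86_000_000  # assumption: "86M" taken as a round figure
    for fft_k in (4608, 5120):
        print(f"FFT {fft_k}K: {bits_per_word(exponent, fft_k):.2f} bits/word")
```

With these assumptions, 4608K gives a little over 18.2 bits/word, while 5120K brings it down to about 16.4.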
[QUOTE=SELROC;495737]Started new 86M exponents, with both 3.6 and 3.9 versions.
The default FFT 4608K is too small: I encountered errors and retries at around 20% (it is 18.x bits/word). I have now restarted the computation with -fft +1, which is 5120K, and all seems to proceed.[/QUOTE] I had no issue using 4608K FFT for 86.5M exponents on version 3.8 on my AMD GPU. |
[QUOTE=xx005fs;495772]I had no issue using 4608K FFT for 86.5M exponents on version 3.8 on my AMD GPU.[/QUOTE]
There is a point beyond which the FFT becomes too small, and it lies somewhere above 86.5M. |
I just ran some PRP tests via gpuOwl to completion and got results that look like this:
{"exponent":86524913, "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"3.9-OpenCL"}, "timestamp":"2018-09-09 08:15:59 UTC", "computer":"Ellesmere-36x1266-@9:0.0", "aid":"C6C41CB2A726CF8BF01B808ED6BAF3A1", "residue-type":1, "fft-length":"5120K", "res64":"c373d39ff37389ec", "errors":{"gerbicz":0}}
What was my result, anyway? Does the status "C" mean it is composite? How do I reformat this to submit to PrimeNet? Thanks |
[QUOTE=tServo;495780]I just ran some PRP tests via gpuOwl to completion and got results that look like this:
{"exponent":86524913, "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"3.9-OpenCL"}, "timestamp":"2018-09-09 08:15:59 UTC", "computer":"Ellesmere-36x1266-@9:0.0", "aid":"C6C41CB2A726CF8BF01B808ED6BAF3A1", "residue-type":1, "fft-length":"5120K", "res64":"c373d39ff37389ec", "errors":{"gerbicz":0}}
What was my result, anyway? Does the status "C" mean it is composite? How do I reformat this to submit to PrimeNet? Thanks[/QUOTE] Yes, "C" is composite and "P" is prime. The res64 would also have a particular form for a prime; I think for residue "type 1" the res64 of a prime would be 1. Submit the JSON as is (e.g. by cut-and-paste or file upload) on the manual results page; it parses that syntax. |
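To read such a result line at a glance, it can be parsed as ordinary JSON. The sketch below uses only the fields visible in the result posted above and the "C"/"P" meanings given in the reply; `summarize` and `STATUS_MEANING` are names invented for this example, and the code says nothing about what PrimeNet itself does with the line.

```python
import json

# The result line exactly as posted above.
RESULT_LINE = (
    '{"exponent":86524913, "worktype":"PRP-3", "status":"C", '
    '"program":{"name":"gpuowl", "version":"3.9-OpenCL"}, '
    '"timestamp":"2018-09-09 08:15:59 UTC", '
    '"computer":"Ellesmere-36x1266-@9:0.0", '
    '"aid":"C6C41CB2A726CF8BF01B808ED6BAF3A1", "residue-type":1, '
    '"fft-length":"5120K", "res64":"c373d39ff37389ec", '
    '"errors":{"gerbicz":0}}'
)

# Status letters as explained in the reply: "C" composite, "P" prime.
STATUS_MEANING = {"C": "composite", "P": "prime"}


def summarize(result_line: str) -> str:
    """Return a one-line human-readable summary of a result line."""
    result = json.loads(result_line)
    meaning = STATUS_MEANING.get(result["status"], "unknown status")
    return (
        f'exponent {result["exponent"]}: {meaning}, '
        f'res64 {result["res64"]}, '
        f'Gerbicz errors {result["errors"]["gerbicz"]}'
    )


if __name__ == "__main__":
    print(summarize(RESULT_LINE))
```

This is only for reading the result locally; for submission the unmodified JSON line is what gets pasted or uploaded.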
[QUOTE=tServo;495709]Mr Preda,
Do you use OpenCL 2.0 in gpuOwl, or is it coded to version 1.2? Thanks[/QUOTE] It's mostly OpenCL 1.x, but it does use some aspects of the "memory model" that were introduced in OpenCL 2.0. These relate to memory synchronization between groups and are used mostly in the kernel carryFused(), I think. It was plain 1.x not so long ago, but that stopped working on ROCm (because of the memory model), so I switched to requiring 2.x at that point. |
[QUOTE=preda;494930]Acknowledged. I'll fix soon and post here a note when it's done.[/QUOTE]
I moved to the new ROCm 1.9, and I can't reproduce this on my GPUs (Vega & FuryX). I don't know if the slowdown is still affecting 580. Let's keep an eye on it in the future. |
[QUOTE=preda;496442]I moved to the new ROCm 1.9, and I can't reproduce this on my GPUs (Vega & FuryX). I don't know if the slowdown is still affecting 580. Let's keep an eye on it in the future.[/QUOTE]
Hello Mihai, I am testing version 4.1 from yesterday, and yes, it is still slower by around 0.30 ms/it. Performance is better in version 3.6, which should help you track down the changes that caused the slowdown.