[QUOTE=preda;494513]About 2 days. Depending on cooling/frequency, I get between 2.05 - 2.15 ms/it.[/QUOTE]
On an RX 580 it takes a little over 4 days, at 3.50-4.00 ms/it. |
[QUOTE=SELROC;494514]On an RX 580 it takes a little over 4 days, at 3.50-4.00 ms/it.[/QUOTE]
Did you try the FFT variants, e.g. passing "-fft +1" or "-fft +2", to see which is fastest? Edit: Never mind, at the wavefront the default FFT should be the fastest. |
[QUOTE=preda;494516]Did you try the FFT variants, e.g. passing "-fft +1" or "-fft +2", to see which is fastest?[/QUOTE]
Not yet. But it is coming. I am currently fighting amdgpu errors, and unfortunately the latest version of the amdgpu driver (18.30) does not install on Debian. I wrote to the AMD community in the hope of getting help: [URL]https://community.amd.com/thread/206833[/URL] (my messages may not appear yet, as they are being moderated). I have also tried ROCm on Ubuntu 18.04, but apparently it does not support my hardware. |
[QUOTE=SELROC;494519]Not yet. But it is coming.[/QUOTE]
Without the -fft option:
[CODE]FFT 4608K: Width 512 (64x8), Height 512 (64x8), Middle 9; 17.94 bits/word
Note: using short carry kernels
Ellesmere-36x1360-@2:0.0 Radeon RX 580 Series
OpenCL compilation in 2989 ms, with "-DEXP=84674323u -DWIDTH=512u -DSMALL_HEIGHT=512u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
[2018-08-23 15:25:02 CEST] PRP M(84674323), FFT 4608K, 17.94 bits/word
OK loaded: 72354800/84674323, blockSize 400, c05550b6947ecc38
OK initial check: c05550b6947ecc38
OK 2018-08-23 15:25:13 0 72355600/84674323 [85.45%], 3.84 ms/it [3.83, 3.85]; ETA 0d 13:09; 361312417c909cfa (check 2.37s) (saved)
Stopping, please wait..
OK 2018-08-23 15:28:08 0 72400000/84674323 [85.50%], 3.90 ms/it [3.89, 3.97]; ETA 0d 13:17; 553db60311ce3007 (check 2.33s) (saved)
Bye
[/CODE]
With -fft +1:
[CODE]gpuowl-OpenCL 3.6--mod
FFT 5120K: Width 1024 (256x4), Height 512 (64x8), Middle 5; 16.15 bits/word
Note: using short carry kernels
Ellesmere-36x1360-@2:0.0 Radeon RX 580 Series
OpenCL compilation in 1005 ms, with "-DEXP=84674323u -DWIDTH=1024u -DSMALL_HEIGHT=512u -DMIDDLE=5u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
[2018-08-23 15:28:48 CEST] PRP M(84674323), FFT 5120K, 16.15 bits/word
OK loaded: 72400000/84674323, blockSize 400, 553db60311ce3007
OK initial check: 553db60311ce3007
OK 2018-08-23 15:29:01 0 72400800/84674323 [85.50%], 4.29 ms/it [4.28, 4.30]; ETA 0d 14:38; 16610202fac94885 (check 2.56s) (saved)
Stopping, please wait..
OK 2018-08-23 15:29:07 0 72401600/84674323 [85.51%], 4.42 ms/it [4.31, 4.53]; ETA 0d 15:03; 64ae36294a2a5e46 (check 2.56s) (saved)
Bye
[/CODE]
With -fft +2:
[CODE]gpuowl-OpenCL 3.6--mod
FFT 4096K: Width 4096 (512x8), Height 512 (64x8); 20.19 bits/word
FFT size too small for exponent (20.19 bits/word).

gpuowl-OpenCL 3.6--mod
FFT 5120K: Width 512 (64x8), Height 1024 (256x4), Middle 5; 16.15 bits/word
Note: using short carry kernels
Ellesmere-36x1360-@2:0.0 Radeon RX 580 Series
OpenCL compilation in 879 ms, with "-DEXP=84674323u -DWIDTH=512u -DSMALL_HEIGHT=1024u -DMIDDLE=5u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
[2018-08-23 15:30:15 CEST] PRP M(84674323), FFT 5120K, 16.15 bits/word
OK loaded: 72401600/84674323, blockSize 400, 64ae36294a2a5e46
OK initial check: 64ae36294a2a5e46
OK 2018-08-23 15:30:28 0 72402400/84674323 [85.51%], 4.71 ms/it [4.70, 4.72]; ETA 0d 16:04; 29cd7a69cd8234dd (check 2.72s) (saved)
Stopping, please wait..
OK 2018-08-23 15:31:58 0 72420800/84674323 [85.53%], 4.74 ms/it [4.73, 4.75]; ETA 0d 16:08; 6f0303e6cb8819b6 (check 2.71s) (saved)
Bye[/CODE]
With -fft -1 gpuowl aborts, bits/word > 20 |
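The bits/word figures in these logs appear to be simply the exponent divided by the FFT length in words, which would explain why the 4096K FFT is rejected for this exponent. A minimal sketch follows, assuming "K" means 1024 words; the helper name `bits_per_word` is hypothetical, and the exact limit gpuowl applies is not stated in the thread beyond "bits/word > 20" being refused.

```python
def bits_per_word(exponent: int, fft_k: int) -> float:
    """Average number of bits stored per FFT word.

    Assumption: fft_k is the FFT size in units of 1024 words.
    """
    return exponent / (fft_k * 1024)


if __name__ == "__main__":
    exponent = 84674323  # from the logs in this post
    for fft_k in (4096, 4608, 5120):
        print(f"FFT {fft_k}K: {bits_per_word(exponent, fft_k):.2f} bits/word")
```

For the three FFT sizes in the logs this reproduces the reported 20.19, 17.94 and 16.15 bits/word. The smallest FFT that still fits (4608K) was also the fastest of the three runs here, at 3.84 ms/it versus 4.29 and 4.71 ms/it for the two 5120K layouts.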
Valerio, about TF not working on amdgpu-pro: I wouldn't worry that much, a bit more or less TF is unlikely to make any significant difference. In the meantime you can either use mfakto if you want to deepen the TF, or better just skip the TF altogether and jump right into PRP :)
|
[QUOTE=preda;494556]Valerio, about TF not working on amdgpu-pro: I wouldn't worry that much, a bit more or less TF is unlikely to make any significant difference. In the meantime you can either use mfakto if you want to deepen the TF, or better just skip the TF altogether and jump right into PRP :)[/QUOTE]
I do appreciate the speed of my GPUs with mfakto. In a few days I returned something like 500 results and found more than 10 factors. On my CPU (dual-core, hyperthreaded) a trial factoring run takes a few hours, while mfakto completes one in approx. 13 minutes. |
PRP speed
[QUOTE=preda;494513]About 2 days. Depending on cooling/frequency, I get between 2.05 - 2.15 ms/it.[/QUOTE]
I use a utility called amdcovc on Linux to raise the memory clock, which increased the performance of my Vega 56 quite a lot: from 2.15 ms/it at stock (with the Vega 64 liquid BIOS) to about 1.9 ms/it. |
[QUOTE=xx005fs;494626]I use a utility called amdcovc on Linux to raise the memory clock, which increased the performance of my Vega 56 quite a lot: from 2.15 ms/it at stock (with the Vega 64 liquid BIOS) to about 1.9 ms/it.[/QUOTE]
Nice! I'll try it out. One good thing about PRP is that it reliably detects if one pushes the overclock too far. |
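The thread does not spell out how the detection works. The "blockSize" and "check" entries in the gpuowl logs are consistent with a Gerbicz-style error check, so here is a toy sketch of that idea under that assumption. It is not gpuowl's code: the function name, the tiny modulus and the simulated fault are all hypothetical, and a real implementation runs the check far less often than once per block.

```python
def prp_squarings(n, iterations, block, fault_at=None):
    """Repeatedly square 3 mod n, verifying each block of squarings.

    Returns the final residue, or None if a check detects an error.
    fault_at (hypothetical) corrupts the residue at that iteration to
    simulate an unstable overclock.
    """
    u = 3 % n
    d = u  # running product of the residues at block boundaries
    for i in range(1, iterations + 1):
        u = u * u % n
        if i == fault_at:
            u = (u + 1) % n  # simulated hardware error
        if i % block == 0:
            d_prev, d = d, d * u % n
            # With correct arithmetic, d == 3 * d_prev^(2^block) (mod n).
            if d != 3 * pow(d_prev, 1 << block, n) % n:
                return None
    return u


if __name__ == "__main__":
    n = 2**31 - 1  # small Mersenne prime, for illustration only
    print(prp_squarings(n, 32, 8))               # clean run: a residue
    print(prp_squarings(n, 32, 8, fault_at=16))  # corrupted run: None
```

The check is independent of the main squaring chain, which is why a wrong residue caused by an unstable clock shows up as a mismatch rather than silently propagating to the final result.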
[QUOTE=preda;494635]Nice! I'll try it out. One good thing about PRP is that it reliably detects if one pushes the overclock too far.[/QUOTE]
The problem with amdcovc is that it currently has no voltage control, so I run GpuOwL on Windows instead: with WattMan I can tune my Vega GPU, and after tuning it is far more efficient there than at stock voltage on Linux. |
GpuOwL wrong speed reporting???
I also recently noticed that on my system, whenever at least one CPU thread is running at 100%, the Vega GPU's reported speed improves, usually by 0.02 to 0.06 ms/it. For example, when running prime95 simultaneously with GpuOwL, the timing dropped from 2.00 ms/it to 1.94 ms/it. I know the difference is minute, but does it have to do with the GPU architecture, or is it just a bug in the software? Also, would this be resolved if I turned on some sort of system timer like HPET?
|
[QUOTE=xx005fs;494659]I also recently noticed that on my system, whenever at least one CPU thread is running at 100%, the Vega GPU's reported speed improves, usually by 0.02 to 0.06 ms/it. For example, when running prime95 simultaneously with GpuOwL, the timing dropped from 2.00 ms/it to 1.94 ms/it. I know the difference is minute, but does it have to do with the GPU architecture, or is it just a bug in the software? Also, would this be resolved if I turned on some sort of system timer like HPET?[/QUOTE]
Yes, I saw the same thing. IMO it looks like a problem in the GPU driver area. It may be that the transition from some CPU sleep state to active is slower when the CPU is not busy. |