My guess would be drivers. (mfakto on my i5-4670k also uses a full CPU core, while on an i3-8100 it uses almost no resources.)
|
[QUOTE=kracker;559378]mfakto on my ... i3-8100[/QUOTE]Thanks for posting that -- my "other" computer has an i3-8100 and I didn't know it could run mfakto. :redface:
A free extra 21 GHz-d/d, thanks! :smile: |
Radeon VII - poor performance against promised numbers
It was probably discussed earlier, but can anyone tell me, why can't I squeeze more than 1300 GHz-D/D (2.6 TFLOPS) from Radeon VII, despite it promising 13.8 TFLOPS (about 10 % more than RTX 2080Ti) of SP throughput?
Also in gpuOwl, only about 0.75 TFLOPS despite AMD promising 3.46 TFLOPS. It seems to always be a fifth of what it should be according to the [URL="https://www.amd.com/en/products/graphics/amd-radeon-vii"]official AMD page[/URL]. |
[QUOTE=Viliam Furik;559728]It was probably discussed earlier, but can anyone tell me, why can't I squeeze more than 1300 GHz-D/D (2.6 TFLOPS) from Radeon VII, despite it promising 13.8 TFLOPS (about 10 % more than RTX 2080Ti) of SP throughput?[/QUOTE]I'm not sure how you're calculating FLOPS performance in mfakto/gpuowl?
The mfakto performance you quote seems to be in line with expected [url=https://www.mersenne.ca/mfaktc.php?filter=Radeon%20VII|RTX%202080%20Ti]mfakto performance[/url] (noting that my chart shows stock-clock performance). Also note AMD's somewhat deceptive practice of quoting "peak performance" numbers, meaning performance at the 1750 MHz boost clock rather than the 1400 MHz stock clock. The numbers on my pages are all stock-clock values. |
[QUOTE=Viliam Furik;559728]It was probably discussed earlier, but can anyone tell me, why can't I squeeze more than 1300 GHz-D/D (2.6 TFLOPS) from Radeon VII, despite it promising 13.8 TFLOPS (about 10 % more than RTX 2080Ti) of SP throughput?
Also in gpuOwl, only about 0.75 TFLOPS despite AMD promising 3.46 TFLOPS. It seems to always be a fifth of what it should be according to the [URL="https://www.amd.com/en/products/graphics/amd-radeon-vii"]official AMD page[/URL].[/QUOTE]If you're using mfakto as the measure of sp performance, note that mfakto & mfaktc use int32, not fp32. Also, the AMD spec sheet says "peak". Per Mihai, gpuowl is constrained by memory bandwidth, which would make it difficult to sustain anything close to the peak rate. [URL]https://www.amd.com/en/products/graphics/amd-radeon-vii[/URL] [URL]https://www.techpowerup.com/gpu-specs/radeon-vii.c3358[/URL] |
[QUOTE=kriesel;559734]If you're using mfakto as the measure of sp performance, note that mfakto & mfaktc use int32, not fp32[/QUOTE]
This! Radeon VII has poor integer performance. |
[QUOTE=James Heinrich;559730]I'm not sure how you're calculating FLOPS performance in mfakto/gpuowl?[/QUOTE]
He's using the standard PrimeNet conversion: 1 GHz-d/d = 2 GFLOPS. |
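For anyone who wants to redo that arithmetic, here is a minimal sketch. It assumes only the conversion quoted above (1 GHz-day/day = 2 GFLOPS) and the figures posted earlier in the thread; the function names are made up for illustration and are not part of mfakto, gpuowl, or PrimeNet.

```python
# Conversion quoted in the thread: 1 GHz-day/day = 2 GFLOPS.
GFLOPS_PER_GHZ_DAY_PER_DAY = 2.0


def ghz_days_per_day_to_tflops(ghz_days_per_day):
    """Convert a GHz-days/day throughput figure into TFLOPS (hypothetical helper)."""
    return ghz_days_per_day * GFLOPS_PER_GHZ_DAY_PER_DAY / 1000.0


def fraction_of_spec(measured_tflops, spec_tflops):
    """Return measured throughput as a fraction of the advertised figure."""
    return measured_tflops / spec_tflops


# Figures as posted in the thread.
mfakto_tflops = ghz_days_per_day_to_tflops(1300)    # mfakto on Radeon VII
mfakto_share = fraction_of_spec(mfakto_tflops, 13.8)  # vs. advertised SP peak
gpuowl_share = fraction_of_spec(0.75, 3.46)           # gpuOwl vs. advertised figure

print(f"mfakto: {mfakto_tflops:.2f} TFLOPS, {mfakto_share:.1%} of the advertised peak")
print(f"gpuOwl: {gpuowl_share:.1%} of the advertised peak")
```

Both ratios come out at roughly a fifth, which matches the original observation. As noted above, though, the mfakto figure is not a like-for-like comparison, because mfakto does int32 work rather than fp32.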
lsgpu
lsgpu is a small utility that lists the OpenCL platforms and the devices on them, with a brief description of each.
See [URL="https://www.mersenneforum.org/showpost.php?p=488474&postcount=6"]https://www.mersenneforum.org/showpo...74&postcount=6[/URL]. It could help with the occasional mfakto installation startup. |
After switching to Windows 10, I got mfakto to run on my ASRock Deskmini A300W, AMD A8-9600, 16GB DDR-4, SSD.
[CODE]
Selftest statistics
  number of tests           34026
  successful tests          34026

selftest PASSED!

mfakto 0.15pre7-MGW (64bit build)

OpenCL device info
  name                      Bristol Ridge (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 2.0 AMD-APP (2841.19) (2841.19)
  maximum threads per block 1024
  maximum threads per grid  1073741824
  number of multiprocessors 6 (384 compute elements)
  clock rate                900 MHz

Automatic parameters
  threads per grid          0
  optimizing kernels for    GCN

Loading binary kernel file mfakto_Kernels.elf
Compiling kernels.
  GPUSievePrimes (adjusted) 81206
  GPUsieve minimum exponent 1037054

Started a simple selftest ...
Selftest statistics
  number of tests           30
  successful tests          30

selftest PASSED!

got assignment: exp=212335483 bit_min=73 bit_max=74 (9.01 GHz-days)
Starting trial factoring M212335483 from 2^73 to 2^74 (9.01 GHz-days)
Using GPU kernel "cl_barrett15_74_gs_2"
Date   Time  | class    Pct |   time    ETA | GHz-d/day  Sieve   Wait
Dec 05 23:13 |  4617 100.0% | 11.685  0m00s |     69.39  81206  0.00%
no factor for M212335483 from 2^73 to 2^74 [mfakto 0.15pre7-MGW cl_barrett15_74_gs_2]
tf(): total time spent:  3h  6m 46.416s (69.46 GHz-days / day)
[/CODE]
Next up is a little tuning to see if it can get above 70 GHz-days / day. |
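The GHz-days/day figure on the last line of that log can be cross-checked from the assignment credit and the total run time. The sketch below is only that arithmetic, with a made-up function name; it is not how mfakto itself computes the value.

```python
SECONDS_PER_DAY = 86400.0


def ghz_days_per_day(credit_ghz_days, hours, minutes, seconds):
    """Throughput = assignment credit divided by elapsed time in days (hypothetical helper)."""
    elapsed_days = (hours * 3600 + minutes * 60 + seconds) / SECONDS_PER_DAY
    return credit_ghz_days / elapsed_days


# Values taken from the log above: 9.01 GHz-days in 3h 6m 46.416s.
rate = ghz_days_per_day(9.01, 3, 6, 46.416)
print(f"{rate:.2f} GHz-days/day")
```

The result lands within about 0.01 of the logged 69.46 GHz-days/day. A small difference is expected because the 9.01 GHz-days credit in the log is already rounded.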
Here are the results of mfakto tuning on the ASRock Deskmini A300W, AMD A8-9600, Radeon R7 IGPU, 16GB DDR-4, SSD, Windows 10, mfakto 0.15pre7, Prime95 v30.3 b6 system.
[CODE]
mfakto tuning.
AMD A8-9600, Radeon R7 IGPU, 16GB ram, SSD, Windows 10, mfakto 0.15pre7
exp=103122301 bit 73 to 74

Initial settings and speed.
GPUSieveProcessSize=24, GPUSieveSize=96, GPUSievePrimes=81157, 67.97GHz-day

Final settings and speed.
GPUSieveProcessSize=32, GPUSieveSize=128, GPUSievePrimes=179766, 70.39GHz-day
Speed with prime95 also running a P-1 stage 2, 70.32GHz-day

Step 1, vary GPUSieveProcessSize
Possible values: 8, 16, 24, 32
# Also must divide GPUSieveSize * 1024
# Default: GPUSieveProcessSize=24
GPUSieveProcessSize=8     64.93GHz-day
GPUSieveProcessSize=16    67.10GHz-day
GPUSieveProcessSize=24    67.97GHz-day
GPUSieveProcessSize=32    68.68GHz-day *

Step 2: vary GPUSieveSize with GPUSieveProcessSize=32
# Minimum: GPUSieveSize=4
# Maximum: GPUSieveSize=128
# Default: GPUSieveSize=96
GPUSieveSize=32     68.36GHz-day
GPUSieveSize=64     68.54GHz-day
GPUSieveSize=96     68.68GHz-day
GPUSieveSize=128    68.70GHz-day *

Step 3: vary GPUSievePrimes with GPUSieveSize=128, GPUSieveProcessSize=32
# Minimum: GPUSievePrimes=54
# Maximum: GPUSievePrimes=1075766
# Default: GPUSievePrimes=81157
GPUSievePrimes=21814     62.88GHz-day
GPUSievePrimes=67894     68.35GHz-day
GPUSievePrimes=81157     68.70GHz-day
GPUSievePrimes=99894     69.10GHz-day
GPUSievePrimes=120374    69.52GHz-day
GPUSievePrimes=139830    69.84GHz-day
GPUSievePrimes=160310    70.14GHz-day
GPUSievePrimes=179766    70.39GHz-day*
GPUSievePrimes=200246    70.22GHz-day
[/CODE] |
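The same one-parameter-at-a-time procedure can be written down as a short sketch. It assumes only what the notes above state: GPUSieveProcessSize must be one of 8, 16, 24, 32 and must divide GPUSieveSize * 1024. The helper names are invented for illustration and are not part of mfakto, and the speeds are the measured values from the post, not something the script produces.

```python
# Allowed GPUSieveProcessSize values, as listed in the tuning notes above.
ALLOWED_PROCESS_SIZES = (8, 16, 24, 32)


def valid_process_sizes(gpu_sieve_size):
    """Process sizes that divide GPUSieveSize * 1024 (constraint quoted in the notes)."""
    return [p for p in ALLOWED_PROCESS_SIZES if (gpu_sieve_size * 1024) % p == 0]


def best_setting(results):
    """Return the setting with the highest measured GHz-days/day."""
    return max(results, key=results.get)


def percent_gain(before, after):
    """Relative speed-up in percent."""
    return (after - before) / before * 100.0


# Step 3 measurements from the post: GPUSievePrimes -> GHz-days/day.
sieve_primes_sweep = {
    21814: 62.88, 67894: 68.35, 81157: 68.70, 99894: 69.10, 120374: 69.52,
    139830: 69.84, 160310: 70.14, 179766: 70.39, 200246: 70.22,
}

print("Valid process sizes for GPUSieveSize=96: ", valid_process_sizes(96))
print("Valid process sizes for GPUSieveSize=128:", valid_process_sizes(128))
print("Best GPUSievePrimes:", best_setting(sieve_primes_sweep))
print(f"Overall gain: {percent_gain(67.97, 70.39):.2f}%")
```

One thing the divisibility check makes visible: the default GPUSieveProcessSize=24 is fine with the default GPUSieveSize=96, but it would not satisfy the stated constraint with GPUSieveSize=128, so the two settings need to be changed together.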
I have created a PKGBUILD for this software for use on Arch. You can find it here: [url]https://aur.archlinux.org/packages/mfakto/[/url]
|