mersenneforum.org  

Old 2020-10-09, 23:35   #1651
kracker
 
 
"Mr. Meeseeks"
Jan 2012
California, USA

3^2×241 Posts

My guess would be drivers. (mfakto on my i5-4670K also uses a full CPU core, while on an i3-8100 it uses almost no resources.)

Last fiddled with by kracker on 2020-10-09 at 23:37
Old 2020-10-11, 19:20   #1652
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

2^5×3×5×7 Posts

Quote:
Originally Posted by kracker View Post
mfakto on my ... i3-8100
Thanks for posting that -- my "other" computer has an i3-8100 and I didn't know it could run mfakto.
Free extra 21 GHz-d/d, thanks!
Old 2020-10-13, 09:51   #1653
Viliam Furik
 
"Viliam Furík"
Jul 2018
Martin, Slovakia

2^3·3·19 Posts
Radeon VII - poor performance against promised numbers

It was probably discussed earlier, but can anyone tell me why I can't squeeze more than 1300 GHz-d/day (2.6 TFLOPS) out of a Radeon VII, despite it promising 13.8 TFLOPS of SP throughput (about 10% more than an RTX 2080 Ti)?

Also, in gpuOwl I get only about 0.75 TFLOPS, despite AMD promising 3.46 TFLOPS.

It always seems to be about a fifth of what it should be according to the official AMD page.
Old 2020-10-13, 10:42   #1654
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

2^5·3·5·7 Posts

Quote:
Originally Posted by Viliam Furik View Post
It was probably discussed earlier, but can anyone tell me why I can't squeeze more than 1300 GHz-d/day (2.6 TFLOPS) out of a Radeon VII, despite it promising 13.8 TFLOPS of SP throughput (about 10% more than an RTX 2080 Ti)?
I'm not sure how you're calculating FLOPS performance in mfakto/gpuowl?

The mfakto performance you quote seems to be in line with expected mfakto performance (noting that my chart shows stock-clock performance). Also note AMD's somewhat deceptive practice of quoting "peak performance" numbers, which means performance at the 1750 MHz boost clock instead of the 1400 MHz stock clock. The numbers on my pages are all stock-clock values.
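To make the stock-versus-boost point concrete, here is a rough sketch of how a theoretical single-precision figure scales with clock speed. It assumes the usual rule of thumb of one fused multiply-add (2 FLOPs) per shader per clock. The shader count of 3840 is not stated anywhere in this thread; it is taken from the public Radeon VII spec and should be treated as an assumption, and `peak_sp_tflops` is just an illustrative helper name.

```python
def peak_sp_tflops(shaders: int, clock_mhz: float) -> float:
    """Theoretical FP32 rate, assuming one FMA (2 FLOPs) per shader per clock."""
    return shaders * 2 * clock_mhz * 1e6 / 1e12


# Assumption: shader count from the public spec sheet, not from this thread.
RADEON_VII_SHADERS = 3840

stock = peak_sp_tflops(RADEON_VII_SHADERS, 1400)  # 1400 MHz stock clock
boost = peak_sp_tflops(RADEON_VII_SHADERS, 1750)  # 1750 MHz boost clock

print(f"stock: {stock:.2f} TFLOPS, boost: {boost:.2f} TFLOPS")
print(f"boost/stock ratio: {boost / stock:.2f}")
```

Under these assumptions the boost figure is 25% higher than the stock figure, so a "peak" number on a spec sheet will always look better than what a card sustains at stock clocks. By the same formula, the advertised 13.8 TFLOPS would need a clock of roughly 1800 MHz.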
Old 2020-10-13, 12:58   #1655
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,119 Posts

Quote:
Originally Posted by Viliam Furik View Post
It was probably discussed earlier, but can anyone tell me why I can't squeeze more than 1300 GHz-d/day (2.6 TFLOPS) out of a Radeon VII, despite it promising 13.8 TFLOPS of SP throughput (about 10% more than an RTX 2080 Ti)?

Also, in gpuOwl I get only about 0.75 TFLOPS, despite AMD promising 3.46 TFLOPS.

It always seems to be about a fifth of what it should be according to the official AMD page.
If you're using mfakto as the measure of SP performance, note that mfakto and mfaktc use int32, not fp32.

Also, the AMD spec sheet says "peak". Gpuowl is memory-bandwidth constrained, per Mihai, which would make it difficult to sustain the peak rate.

https://www.amd.com/en/products/graphics/amd-radeon-vii

https://www.techpowerup.com/gpu-specs/radeon-vii.c3358
Old 2020-10-13, 14:19   #1656
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

5×1,889 Posts

Quote:
Originally Posted by kriesel View Post
If you're using mfakto as the measure of SP performance, note that mfakto and mfaktc use int32, not fp32.
This!
Radeon VII has poor integer performance.
Old 2020-10-13, 14:21   #1657
axn
 
 
Jun 2003

1362 (base 16) Posts

Quote:
Originally Posted by James Heinrich View Post
I'm not sure how you're calculating FLOPS performance in mfakto/gpuowl?
He's using the standard PrimeNet conversion: 1 GHz-d/d = 2 GFLOPS.
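For anyone who wants to reproduce the numbers in the original question, here is a minimal sketch of that conversion. The function name is made up for illustration. Keep in mind the caveat raised above: mfakto does int32 work, so comparing the result against an fp32 spec figure is not like for like.

```python
GFLOPS_PER_GHZD_PER_DAY = 2.0  # PrimeNet conversion quoted in this thread


def ghzd_per_day_to_tflops(ghzd_per_day: float) -> float:
    """Convert a GHz-days/day rate to TFLOPS using the PrimeNet factor."""
    return ghzd_per_day * GFLOPS_PER_GHZD_PER_DAY / 1000.0


measured_tflops = ghzd_per_day_to_tflops(1300)  # mfakto rate from the question
advertised_tflops = 13.8                        # AMD's quoted SP peak

print(f"measured:   {measured_tflops:.1f} TFLOPS")
print(f"advertised: {advertised_tflops:.1f} TFLOPS")
print(f"ratio:      {advertised_tflops / measured_tflops:.1f}x")
```

The ratio comes out a little over 5, which is where the "a fifth of what it should be" impression comes from.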
Old 2020-10-27, 01:23   #1658
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,119 Posts
lsgpu

lsgpu is a small utility that lists the OpenCL platforms, the devices on each of them, and a brief description.
See https://www.mersenneforum.org/showpo...74&postcount=6

It could help when getting the occasional mfakto installation started.
Old 2020-12-06, 16:08   #1659
DrobinsonPE
 
Aug 2020

2^3×11 Posts

After switching to Windows 10, I got mfakto to run on my ASRock Deskmini A300W, AMD A8-9600, 16GB DDR-4, SSD.

Code:
Selftest statistics
  number of tests           34026
  successful tests          34026

selftest PASSED!

mfakto 0.15pre7-MGW (64bit build)
OpenCL device info
  name                      Bristol Ridge (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 2.0 AMD-APP (2841.19) (2841.19)
  maximum threads per block 1024
  maximum threads per grid  1073741824
  number of multiprocessors 6 (384 compute elements)
  clock rate                900 MHz

Automatic parameters
  threads per grid          0
  optimizing kernels for    GCN

Loading binary kernel file mfakto_Kernels.elf
Compiling kernels.
  GPUSievePrimes (adjusted) 81206
  GPUsieve minimum exponent 1037054
Started a simple selftest ...
Selftest statistics
  number of tests           30
  successful tests          30

selftest PASSED!

got assignment: exp=212335483 bit_min=73 bit_max=74 (9.01 GHz-days)
Starting trial factoring M212335483 from 2^73 to 2^74 (9.01 GHz-days)
Using GPU kernel "cl_barrett15_74_gs_2"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Dec 05 23:13 | 4617 100.0% | 11.685   0m00s |     69.39    81206    0.00%
no factor for M212335483 from 2^73 to 2^74 [mfakto 0.15pre7-MGW cl_barrett15_74_gs_2]
tf(): total time spent:  3h  6m 46.416s (69.46 GHz-days / day)
Next up is a little tuning to see if it can get above 70 GHz-days / day.
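As a sanity check on the log above, the overall rate is just the assignment's credit divided by the elapsed time in days. This is a small sketch with a made-up helper name, using the 9.01 GHz-days and 3h 6m 46.416s from the output.

```python
def ghz_days_per_day(ghz_days: float, hours: int, minutes: int, seconds: float) -> float:
    """Average throughput: credited GHz-days divided by elapsed days."""
    elapsed_seconds = hours * 3600 + minutes * 60 + seconds
    return ghz_days / (elapsed_seconds / 86400.0)


# Values taken from the mfakto output above.
rate = ghz_days_per_day(9.01, hours=3, minutes=6, seconds=46.416)
print(f"{rate:.2f} GHz-days / day")
```

The result lands within rounding of the 69.46 GHz-days / day that mfakto reports. The small difference is expected, since 9.01 is itself a rounded credit value.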
Old 2020-12-07, 09:01   #1660
DrobinsonPE
 
Aug 2020

2^3·11 Posts

Here are the results of mfakto tuning on the ASRock Deskmini A300W, AMD A8-9600, Radeon R7 IGPU, 16GB DDR-4, SSD, Windows 10, mfakto 0.15pre7, Prime95 v30.3 b6 system.

Code:
mfakto tuning.
AMD A8-9600, Radeon R7 IGPU, 16GB ram, SSD, Windows 10, mfakto 0.15pre7
exp=103122301 bit 73 to 74 
Initial settings and speed. 
GPUSieveProcessSize=24, GPUSieveSize=96, GPUSievePrimes=81157, 67.97GHz-day
Final settings and speed.
GPUSieveProcessSize=32, GPUSieveSize=128, GPUSievePrimes=179766, 70.39GHz-day
Speed with prime95 also running a P-1 stage 2, 70.32GHz-day

Step 1: vary GPUSieveProcessSize
# Possible values: 8, 16, 24, 32
# Also must divide GPUSieveSize * 1024
# Default: GPUSieveProcessSize=24

GPUSieveProcessSize=8     64.93GHz-day
GPUSieveProcessSize=16    67.10GHz-day
GPUSieveProcessSize=24    67.97GHz-day
GPUSieveProcessSize=32    68.68GHz-day  *

Step 2: vary GPUSieveSize with GPUSieveProcessSize=32
# Minimum: GPUSieveSize=4
# Maximum: GPUSieveSize=128
# Default: GPUSieveSize=96
GPUSieveSize=32           68.36GHz-day
GPUSieveSize=64           68.54GHz-day
GPUSieveSize=96           68.68GHz-day
GPUSieveSize=128          68.70GHz-day  *

Step 3: vary GPUSievePrimes with GPUSieveSize=128, GPUSieveProcessSize=32
# Minimum: GPUSievePrimes=54
# Maximum: GPUSievePrimes=1075766
# Default: GPUSievePrimes=81157
GPUSievePrimes=21814      62.88GHz-day
GPUSievePrimes=67894      68.35GHz-day
GPUSievePrimes=81157      68.70GHz-day
GPUSievePrimes=99894      69.10GHz-day
GPUSievePrimes=120374     69.52GHz-day
GPUSievePrimes=139830     69.84GHz-day
GPUSievePrimes=160310     70.14GHz-day
GPUSievePrimes=179766     70.39GHz-day  *
GPUSievePrimes=200246     70.22GHz-day
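The tuning procedure above (change one setting at a time, keep the fastest) can be sketched in a few lines. This is only an illustration using the figures posted here: the variable and function names are made up, and the validity check simply restates the two comments quoted in Step 1 rather than anything read from mfakto itself.

```python
# Step 3 measurements from the post: GPUSievePrimes -> GHz-days/day.
sieve_primes_runs = {
    21814: 62.88, 67894: 68.35, 81157: 68.70,
    99894: 69.10, 120374: 69.52, 139830: 69.84,
    160310: 70.14, 179766: 70.39, 200246: 70.22,
}

# Pick the setting with the highest measured throughput.
best_primes = max(sieve_primes_runs, key=sieve_primes_runs.get)

# Overall gain from the initial settings (67.97) to the final ones (70.39).
gain_pct = (70.39 / 67.97 - 1.0) * 100.0


def is_valid_process_size(process_size: int, sieve_size: int) -> bool:
    """Restates the Step 1 comments: allowed values are 8, 16, 24, 32,
    and the value must divide GPUSieveSize * 1024."""
    return process_size in (8, 16, 24, 32) and (sieve_size * 1024) % process_size == 0


print(f"best GPUSievePrimes: {best_primes}")
print(f"total gain: {gain_pct:.2f}%")
print(is_valid_process_size(32, 128), is_valid_process_size(24, 128))
```

The divisibility check is worth a glance when changing GPUSieveSize: 24 divides 96 * 1024 (the default pairing) but not 128 * 1024, so moving to GPUSieveSize=128 only works with a process size of 8, 16 or 32.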
Old 2020-12-10, 06:07   #1661
Dylan14
 
 
"Dylan"
Mar 2017

577 Posts

I have created a PKGBUILD for this software for use on Arch. You can find it here: https://aur.archlinux.org/packages/mfakto/

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.