mersenneforum.org  

Old 2014-10-28, 13:49   #1233
tului
 
Jan 2013

2²×17 Posts

A10-5745M results.
Attached Files
File Type: 7z a10-5745m.7z (14.2 KB, 76 views)
Old 2014-10-28, 22:31   #1234
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by tului
A10-5745M results.
Thank you for that one too. Your A10 will most likely benefit from VectorSize=4. Would you mind changing that in the .ini files and rerunning just the m-gs-<nn>-<nn> tests (skip the long-running *fulltest.ini and *GCN*ini tests)? And please add an m-gs-128-32.ini file: copy m-gs-128-16.ini and set GPUSieveProcessSize=32. Then please run mfakto [-d ..] -i m-gs-128-32.ini --perftest > m-gs-128-32.log

Now that I have the "automatic" evaluation of kernel speeds, I will think about auto-adjusting VectorSize as well, but we're not there yet.
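
For anyone who would rather script the ini change requested above than edit it by hand, here is a minimal Python sketch. It assumes the test files use plain Key=Value lines; the helper name set_ini_value and the two-line sample text are made up for illustration and are not part of mfakto.

```python
import re

def set_ini_value(ini_text: str, key: str, value: str) -> str:
    """Return ini_text with `key=...` replaced by `key=value`.

    Assumes plain `Key=Value` lines; appends the key if it is missing.
    """
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*=).*$", re.MULTILINE)
    # A function replacement avoids any backreference clash with digits in `value`.
    new_text, count = pattern.subn(lambda m: m.group(1) + value, ini_text)
    if count == 0:
        # Key was not present: add it on its own line at the end.
        new_text = ini_text.rstrip("\n") + f"\n{key}={value}\n"
    return new_text

# Hypothetical stand-in for the contents of m-gs-128-16.ini;
# the real file contains more settings than shown here.
sample = "GPUSieveSize=128\nGPUSieveProcessSize=16\n"
updated = set_ini_value(sample, "GPUSieveProcessSize", "32")
print(updated)
```

With the real files, you would read m-gs-128-16.ini, pass its text through the helper, and write the result to m-gs-128-32.ini before running the --perftest command above.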

Last fiddled with by Bdot on 2014-10-28 at 22:33
Old 2014-10-29, 19:42   #1235
NickOfTime
 
Apr 2014

2A₁₆ Posts

Quote:
Originally Posted by Bdot
Very nice! All seems to work as expected. I'll need a bit more time to go through the results; the first things I noticed:
  • 290x seems to have major improvements in int32 performance: the 32-bit kernels are now 20% faster than the 15-bit ones (on previous GCN, they are 15% slower). As the current code does not yet take this into account, expect a 20% performance boost with the next mfakto version, and no more performance drop at the 73-bit boundary.
  • 290x behaves pretty much like GCN regarding ini file settings: according to these tests, GPUSieveSize=126 and GPUSieveProcessSize=24 should be fastest on this card as well.
  • 290x: Measuring the CPU-sieve-based TF kernels only worked for the smallest exponent; the other results are way too low. Either something overflowed, some throttling kicked in, or my test did not fully utilize the GPU.
  • A 1/8 DP rate on 290x is not sufficient to give DP calculations an advantage over SP. Therefore, only Tahiti and Malta chips will use DP in mfakto. Does anyone still have an HD5870/5850 sitting around? It would also be a good candidate.
  • HD4600 worked well, delivering 18-19 GHz-days/day for the current LL test range.
  • The performance dependency on exponent size is stronger than I expected, e.g. 290x: 975 GHz-days/day (2M), 770 GHz-days/day (39M), 739 GHz-days/day (78M), 684 GHz-days/day (332M), 616 GHz-days/day (4200M).
  • I forgot to include an m-gs-128-32.ini file. Could you please copy m-gs-128-16.ini and set GPUSieveProcessSize=32? Then please run mfakto [-d ..] -i m-gs-128-32.ini --perftest > m-gs-128-32.log
    I think I know the outcome for HD7770, but HD4600 and R290x would be interesting.
m-gs-128-32-290x.7z
Edit: hmm, and an extra 0.5 GHz-days/day with numstreams=5 vs. 3...
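
To put the exponent-size dependency from the quoted list into numbers, this small Python sketch expresses each 290x figure as a percentage of the 2M figure. The values are copied from the quote; the variable names are only illustrative.

```python
# 290x throughput figures as quoted above, keyed by exponent size.
throughput = {"2M": 975, "39M": 770, "78M": 739, "332M": 684, "4200M": 616}

baseline = throughput["2M"]

# Each figure as a percentage of the smallest-exponent result.
relative = {size: round(100 * value / baseline, 1) for size, value in throughput.items()}

for size, pct in relative.items():
    print(f"{size:>6}: {pct:5.1f}% of the 2M figure")
```

On these figures, the largest exponent runs at a bit under two thirds of the 2M rate.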

Last fiddled with by NickOfTime on 2014-10-29 at 20:08
Old 2014-10-30, 23:51   #1236
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³·271 Posts

Quote:
Originally Posted by Bdot
  • I forgot to include an m-gs-128-32.ini file. Could you please copy m-gs-128-16.ini and set GPUSieveProcessSize=32? Then please run mfakto [-d ..] -i m-gs-128-32.ini --perftest > m-gs-128-32.log
    I think I know the outcome for HD7770, but HD4600 and R290x would be interesting.
Late result, sorry...
Intel HD4600
Attached Files
File Type: txt m-gs-128-32.log.txt (14.7 KB, 140 views)

Last fiddled with by kracker on 2014-10-31 at 00:01
Old 2014-10-31, 21:45   #1237
AK76
 
Sep 2014

19 Posts

testresults R9 290.7z
Old 2014-10-31, 23:05   #1238
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Quote:
Originally Posted by AK76
Something's quite wrong there... the results are quite different from NickOfTime's 290X results; the 290 should not be much slower than its X version.
Old 2014-11-01, 02:35   #1239
NickOfTime
 
Apr 2014

2×3×7 Posts

Quote:
Originally Posted by kracker
Something's quite wrong there... the results are quite different from NickOfTime's 290X results; the 290 should not be much slower than its X version.
Well, mine are XFX 290x Double Dissipation cards (PCIe x16) running at around 66-70°C, so there is no throttling...
Hmm, it is probably the Catalyst version: since OpenCL is compiled at runtime, performance varies a lot between driver versions. I am using 14.3.
Old 2014-11-01, 13:05   #1240
Cruelty
 
May 2005

2³·7·29 Posts

Attached are results for the Sapphire R9 290 Tri-X OC. As a bonus, I've also included the GPU-Z log for the entire run.
Attached Files
File Type: 7z testresults_Sapphire_R9_290_Tri-X_OC.7z (33.8 KB, 81 views)
Old 2014-11-01, 16:37   #1241
AK76
 
Sep 2014

19 Posts

Quote:
Originally Posted by kracker
Something's quite wrong there... the results are quite different from NickOfTime's 290X results; the 290 should not be much slower than its X version.
Hmm, I don't really know where the problem is. I ran mfakto on Catalyst 14.4 and 14.9, and the results are practically the same.

My platform is a 6-year-old Asus P5K Pro with a Xeon E5440 2.83 GHz, overclocked to 3.7 GHz. The FSB default is 333 MHz and it is now set to 443 MHz. Could that be causing the problem?
Old 2014-11-02, 02:40   #1242
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Quote:
Originally Posted by AK76
Hmm, I don't really know where the problem is. I ran mfakto on Catalyst 14.4 and 14.9, and the results are practically the same.

My platform is a 6-year-old Asus P5K Pro with a Xeon E5440 2.83 GHz, overclocked to 3.7 GHz. The FSB default is 333 MHz and it is now set to 443 MHz. Could that be causing the problem?
Hmm... how are your thermals?

On another note, I've ordered a R9 285... Will have results by Tuesday/Wednesday.
Old 2014-11-02, 02:45   #1243
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³·271 Posts

Quote:
Originally Posted by Cruelty
Attached are results for the Sapphire R9 290 Tri-X OC. As a bonus, I've also included the GPU-Z log for the entire run.
Nice! Well, here's a comparison of your 290 with AK76's 290... something's wrong.

Code:
 5. GPU tf kernels, exponent=66362159 ... calibrating
5. GPU tf kernels, exponent=66362159, 12287M FCs each      
k=2223766598517, 0.900843 GHz-days (assignment), 0.025120 GHz-days (per test)
  cl_barrett32_79_gs [64-79]:  3609.79 ms ==>  3569.43M FCs/s ==>  601.23 GHz-days/day
  cl_barrett32_77_gs [64-77]:  3230.19 ms ==>  3988.90M FCs/s ==>  671.89 GHz-days/day
  cl_barrett32_76_gs [64-76]:  3097.96 ms ==>  4159.15M FCs/s ==>  700.57 GHz-days/day
  cl_barrett32_92_gs [65-92]:  5107.85 ms ==>  2522.57M FCs/s ==>  424.90 GHz-days/day
  cl_barrett32_88_gs [65-88]:  3778.44 ms ==>  3410.12M FCs/s ==>  574.40 GHz-days/day
  cl_barrett32_87_gs [65-87]:  3610.30 ms ==>  3568.93M FCs/s ==>  601.15 GHz-days/day
  cl_barrett15_73_gs [60-73]:  3728.22 ms ==>  3456.05M FCs/s ==>  582.14 GHz-days/day
  cl_barrett15_69_gs [60-69]:  3141.44 ms ==>  4101.59M FCs/s ==>  690.87 GHz-days/day
  cl_barrett15_70_gs [60-69]:  3145.50 ms ==>  4096.30M FCs/s ==>  689.98 GHz-days/day
Code:
 5. GPU tf kernels, exponent=66362159 ... calibrating
5. GPU tf kernels, exponent=66362159, 6143M FCs each      
k=2223766598517, 0.900843 GHz-days (assignment), 0.012560 GHz-days (per test)
  cl_barrett32_79_gs [64-79]:  2755.63 ms ==>  2337.92M FCs/s ==>  393.80 GHz-days/day
  cl_barrett32_77_gs [64-77]:  2554.79 ms ==>  2521.71M FCs/s ==>  424.76 GHz-days/day
  cl_barrett32_76_gs [64-76]:  2486.19 ms ==>  2591.30M FCs/s ==>  436.48 GHz-days/day
  cl_barrett32_92_gs [65-92]:  3022.35 ms ==>  2131.60M FCs/s ==>  359.05 GHz-days/day
  cl_barrett32_88_gs [65-88]:  2844.33 ms ==>  2265.02M FCs/s ==>  381.52 GHz-days/day
  cl_barrett32_87_gs [65-87]:  2756.18 ms ==>  2337.46M FCs/s ==>  393.72 GHz-days/day
  cl_barrett15_73_gs [60-73]:  2815.09 ms ==>  2288.54M FCs/s ==>  385.48 GHz-days/day
  cl_barrett15_69_gs [60-69]:  2507.41 ms ==>  2569.36M FCs/s ==>  432.78 GHz-days/day
  cl_barrett15_70_gs [60-69]:  2508.61 ms ==>  2568.14M FCs/s ==>  432.58 GHz-days/day
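
The gap between the two listings can be checked with a short Python sketch. It parses a few kernel lines copied from each listing and compares the reported rates. It also recomputes GHz-days/day as the per-test credit divided by the run time. That relationship is only inferred from the numbers in these logs, not taken from the mfakto source, and the function names are illustrative.

```python
import re

# Matches one kernel line of the --perftest output quoted above.
LINE = re.compile(
    r"\s*(?P<kernel>\w+)\s+\[[\d-]+\]:\s+(?P<ms>[\d.]+) ms ==>\s+"
    r"[\d.]+M FCs/s ==>\s+(?P<rate>[\d.]+) GHz-days/day"
)

def parse_kernels(log):
    """Map kernel name -> (milliseconds, reported GHz-days/day)."""
    result = {}
    for line in log.splitlines():
        match = LINE.match(line)
        if match:
            result[match["kernel"]] = (float(match["ms"]), float(match["rate"]))
    return result

def rate_from_time(ghz_days_per_test, ms):
    """Assumed relation: per-test credit scaled to one day (86,400,000 ms)."""
    return ghz_days_per_test * 86_400_000 / ms

# Two lines copied from each listing above.
first_log = """\
  cl_barrett32_79_gs [64-79]:  3609.79 ms ==>  3569.43M FCs/s ==>  601.23 GHz-days/day
  cl_barrett15_69_gs [60-69]:  3141.44 ms ==>  4101.59M FCs/s ==>  690.87 GHz-days/day
"""
second_log = """\
  cl_barrett32_79_gs [64-79]:  2755.63 ms ==>  2337.92M FCs/s ==>  393.80 GHz-days/day
  cl_barrett15_69_gs [60-69]:  2507.41 ms ==>  2569.36M FCs/s ==>  432.78 GHz-days/day
"""

first = parse_kernels(first_log)
second = parse_kernels(second_log)

for kernel in first:
    ratio = second[kernel][1] / first[kernel][1]
    print(f"{kernel}: second listing reaches {ratio:.0%} of the first")

# Cross-check against the "GHz-days (per test)" values in the two listings.
print(round(rate_from_time(0.025120, first["cl_barrett32_79_gs"][0]), 2))
print(round(rate_from_time(0.012560, second["cl_barrett32_79_gs"][0]), 2))
```

The recomputed rates land within rounding distance of the reported ones. So the gap between the two cards shows up in the raw timings as well, which makes a reporting error in the GHz-days/day column an unlikely explanation.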

Last fiddled with by kracker on 2014-11-02 at 02:46

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.