mersenneforum.org  


Old 2011-12-21, 18:52   #287
Brain
 
Dec 2009
Peine, Germany

331 Posts

Quote:
Originally Posted by Bdot View Post
Here's the fix for the performance issues. It just contains 2 kernel files that need to replace original files from the 0.10 package.
Wouldn't it be easier to integrate the patch files into the existing 0.10 bundles as well? I'm holding off on my GPU guide update and would like to minimize the number of download URLs...
Old 2011-12-22, 10:06   #288
Bdot
 
Nov 2010
Germany

3·199 Posts

Quote:
Originally Posted by Brain View Post
Wouldn't it be easier to integrate the patch files into the existing 0.10 bundles as well? I'm holding off on my GPU guide update and would like to minimize the number of download URLs...
As you wish ... (of course you're right ...)
Attached Files
File Type: zip mfakto-0.10p1 - Win.zip (199.6 KB, 236 views)

Last fiddled with by Bdot on 2011-12-22 at 10:23
Old 2011-12-23, 18:30   #289
henryzz
Just call me Henry
 
"David"
Sep 2007
Cambridge (GMT/BST)

16FE₁₆ Posts

In yafu, a 64 KB sieve is recommended on AMD CPUs and a 32 KB sieve on Intel CPUs, because Intel's L1 cache is smaller. Bulldozer goes down to a 16 KB L1 data cache, so it might want a smaller sieve still.
Old 2011-12-24, 10:59   #290
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by henryzz View Post
In yafu, a 64 KB sieve is recommended on AMD CPUs and a 32 KB sieve on Intel CPUs, because Intel's L1 cache is smaller. Bulldozer goes down to a 16 KB L1 data cache, so it might want a smaller sieve still.
With MORE_CLASSES, mfakt[co] uses a sieve size that is a multiple of ~12k (13*17*19*23 bits), which results in ~24k optimum for Intel, and ~60k for AMD (12k for BullD). flashjh, did you give the different sieve-size versions a try on your Phenom to confirm?
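
The arithmetic behind those sizes can be checked with a minimal Python sketch. It assumes the sieve has to be a whole multiple of 13*17*19*23 bits and should fit entirely into the L1 data cache; the function names and the power-of-two cache sizes are illustrative assumptions, not mfakto code.

```python
from math import prod

# Primes whose product gives the base sieve length in bits (taken from the post).
PATTERN_PRIMES = (13, 17, 19, 23)


def base_sieve_bits() -> int:
    """Base sieve length in bits: 13 * 17 * 19 * 23."""
    return prod(PATTERN_PRIMES)


def largest_sieve_fitting(l1_bytes: int) -> tuple[int, float]:
    """Return (multiple, size_in_bytes) for the largest whole multiple of the
    base sieve that still fits into an L1 data cache of l1_bytes bytes."""
    base = base_sieve_bits()
    multiple = (l1_bytes * 8) // base
    return multiple, multiple * base / 8


if __name__ == "__main__":
    print(f"base sieve: {base_sieve_bits()} bits")
    # Cache sizes below are assumed from the discussion (32 KB, 64 KB, 16 KB).
    for label, kib in (("Intel", 32), ("AMD", 64), ("Bulldozer", 16)):
        multiple, size = largest_sieve_fitting(kib * 1024)
        print(f"{label} ({kib} KiB L1): {multiple} x base = {size:.0f} bytes")
```

Under these assumptions the base block is 96577 bits (about 12 KB), and 2, 5, and 1 blocks fit into 32 KB, 64 KB, and 16 KB respectively, which is in line with the ~24k, ~60k, and 12k figures above.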

I guess the next version of mfakto will have sieve size configurable ...

BTW, I had a chance to quickly test the A350 with mfakto. Windows7-64 and Catalyst 11.12 installed, and there's nothing more that is needed. The GPU is detected right away. However, it may not really be worth the effort: ~7M/s was the peak.

I'll test a little more though.
CPU load (mfakto, SievePrimes 200k): ~17%
GPU load : 85-95%
no measurable increase in power consumption
M52 50xx xxx (2^69 - 2^70): 6.8M/s avg.

Last fiddled with by Bdot on 2011-12-24 at 11:00 Reason: typos, typos, typos
Old 2011-12-24, 19:31   #291
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

Quote:
Originally Posted by Bdot View Post
With MORE_CLASSES, mfakt[co] uses a sieve size that is a multiple of ~12k (13*17*19*23 bits), which results in ~24k optimum for Intel, and ~60k for AMD (12k for BullD). flashjh, did you give the different sieve-size versions a try on your Phenom to confirm?

I guess the next version of mfakto will have sieve size configurable ...

BTW, I had a chance to quickly test the A350 with mfakto. Windows7-64 and Catalyst 11.12 installed, and there's nothing more that is needed. The GPU is detected right away. However, it may not really be worth the effort: ~7M/s was the peak.

I'll test a little more though.
CPU load (mfakto, SievePrimes 200k): ~17%
GPU load : 85-95%
no measurable increase in power consumption
M52 50xx xxx (2^69 - 2^70): 6.8M/s avg.
I just finished testing two 5870s with a Phenom x6 1055T. I ran 4 instances, 2 per GPU. All instances are running 70-72 with no stages. SievePrimes is set to autoadjust.

32k 64 bit exe:
GPU 1 runs ~20.6 sec per class with SievePrimes at ~28000.
GPU 2 runs ~22.0 sec per class with SievePrimes at ~36000.

64k 64 bit exe:
GPU 1 runs ~20.0 sec per class with SievePrimes at ~41000.
GPU 2 runs ~20.5 sec per class with SievePrimes at ~54000.

Average CPU wait time for all instances is between 200-400us.

Usage:
CPU: 75%
GPU 1: 73%
GPU 2: 85%
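
For a quick comparison of the two builds, here is a small Python sketch that turns the per-class times above into a relative improvement. The helper name is made up for illustration, and the inputs are just the approximate single-run figures from this post, so the result is only indicative.

```python
# Seconds per class reported above: (32k build, 64k build) for each GPU.
TIMINGS = {
    "GPU 1": (20.6, 20.0),
    "GPU 2": (22.0, 20.5),
}


def percent_faster(old_seconds: float, new_seconds: float) -> float:
    """Reduction in time per class, as a percentage of the old time."""
    return (old_seconds - new_seconds) / old_seconds * 100


if __name__ == "__main__":
    for gpu, (t32, t64) in TIMINGS.items():
        print(f"{gpu}: {percent_faster(t32, t64):.1f}% less time per class with 64k")
```

That works out to roughly 3% and 7% less time per class with the 64k build on this Phenom.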
Old 2011-12-24, 23:49   #292
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1123₁₀ Posts
Warning Question

I have several TF DC assignments for my P4 3.4 system with a HD 4670. Anyway, I noticed it kept giving a warning about a particular exponent and would skip it. I finally got around to messing with it.

Factor=N/A,27960979,68,69

Always gives:

WARNING: exponent is not prime!
Ignoring TF M27960979 from 2^68 to 2^69!
WARNING: ignoring line 1 in "worktodo.txt"! Reason: invalid data

So, I know it's not prime, and both mfakt[co] say the same thing... Is there any way to fulfill my GPU TF on this exponent, or do I need to use Prime95 for this one?
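
The kind of check that produces this warning can be sketched in a few lines of Python. This is not the mfaktc/mfakto source: the function names are hypothetical, and the parser assumes exactly the four-field `Factor=<id>,<exponent>,<bit_min>,<bit_max>` layout shown in the line above.

```python
def is_prime(n: int) -> bool:
    """Plain trial division; fast enough for exponents of this size."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def parse_factor_line(line: str) -> tuple[int, int, int]:
    """Parse 'Factor=<id>,<exponent>,<bit_min>,<bit_max>' into three integers.

    Assumes the four-field form seen in the post; other layouts are rejected.
    """
    keyword, _, rest = line.strip().partition("=")
    fields = rest.split(",")
    if keyword != "Factor" or len(fields) != 4:
        raise ValueError(f"unsupported worktodo line: {line!r}")
    exponent, bit_min, bit_max = (int(field) for field in fields[1:])
    return exponent, bit_min, bit_max


def check_line(line: str) -> str:
    """Describe what would happen to a worktodo line under this sketch."""
    exponent, bit_min, bit_max = parse_factor_line(line)
    if not is_prime(exponent):
        return f"skip M{exponent}: exponent is not prime"
    return f"TF M{exponent} from 2^{bit_min} to 2^{bit_max}"


if __name__ == "__main__":
    print(check_line("Factor=N/A,27960979,68,69"))
```

Running a pasted worktodo.txt through a check like this before starting a long job is a cheap way to catch a mistyped exponent.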
Old 2011-12-25, 00:32   #293
TheJudger
 
"Oliver"
Mar 2005
Germany

11·101 Posts

Quote:
Originally Posted by flashjh View Post
[...]
Factor=N/A,27960979,68,69

Always gives:

WARNING: exponent is not prime!
Ignoring TF M27960979 from 2^68 to 2^69!
WARNING: ignoring line 1 in "worktodo.txt"! Reason: invalid data

So, I know it's not prime, and both mfakt[co] say the same thing...
Well, no surprise that both, mfaktc and mfakto, tell you that this exponent is not prime... it is the same code!

Quote:
Originally Posted by flashjh View Post
is there any way to fulfill my GPU TF on this exponent or do I need to use Prime95 for this one?
Take the source code and disable the check for prime-only exponents.
This will work for your assignment (M27960979 from 2^68 to 2^69) because the start is "big enough". The problem with non-prime exponents is that the prime factors can be very small (for prime exponents every factor of Mp has the form 2kp+1, so none is smaller than 2p+1, but for composite exponents factors can be much smaller than the exponent itself). Those very small factors can get sieved out before testing, simply because no code has been written to take care of this case.
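
The difference between prime and composite exponents can be illustrated with a short Python sketch. The helper names are made up for this example, and it only looks at tiny exponents; it is not how mfaktc searches.

```python
def divides_mersenne(p: int, q: int) -> bool:
    """True if q divides 2^p - 1, i.e. 2^p leaves remainder 1 modulo q."""
    return q > 1 and pow(2, p, q) == 1


def has_tf_form(p: int, q: int) -> bool:
    """True if q = 2*k*p + 1 for some integer k >= 1."""
    return q > 1 and (q - 1) % (2 * p) == 0


def small_factors(p: int, k_max: int) -> list[int]:
    """Divisors of 2^p - 1 among the candidates q = 2*k*p + 1, k = 1..k_max."""
    found = []
    for k in range(1, k_max + 1):
        q = 2 * k * p + 1
        if divides_mersenne(p, q):
            found.append(q)
    return found


if __name__ == "__main__":
    # Prime exponent 11: M11 = 2047 = 23 * 89, with 23 = 2*1*11+1 and 89 = 2*4*11+1.
    print(small_factors(11, 4))
    # Composite exponent 15: 7 divides M15 but is smaller than the exponent
    # and is not of the form 2*k*15 + 1.
    print(divides_mersenne(15, 7), has_tf_form(15, 7))
```

For the composite exponent 15, the factor 7 comes from M3 (since 3 divides 15), which is exactly the kind of small factor a search limited to q = 2kp+1 candidates would never look at.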

Oliver
Old 2011-12-25, 01:40   #294
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

463₁₆ Posts

Quote:
Originally Posted by TheJudger View Post
Well, no surprise that both, mfaktc and mfakto, tell you that this exponent is not prime... it is the same code!
Fair enough... I was just making sure everyone knew I tested both and since I hadn't seen this before in my readings I didn't want that to be the reason
Old 2011-12-25, 02:58   #295
axn
 
Jun 2003

5,087 Posts

Quote:
Originally Posted by flashjh View Post
So, I know it's not prime, and both mfakt[co] say the same thing... Is there any way to fulfill my GPU TF on this exponent, or do I need to use Prime95 for this one?
If the exponent is not prime, you have an invalid exponent (possibly due to a typo). Find the correct exponent and do the TF on it. If you can't find the correct exponent, throw out that line. GIMPS does not deal with composite exponents. Even P95 will balk at that one.
Old 2011-12-26, 05:03   #296
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

463₁₆ Posts
Reporting Question

Lesson learned for me... I copy and paste the lists in, but somehow I messed up that one. I fixed it to match my assignments (one digit was off) and everything worked. Thanks for your help.

And a question... Has anyone noticed PrimeNet result changes? I now use the spider to post my results (which is awesome, by the way). I noticed that PrimeNet now shows all my 'factor' results as F-PM1 instead of just F. The results column has the correct factor, but since I use mfakt[co] for all my TFing, I was wondering why PrimeNet is showing the change. Is the spider making a mistake when reporting, or is PrimeNet making a mistake? Also, PrimeNet and GPU to 72 don't show the same GHz-days, since PrimeNet thinks the factor was found with P-1.


Jerry
Old 2011-12-26, 20:04   #297
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

722110 Posts
Default

http://www.mersenneforum.org/showthr...=12827&page=58

Last post there ^, and there are a few posts on the next page. I'd read through changelog.txt as well. This is a known issue and will hopefully be fixed soon.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.