mersenneforum.org  

Old 2012-04-18, 11:39   #1761
TheJudger
 
"Oliver"
Mar 2005
Germany

11·101 Posts

Quote:
Originally Posted by LaurV
I think that the author is the one who can decide whether he wants to make it public or not. Of course I would like to have it, to play with it, but maybe he has a good reason why he didn't make it public. It could still be under test, or under development,
[...]
Yep, it isn't very well tested. Without modifications to the sieve code I have to limit SievePrimes to very low values when small exponents are tested. Otherwise the code will just hang (endless loop in the offset calculation).
With such small SievePrimes it seems to work well, but again: this is not tested very well.
Two things need handling when the biggest prime in the sieve is greater than or equal to the exponent being tested:
  1. Remove the prime exponent p from the list of primes used for sieving. (Candidates are 2kp+1, so they are never divisible by p. This is what causes the endless loop.)
  2. The offset calculation needs to take into account that the primes used for sieving can themselves be factors of Mp. This can occur as soon as the primes used for sieving are greater than 2p.
So the simple solution is to limit SievePrimes so that the biggest prime in the sieve list is less than the exponent.
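
To illustrate the two points above, here is a minimal Python sketch. It is not mfaktc's sieve code; the function name `first_sieve_offset` is made up for this example, and it simply computes the first k at which a sieve prime q divides the candidate 2kp+1.

```python
def first_sieve_offset(p, q):
    """Smallest k >= 1 with (2*k*p + 1) % q == 0, or None if no such k exists.

    p is the Mersenne exponent, q is a prime from the sieve list.
    Hypothetical helper for illustration only.
    """
    if (2 * p) % q == 0:
        # q == p (or q == 2): 2kp+1 is always 1 mod q, so no offset exists.
        # A loop that just increments k until it finds one never terminates.
        return None
    # Solve 2*k*p = -1 (mod q). q is prime, so Fermat gives the inverse.
    inv = pow(2 * p, q - 2, q)
    return (-inv) % q


p = 11                              # tiny exponent: M11 = 2047 = 23 * 89
print(first_sieve_offset(p, 7))     # an ordinary sieve prime
print(first_sieve_offset(p, p))     # issue 1: no offset exists
k = first_sieve_offset(p, 23)       # issue 2: 23 > 2p and 23 divides M11
print(k, 2 * k * p + 1, pow(2, p, 23) == 1)
```

With q = 23 the first "hit" is the candidate 23 itself, which is a real factor of M11, so a sieve that blindly strikes it out would discard a factor.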

Another potential issue is the GPU code itself. Precalculation (mfaktc-0.18/src/tf_common.cu starting at line 89) might be too greedy for small exponents. Big bit_min with small exponents needs to be reviewed.

And another potential issue: floating point accuracy for the approximation in the long divisions needs to be checked for small exponents (small factor candidates).

Because there is little use in doing more TF on such small exponents, this isn't high priority on my todo list, sorry!

Oliver
Old 2012-04-22, 17:08   #1762
c10ck3r
 
Aug 2010
Kansas

547₁₀ Posts

Taking 84M to 67 bits, just finished taking 334M up a bitlevel.
Edit: Taking 84M to 67 and finishing the other half of 334M to 66.

Last fiddled with by c10ck3r on 2012-04-22 at 17:16 Reason: I lied :/
Old 2012-04-24, 15:39   #1763
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

4692₁₀ Posts
My first time...and not so lucky

Tinkering with my GeForce 8400GS
Does that sound kinky?

Anyway, I'm getting suspicious errors:

Code:
running a simple selftest...
ERROR: cudaGetLastError() returned 6: the launch timed out and was terminated
This is after about 45 seconds, at which time the screen goes blank and then Windows 7 64-bit pops up a window in the taskbar that says something like: Display Driver stopped responding and was recovered.

Only one time out of about 7 tries did it pass the selftest, and then it said it could not read the worktodo.txt file.

2 questions:

1. I assume it should be in the same directory as mfaktc?
2. From the readme there is this sample line: Factor=bla,66362159,64,68
Is the bla, required?

Per James's suggestion I changed NumStreams to 2 but my GUI is still quite laggy.
Old 2012-04-24, 15:52   #1764
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

11×311 Posts

Quote:
Originally Posted by petrw1
Is the bla, required?
Bla is not required. mfaktc is pretty good at ignoring invalid worktodo entries now, but the assignment key should either be omitted entirely, be "N/A", or be a 32-char hex string:
Code:
Factor=66362159,64,68
Factor=N/A,66362159,64,68
Factor=6F466E3E1BCBC848ACA66E45ACBAC5FD,66362159,64,68
I'm sure you're aware, but the 8400 GS is a feeble card; depending on the CPU it's paired with, it could potentially achieve less combined TF throughput than the CPU alone...

The #1 way to make compute 1.x cards usable with mfaktc is to set GridSize=0 (anything larger than that will lag noticeably). The values for streams are far less important.
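
For anyone who wants to check a worktodo.txt before feeding it to mfaktc, here is a rough Python sketch that accepts the three Factor= forms listed above. It is not mfaktc's own parser (which is more forgiving, as noted), and the name `parse_factor_line` is just an illustration.

```python
import re

# Assignment key as described above: either "N/A" or a 32-char hex string.
_KEY = re.compile(r"^(N/A|[0-9A-Fa-f]{32})$")


def parse_factor_line(line):
    """Return (key, exponent, bit_min, bit_max), or None if the line is rejected.

    Stricter than mfaktc itself: a junk key such as "bla" is rejected here.
    """
    line = line.strip()
    if not line.startswith("Factor="):
        return None
    fields = line[len("Factor="):].split(",")
    if len(fields) == 4:
        key, rest = fields[0], fields[1:]
        if not _KEY.match(key):
            return None
    elif len(fields) == 3:
        key, rest = None, fields      # no assignment key at all
    else:
        return None
    try:
        exponent, bit_min, bit_max = (int(f) for f in rest)
    except ValueError:
        return None
    return key, exponent, bit_min, bit_max


for sample in (
    "Factor=66362159,64,68",
    "Factor=N/A,66362159,64,68",
    "Factor=6F466E3E1BCBC848ACA66E45ACBAC5FD,66362159,64,68",
    "Factor=bla,66362159,64,68",
):
    print(sample, "->", parse_factor_line(sample))
```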
Old 2012-04-24, 15:58   #1765
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2²·3·17·23 Posts

Quote:
Originally Posted by James Heinrich
I'm sure you're aware, but the 8400 GS is a feeble card, depending on the CPU it's paired with it could potentially achieve less combined TF throughput than just the CPU alone...
Thanks...I suspected as much. I just wanted to see if I could get it to work.

It is paired with an i5-750 OC'd to 3.20 GHz.
Old 2012-04-24, 17:06   #1766
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

11·311 Posts

Quote:
Originally Posted by petrw1
It is paired with a i5-750 OC'd 3.20
Your i5-750 can do about 3.75 GHz-days/day of TF. Your 8400 GS can do about 3.1 or 2.4 GHz-days/day (depending on the revision). That means that if you spend any more than 82% or 64% (respectively) of a single CPU core feeding mfaktc, you're losing throughput.

On my slower system I have an 8800GT. I've locked mfaktc on SievePrimes=5000, which isn't optimal for GPU throughput but leaves 90% of the CPU core free to work on P-1.
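
The break-even percentages above are just the ratio of the two throughput figures. A quick Python sketch of the arithmetic, assuming the 3.75 GHz-days/day figure is for one CPU core (the function names are made up for this example):

```python
def break_even_core_fraction(gpu_ghzd_per_day, cpu_core_ghzd_per_day):
    """Fraction of one CPU core at which feeding the GPU costs as much
    CPU TF throughput as the GPU delivers."""
    return gpu_ghzd_per_day / cpu_core_ghzd_per_day


def net_gain(gpu_ghzd_per_day, cpu_core_ghzd_per_day, core_fraction_used):
    """GPU throughput minus the CPU TF throughput given up to feed it."""
    return gpu_ghzd_per_day - core_fraction_used * cpu_core_ghzd_per_day


CPU_CORE = 3.75                      # i5-750, GHz-days/day (assumed per core)
for gpu in (3.1, 2.4):               # the two 8400 GS revisions
    frac = break_even_core_fraction(gpu, CPU_CORE)
    print(f"GPU {gpu} GHz-d/day: break-even at {frac:.1%} of a core")

# Using a full core to feed the faster revision is a net loss:
print(net_gain(3.1, CPU_CORE, 1.0))
```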
Old 2012-04-25, 16:02   #1767
TheJudger
 
"Oliver"
Mar 2005
Germany

1111₁₀ Posts

petrw1: the 8400GS might be too slow for real jobs, but if you just want to see how/if mfaktc works it will still do the job. As James recommended, lower the GridSize; otherwise there is a high chance that you'll trigger the watchdog for blocked drivers on Windows, since the 8400GS is really slow.

Oliver
Old 2012-04-30, 20:58   #1768
c10ck3r
 
Aug 2010
Kansas

547₁₀ Posts
Por favor...

Quote:
Originally Posted by James Heinrich
*SNIP*I've locked mfaktc on SievePrimes=5000, which isn't optimal for GPU throughput but leaves 90% of the CPU core free to work on P-1.
Mein Herr, could you (or another) explain what "SievePrimes" does? I assume (and cringe at the word) that it sieves the k values for, say, values that are divisible by 7 (6 mod 7 for exponents ending in 9), but I'm not sure. Also, please try to explain as simply as possible, I'm learning as I go :)
Thanks!
Johann
Old 2012-04-30, 21:12   #1769
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by c10ck3r
Mein Herr, could you (or another) explain what "SievePrimes" does? I assume (and cringe at the word) that it sieves the k values for, say, values that are divisible by 7 (6 mod 7 for exponents ending in 9), but I'm not sure. Also, please try to explain as simply as possible, I'm learning as I go :)
Thanks!
Johann
It's the number of primes used to sieve the candidates n=2kp+1. The more primes you sieve with, the more composite n you eliminate, but of course the law of diminishing returns applies. The important fact is that sieving candidates is done on the CPU, while actually trying candidates happens on the GPU, so the SP count effectively sets how much work the CPU has to do before a candidate is sent to the GPU. If the CPU can't keep up with the GPU, lower SievePrimes so the CPU does less work; if the GPU can't keep up, increase SP so the CPU does more work.
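
A toy Python version of the idea, for illustration only (mfaktc's real sieve is a segmented sieve in C and looks nothing like this; `first_odd_primes` and `surviving_k` are made-up names). It keeps only those k for which n=2kp+1 is not divisible by any of the first `sieve_primes` odd primes:

```python
def first_odd_primes(count):
    """First `count` odd primes (2 is skipped because 2kp+1 is always odd)."""
    primes = []
    c = 3
    while len(primes) < count:
        if all(c % s for s in primes):
            primes.append(c)
        c += 2
    return primes


def surviving_k(p, sieve_primes, k_max):
    """k in 1..k_max whose candidate n = 2kp+1 survives the sieve.

    A candidate that is itself one of the small primes is kept.
    """
    small = first_odd_primes(sieve_primes)
    survivors = []
    for k in range(1, k_max + 1):
        n = 2 * k * p + 1
        if all(n % s != 0 or n == s for s in small):
            survivors.append(k)
    return survivors


p = 66362159                          # exponent from the readme sample line
for sp in (5, 50, 500):
    left = len(surviving_k(p, sp, 20000))
    print(f"SievePrimes={sp}: {left} of 20000 candidates left for the GPU")
```

Each extra batch of primes removes fewer candidates than the previous one, which is the diminishing return mentioned above: more CPU work per candidate saved.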

Last fiddled with by Dubslow on 2012-04-30 at 21:31 Reason: i always quick reply without quoting
Old 2012-04-30, 21:12   #1770
Brain
 
Dec 2009
Peine, Germany

331 Posts
Quick short reply

Quote:
Originally Posted by c10ck3r
Mein Herr, could you (or another) explain what "SievePrimes" does? I assume (and cringe at the word) that it sieves the k values for, say, values that are divisible by 7 (6 mod 7 for exponents ending in 9), but I'm not sure. Also, please try to explain as simply as possible, I'm learning as I go :)
Thanks!
Johann
You can find this explanation on page 2 of this thread.
Old 2012-04-30, 21:18   #1771
TObject
 
Feb 2012

3⁴·5 Posts

What is the “CPU Wait”? The bigger the % the worse the CPU is keeping up? Or is it the other way around?

Thanks

Last fiddled with by TObject on 2012-04-30 at 21:25 Reason: Remove MS Word formatting

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.