mersenneforum.org  

Old 2011-07-06, 02:20   #1057
apsen

Quote:
Originally Posted by Christenson View Post
If you wade far enough back in this thread (it's 700 posts long!), you will find the early versions of mfaktc.
I've gone through the whole thread, and the latest version that worked for me was 0.8. The later versions seem to be compiled for cc1.1 (compute capability 1.1) or higher. But when I compile 0.8 for Win64 on my own, I get "cudaStreamCreate() failed". What might I be doing wrong?

Old 2011-07-06, 04:25   #1058
Christenson

Quote:
Originally Posted by apsen View Post
I've gone through the whole thread, and the latest version that worked for me was 0.8. The later versions seem to be compiled for cc1.1 (compute capability 1.1) or higher. But when I compile 0.8 for Win64 on my own, I get "cudaStreamCreate() failed". What might I be doing wrong?
Hopefully you gave the locations for those versions to Rodrigo's thread...otherwise someone else in your shoes will end up walking the same long mile.

At a guess, you need a copy of cudart.dll in the same directory as your executable and/or current working directory. See Rodrigo's thread for the pointers.

At a second guess, your card can only support one stream at a time, so try telling it to use only one stream.

To really know, you will need to get the error code from cudaStreamCreate, which will require you to program a bit. You will then have to go look it up in the Nvidia documentation. I can modify the code, but can't compile for Win32. Let me know if I need to do that.

Old 2011-07-06, 13:34   #1059
apsen

Quote:
Originally Posted by Christenson View Post
Hopefully you gave the locations for those versions to Rodrigo's thread...otherwise someone else in your shoes will end up walking the same long mile.
No. I don't really have an answer yet. So far I could only tell that version 0.8 seems to be the best bet.

Quote:
Originally Posted by Christenson View Post
At a guess, you need a copy of cudart.dll in the same directory as your executable and/or current working directory. See Rodrigo's thread for the pointers.

At a second guess, your card can only support one stream at a time, so try telling it to use only one stream.

To really know, you will need to get the error code from cudaStreamCreate, which will require you to program a bit. You will then have to go look it up in the Nvidia documentation. I can modify the code, but can't compile for Win32. Let me know if I need to do that.
There's a problem with my build. The downloaded 0.8 works; mine doesn't. The return code from cudaStreamCreate is 10200, which is outside the range of codes defined in cuda.h.

To address your specific points:
Yes I have cudart.dll - without it the program will not even start.
The downloaded 0.8 works with 3 streams.

It just occurred to me that I may need to use an even older CUDA toolkit...

Maybe kjaget could chime in...

BTW 0.8 is posted in post #280.

Last fiddled with by apsen on 2011-07-06 at 13:36 Reason: Additional information

Old 2011-07-06, 16:13   #1060
Christenson

10200 = 0x27D8... Are you sure you have the right return type declared for cudaStreamCreate?

If you are just trying to run mfaktc, I'd be inclined to ignore the "I can't build it" problem. What do you hope to do with the modification?

Old 2011-07-09, 13:23   #1061
TheJudger

Quote:
Originally Posted by Prime95 View Post
I think that's unlucky but not suspicious. Xyzzy's was worrisome because it involved 9000 tests.
Some data from my latest two runs (regular TF runs as assigned from primenet server in M58.xxx.xxx to M60.4xx.xxx)

1st batch
  • 1956 assignments from 2^69 to 2^70
  • 1932 no factor results
  • 24 factor results (25 factors, one exponent has 2 factors between 2^69 and 2^70)
Expected number of factors: 1956/69 = ~28.35

2nd batch
  • 2089 assignments from 2^69 to 2^70
  • 2051 no factor results
  • 38 factor results (38 factors)
Expected number of factors: 2089/69 = ~30.28

I feel comfortable with these results.

These runs included 300+ "no factor results" in a row as well as "5 factors from ~50 assignments".
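
A rough check of the two batches, as a sketch only: it assumes each assignment is an independent trial with success probability 1/69 (the figure used above) and counts "factor results" rather than individual factors. The function name `summarize` is just an illustrative choice.

```python
import math

P_FACTOR = 1 / 69  # per-assignment success probability, as used in the post


def summarize(assignments, factor_results, p=P_FACTOR):
    """Return (expected, std_dev, z_score) under a simple binomial model."""
    expected = assignments * p
    std_dev = math.sqrt(assignments * p * (1 - p))
    z_score = (factor_results - expected) / std_dev
    return expected, std_dev, z_score


# (assignments, factor results) taken from the two batches listed above
batches = {"1st batch": (1956, 24), "2nd batch": (2089, 38)}

for name, (n, found) in batches.items():
    expected, std_dev, z = summarize(n, found)
    print(f"{name}: expected {expected:.2f} +/- {std_dev:.2f}, "
          f"found {found}, z = {z:+.2f}")
```

Under this model a z-score within about two standard deviations of zero would usually be read as unremarkable.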

Oliver

Last fiddled with by TheJudger on 2011-07-09 at 13:23

Old 2011-07-10, 03:11   #1062
davieddy
Putting some flesh on the bones

Quote:
Originally Posted by TheJudger View Post
Some data from my latest two runs (regular TF runs as assigned from primenet server in M58.xxx.xxx to M60.4xx.xxx)

1st batch
  • 1956 assignments from 2^69 to 2^70
  • 1932 no factor results
  • 24 factor results (25 factors, one exponent has 2 factors between 2^69 and 2^70)
Expected number of factors: 1956/69 = ~28.35

2nd batch
  • 2089 assignments from 2^69 to 2^70
  • 2051 no factor results
  • 38 factor results (38 factors)
Expected number of factors: 2089/69 = ~30.28

I feel comfortable with these results.

These runs included 300+ "no factor results" in a row as well as "5 factors from ~50 assignments".

Oliver
I feel comfortable that your results in no way make one doubt
the hypothesis that you conducted 4045 independent trials,
the probability of a "success" being 1/69. (See the Kamasutra).

2 factors in one trial? 1/69^2 = 1/4761. Found one. Tick.

Expected 60 successes +/- 8. Found 62. Tick.

Expected number of "gaps">300 = (68/69)^300 * 60 = 0.75
Probability of no such gaps = e^-0.75 = 0.47. Found one. Tick.

Probability of a gap <28 ~1/3. For 4 such gaps in succession,
expected total gap ~50.
Probability of 4 or more such gaps in a row =1/81
Found one in 60. Tick.
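
The figures above can be reproduced with a few lines, as a sketch under stated assumptions: 4045 independent trials at p = 1/69, the rounded count of 60 successes, and "gap < 28" read as "the next success arrives within 27 trials" (that reading is an assumption; shifting it by one trial changes the result only slightly).

```python
import math

P = 1 / 69       # success probability per trial, as assumed in the post
TRIALS = 4045    # 1956 + 2089 assignments
SUCCESSES = 60   # rounded expected count used in the post

# Mean and standard deviation of the number of successes (binomial model).
mean = TRIALS * P
std_dev = math.sqrt(TRIALS * P * (1 - P))

# A gap longer than 300 means 300 failures in a row after a success.
p_long_gap = (1 - P) ** 300
expected_long_gaps = p_long_gap * SUCCESSES
p_no_long_gap = math.exp(-expected_long_gaps)  # Poisson approximation

# Assumed reading of "gap < 28": next success within 27 trials.
p_short_gap = 1 - (1 - P) ** 27
p_four_short = p_short_gap ** 4

print(f"successes: {mean:.1f} +/- {std_dev:.1f}")
print(f"expected gaps > 300: {expected_long_gaps:.2f}")
print(f"P(no gap > 300): {p_no_long_gap:.2f}")
print(f"P(gap < 28): {p_short_gap:.3f}, four in a row: {p_four_short:.4f}")
```

The short-gap probability comes out a little under 1/3, so the 1/81 figure is a rounded value.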

Thoughtful comments on this analysis welcome.

David

Last fiddled with by davieddy on 2011-07-10 at 03:37

Old 2011-07-10, 04:17   #1063
davieddy

Quote:
Originally Posted by davieddy View Post
2 factors in one trial? 1/69^2
From experience, this might be off by a factor of 2 either way.
Enough thinking for now!

Old 2011-07-10, 07:56   #1064
davieddy

Quote:
Originally Posted by davieddy View Post
Probability of a gap <28 ~1/3. For 4 such gaps in succession,
expected total gap ~50.
Probability of 4 or more such gaps in a row =1/81
Expect 60*2/3=40 "long" gaps (>= 28).
Each of them has a 1/81 chance of being followed by 4+ "short" gaps.
So expected runs of 4+ short gaps is 0.5

This might seem a strange way to approach the question of
finding 5 factors in ~50 tests, but the Poisson distribution tells us
that if we expect 50/69 factors in a randomly selected 50 tests,
the probability of 5+ factors is 0.0009. 4045/50 = 81, so this way we
would expect 0.0729 occurrences of 5 in 50. It is clear to see why this
underestimates the likelihood of finding some run of 5 factors in 50 tests,
but very hard to see how to adjust it.
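
One way to get a feel for the size of the adjustment is to simulate it. The sketch below first reproduces the Poisson tail and the disjoint-window estimate, then runs a Monte Carlo over sliding windows. The simulation is a swapped-in technique, not part of the analysis above, and it assumes 4045 independent trials at p = 1/69; all function names are illustrative.

```python
import math
import random

P = 1 / 69      # success probability per test, as assumed in the thread
TRIALS = 4045   # total number of tests
WINDOW = 50     # window length in tests
NEEDED = 5      # factors required inside one window


def poisson_tail(lam, k):
    """P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1 - below


def has_cluster(rng):
    """Simulate one run; True if some WINDOW consecutive tests hold NEEDED successes."""
    positions = []
    pos = 0
    while True:
        # Geometric gap to the next success (inverse-transform sampling).
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is defined
        pos += int(math.log(u) / math.log(1 - P)) + 1
        if pos > TRIALS:
            break
        positions.append(pos)
    # NEEDED successes fit in one window if first and last are < WINDOW apart.
    return any(
        positions[i + NEEDED - 1] - positions[i] < WINDOW
        for i in range(len(positions) - NEEDED + 1)
    )


def estimate_cluster_probability(runs=2000, seed=1):
    """Fraction of simulated runs that contain at least one cluster."""
    rng = random.Random(seed)
    return sum(has_cluster(rng) for _ in range(runs)) / runs


tail = poisson_tail(WINDOW * P, NEEDED)
naive = (TRIALS / WINDOW) * tail
print(f"P(5+ in a fixed window of 50): {tail:.4f}")
print(f"disjoint-window estimate: {naive:.4f}")
print(f"simulated P(at least one cluster): {estimate_cluster_probability():.3f}")
```

The disjoint-window figure only counts windows that start at multiples of 50, while the simulation also catches clusters that straddle those boundaries, which is why it should come out larger.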

Note that this problem is the same as judging how lucky GIMPS
has been to find 7 "short" gaps in a row between Mprimes. In this case,
50% of gaps are short and 50% are long (the boundary being an
exponent ratio of ~1.3)
My conclusion is that we expect 1 such run in 256 Mprimes:
lucky yes, outrageous no.
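
Both run counts follow the same recipe: a long gap followed by k short ones. A minimal sketch, assuming gaps are independent and using a hypothetical helper `expected_short_runs`; exact fractions keep the arithmetic visible.

```python
from fractions import Fraction


def expected_short_runs(n_gaps, p_short, run_length):
    """Expected number of places where a long gap is followed by
    at least `run_length` short gaps, assuming independent gaps."""
    p_long = 1 - p_short
    return n_gaps * p_long * p_short ** run_length


# Trial factoring case: 60 gaps, P(short) = 1/3, runs of 4+ short gaps.
tf_runs = expected_short_runs(60, Fraction(1, 3), 4)

# Mersenne prime case: P(short) = 1/2, runs of 7+ short gaps, per gap.
mprime_rate = expected_short_runs(1, Fraction(1, 2), 7)

print(f"TF: {tf_runs} = {float(tf_runs):.2f} expected runs")
print(f"Mprimes: {mprime_rate} per gap")
```

The first value rounds to the 0.5 quoted above, and the second is the "1 in 256" figure.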

Quote:
Originally Posted by davieddy View Post
From experience, 1/69^2 might be off by a factor of 2
Pretty sure it should be 1/2! * 1/69^2 (Poisson again)
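
A quick numerical check of the 1/2! correction, as a sketch: it assumes the number of factors per exponent in one bit level is Poisson with mean 1/69, which is the model implied above rather than an established fact.

```python
import math


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam = 1 / 69                       # assumed mean number of factors per trial
p_two = poisson_pmf(lam, 2)        # probability of exactly two factors
naive = lam**2                     # the uncorrected 1/69^2 estimate
expected_doubles = 4045 * p_two    # expected double-factor trials in 4045

print(f"P(2 factors) = 1/{1 / p_two:.0f}")
print(f"naive / Poisson = {naive / p_two:.2f}")
print(f"expected double-factor trials: {expected_doubles:.2f}")
```

The ratio of the naive estimate to the Poisson value is close to 2, which is the factor in question.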

David

Last fiddled with by davieddy on 2011-07-10 at 08:37

Old 2011-07-13, 15:26   #1065
apsen

On my 4-core system, mfaktc 0.17 performance suffers if something is running on the other cores. I do not see that effect on a 2-core system with mfaktc 0.8.

For example:

If I run only mfaktc 0.17 (on core #4), I get about 93M/s.
If I start one Prime95 worker on another core (say #1), it drops to about 75M/s.
If I start two Prime95 workers on other cores (say #1 and #2), it drops to about 69M/s.
If I start three Prime95 workers on other cores (say #1, #2 and #3), it drops to about 35M/s.

A fourth worker has almost no effect, as mfaktc runs at a higher priority.

Win 7 x64, Q6600, GTX 465, mfaktc 0.17.

The other computer (Win 7 x64, E5200, 8800 GTS, mfaktc 0.8) has a consistent output of about 26M/s no matter whether Prime95 is running or not.


Last fiddled with by apsen on 2011-07-13 at 15:33

Old 2011-07-13, 16:02   #1066
James Heinrich

Quote:
Originally Posted by apsen View Post
Q6600
That's your problem. I also have a Q6600 and the multi-core performance is horrible. If you check your Prime95 performance on 1/2/3/4 cores you'll notice it will also scale badly -- as you load more cores the throughput of each drops.

Old 2011-07-13, 17:07   #1067
apsen

Quote:
Originally Posted by James Heinrich View Post
I also have a Q6600 and the multi-core performance is horrible. If you check your Prime95 performance on 1/2/3/4 cores you'll notice it will also scale badly -- as you load more cores the throughput of each drops.
What is the best overall performance? Use 3 cores and leave one idle?

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.