#1057 |
|
Jun 2011
131 Posts |
I went through the whole thread, and the latest version that worked for me was 0.8. The later versions seem to be compiled for cc1.1 or higher. But when I tried to compile 0.8 for Win64 on my own, I got "cudaStreamCreate() failed". What might I be doing wrong?
|
|
|
|
|
|
#1058 | |
|
Dec 2010
Monticello
5×359 Posts |
Quote:
At a guess, you need a copy of cudart.dll in the same directory as your executable and/or current working directory. See Rodrigo's thread for the pointers. At a second guess, your card can only support one stream at a time, so try telling it to use only one stream. To really know, you will need to get the error code from cudaStreamCreate, which will require you to program a bit. You will then have to go look it up in the Nvidia documentation. I can modify the code, but can't compile for Win32. Let me know if I need to do that. |
|
|
|
|
|
|
#1059 | ||
|
Jun 2011
10000011₂ Posts |
Quote:
Quote:
To address your specific points: Yes, I have cudart.dll; without it the program will not even start. The downloaded 0.8 works with 3 streams. It just occurred to me that I may need to use an even older CUDA toolkit... Maybe kjaget could chime in... BTW, 0.8 is posted in post #280. Last fiddled with by apsen on 2011-07-06 at 13:36 Reason: Additional information |
||
|
|
|
|
|
#1060 |
|
Dec 2010
Monticello
5·359 Posts |
10200 = 27D8....you sure you have the right return-type declared for cudaStreamCreate?
If you are just trying to run mfaktc, I'd be inclined to ignore the "I can't build it" problem. What do you hope to do with the modification? |
|
|
|
|
|
#1061 | |
|
"Oliver"
Mar 2005
Germany
2127₈ Posts |
Quote:
1st batch
2nd batch
I feel comfortable with these results. These runs included 300+ "no factor" results in a row as well as "5 factors from ~50 assignments".
Oliver
Last fiddled with by TheJudger on 2011-07-09 at 13:23 |
|
|
|
|
|
|
#1062 | |
|
"Lucan"
Dec 2006
England
2×3×13×83 Posts |
Quote:
the hypothesis that you conducted 4045 independent trials, the probability of a "success" being 1/69. (See the Kamasutra).
2 factors in one trial? 1/69^2 = 1/4761. Found one. Tick.
Expected 60 successes +/- 8. Found 62. Tick.
Expected number of "gaps" > 300 = (68/69)^300 × 60 = 0.75. Probability of no such gaps: e^-0.75 = 0.47. Found one. Tick.
Probability of a gap < 28: ~1/3. For 4 such gaps in succession, expected total gap ~50. Probability of 4 or more such gaps in a row = 1/81. Found one in 60. Tick.
Thoughtful comments on this analysis welcome.
David
Last fiddled with by davieddy on 2011-07-10 at 03:37 |
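The figures in this post can be rechecked with a few lines of Python. This is only a sketch under the post's own assumptions (4045 independent trials, success chance 1/69, and the rounded count of 60 successes); the function names are made up for illustration, and "gap < 28" is taken here to mean that a factor turns up within 28 trials, which may not be exactly the definition the poster had in mind.

```python
import math

P_FACTOR = 1 / 69   # per-trial chance of a factor, as assumed in the post
N_TRIALS = 4045     # number of trials quoted in the post


def expected_successes(n, p):
    """Mean of a binomial(n, p) count."""
    return n * p


def success_std_dev(n, p):
    """Standard deviation of a binomial(n, p) count."""
    return math.sqrt(n * p * (1 - p))


def prob_gap_longer_than(k, p):
    """Chance that k consecutive trials all fail."""
    return (1 - p) ** k


def prob_gap_shorter_than(k, p):
    """Chance of at least one success within k trials (assumed gap definition)."""
    return 1 - (1 - p) ** k


mean = expected_successes(N_TRIALS, P_FACTOR)
spread = success_std_dev(N_TRIALS, P_FACTOR)
# The post rounds the number of successes (and hence gaps) to 60.
long_gaps = 60 * prob_gap_longer_than(300, P_FACTOR)
p_no_long_gap = math.exp(-long_gaps)   # Poisson chance of seeing zero long gaps
p_short = prob_gap_shorter_than(28, P_FACTOR)
p_four_short = p_short ** 4

print(f"expected successes: {mean:.1f} +/- {spread:.1f}")
print(f"expected gaps > 300: {long_gaps:.2f}, P(none): {p_no_long_gap:.2f}")
print(f"P(short gap): {p_short:.3f}, P(4 in a row): {p_four_short:.4f}")
```

The unrounded mean comes out slightly below the 60 quoted in the post, which does not change any of the "tick" conclusions.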
|
|
|
|
|
|
#1063 |
|
"Lucan"
Dec 2006
England
2×3×13×83 Posts |
|
|
|
|
|
|
#1064 | |
|
"Lucan"
Dec 2006
England
2×3×13×83 Posts |
Quote:
Each of them has a 1/81 chance of being followed by 4+ "short" gaps. So the expected number of runs of 4+ short gaps is 0.5.
This might seem a strange way to approach the question of finding 5 factors in ~50 tests, but the Poisson distribution tells us that if we expect 50/69 factors in a randomly selected 50 tests, the probability of 5+ factors is 0.0009. 4045/50 = 81, so this way we would expect 0.0729 occurrences of 5 in 50. It is easy to see why this underestimates the likelihood of finding some run of 5 factors in 50 tests, but very hard to see how to adjust it.
Note that this problem is the same as judging how lucky GIMPS has been to find 7 "short" gaps in a row between Mprimes. In this case, 50% of gaps are short and 50% are long (the boundary being an exponent ratio of ~1.3). My conclusion is that we expect 1 such run in 256 Mprimes: lucky yes, outrageous no.
Pretty sure it should be 1/2! × 1/69^2 (Poisson again).
David
Last fiddled with by davieddy on 2011-07-10 at 08:37 |
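The Poisson figure quoted above can be reproduced as follows. This is a rough sketch, assuming a Poisson count with mean 50/69 per block of 50 tests and 4045/50 non-overlapping blocks, as in the post; `poisson_tail` is a helper name invented for this example.

```python
import math


def poisson_tail(lam, k):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 minus the lower terms."""
    lower = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1 - lower


lam = 50 / 69                 # expected factors in a block of 50 tests
p_five_plus = poisson_tail(lam, 5)
blocks = 4045 / 50            # non-overlapping blocks of 50 tests
expected_blocks = blocks * p_five_plus

print(f"P(5+ factors in 50 tests): {p_five_plus:.4f}")
print(f"expected such blocks in 4045 tests: {expected_blocks:.3f}")
```

Because the blocks are fixed and non-overlapping, a run of 5 factors that straddles a block boundary is not counted, which is the underestimate the post refers to.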
|
|
|
|
|
|
#1065 |
|
Jun 2011
131 Posts |
On my 4-core system, mfaktc 0.17 performance suffers if something is running on the other cores. I do not see that effect on a 2-core system with mfaktc 0.8. For example: if I run only mfaktc 0.17 (on core #4), I get about 93M/s. If I start one Prime95 worker on another core (say #1), it drops to about 75M/s. If I start two Prime95 workers on other cores (say #1 and #2), it drops to about 69M/s. If I start three Prime95 workers on other cores (say #1, #2 and #3), it drops to about 35M/s. A fourth worker has almost no effect, as mfaktc runs at higher priority. Win 7 x64, Q6600, GTX 465, mfaktc 0.17. The other computer (Win 7 x64, E5200, 8800 GTS, mfaktc 0.8) has a consistent output of about 26M/s whether or not Prime95 is running.
Last fiddled with by apsen on 2011-07-13 at 15:33 |
|
|
|
|
|
#1066 |
|
"James Heinrich"
May 2004
ex-Northern Ontario
23×149 Posts |
|
|
|
|
|
|
#1067 |
|
Jun 2011
203₈ Posts |
What is the best overall performance? Use 3 cores and leave one idle?
|
|
|
|