#1079
Jun 2011
131 Posts
I originally tried to follow the changes since version 0.8, as it was the last stock version that worked on my machine, so it is more probable that I broke something unintentionally.
I've now got a clean mfaktc-0.17 and reapplied my changes directly to it. I've kept them almost minimal (with the exception of enabling it to compile under CUDA 2.2), so it should be easy to diff. I do not see any test failures with it, so please test it on your system to see if you still experience them. Also, the slowdown was due to a synchronous memory copy in the main loop, but on my machine it was not as noticeable: I was losing less than 10% of performance, so I was going to look into it later. I have now reworked that too, and performance is back on par with the stock build. Please check whether I did it the right way.
#1080
"Oliver"
Mar 2005
Germany
11·101 Posts
Hi apsen,
Quote:
Quote:
Oliver
#1081
Jun 2011
131 Posts
Quote:
Quote:
It was late yesterday so I didn't really do it this time. Is there a problem with 2.2?
Last fiddled with by apsen on 2011-07-18 at 12:13
#1082
"Oliver"
Mar 2005
Germany
2127₈ Posts
I don't know of any problems related to mfaktc (except that it doesn't compile as it is now). As CUDA 2.2 is outdated, I don't have any real plans to support it in mfaktc. If the needed changes are trivial and have no side effects, I might try it anyway.
Oliver
#1083
Jun 2011
131 Posts
Quote:
I've made the changes but I won't be able to test them until later today. But maybe you'd be willing to take a look before then to see if my understanding is right. (If you are on German time it will be past midnight for you before I get a chance to test.)
#1084
Jun 2011
83₁₆ Posts
Quote:
Just to make sure I understand it right: blindly replacing the atomics with unprotected access to d_RES can cause a problem only when we find more than one factor per class (tf_class_* call). Even then, the program will report that at least one factor has been found, but the factor(s) themselves may be scrambled by simultaneous attempts to store them in the result array. So if the program reports no factors found, that will be true. Is this correct?
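A minimal C sketch of the scenario described above, replayed sequentially so that it is deterministic. The result layout (count in `res[0]`, two words per factor), the type `factor_t`, and both function names are assumptions made for illustration; they are not taken from the mfaktc source, and `res` only stands in for d_RES.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED layout, for illustration only: res[0] counts stored factors,
 * slot i uses res[1 + 2*i] (high word) and res[2 + 2*i] (low word). */
#define MAX_SLOTS 4
#define RES_WORDS (1 + 2 * MAX_SLOTS)

typedef struct { uint32_t hi, lo; } factor_t;  /* hypothetical two-word factor */

/* Protected variant. The post-increment is a sequential stand-in for an
 * atomic fetch-and-increment: each caller reserves its own slot index. */
void store_protected(uint32_t *res, factor_t f)
{
    uint32_t idx = res[0]++;
    assert(idx < MAX_SLOTS);
    res[1 + 2 * idx] = f.hi;
    res[2 + 2 * idx] = f.lo;
}

/* Unprotected variant, written out as ONE possible bad interleaving of two
 * threads A and B that both find a factor in the same class. */
void store_unprotected_interleaved(uint32_t *res, factor_t a, factor_t b)
{
    uint32_t idx_a = res[0];      /* A reads the count                       */
    uint32_t idx_b = res[0];      /* B reads the same count before A updates */
    res[1 + 2 * idx_a] = a.hi;    /* A writes its high word                  */
    res[1 + 2 * idx_b] = b.hi;    /* B overwrites it                         */
    res[2 + 2 * idx_b] = b.lo;    /* B writes its low word                   */
    res[2 + 2 * idx_a] = a.lo;    /* A overwrites it                         */
    res[0] = idx_a + 1;           /* both store the same new count           */
    res[0] = idx_b + 1;
}
```

Starting from a zeroed array, the protected variant called twice leaves a count of 2 with both factors intact. The interleaved variant leaves a count of 1 and a slot holding B's high word with A's low word: a factor is reported, but its value is scrambled. If no thread finds a factor, nothing writes to the array and the count stays 0, which matches the "no factors found is always true" reading.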
#1085
"Oliver"
Mar 2005
Germany
2127₈ Posts
correct!
|
#1086
Jun 2011
131 Posts
Is the CPUStreams configuration parameter basically the length of the sieve queue?
|
#1087
"Oliver"
Mar 2005
Germany
11·101 Posts
yes!
btw.: please change the version string in your modified code to something unique, e.g. "0.17-ap1".
Oliver
Last fiddled with by TheJudger on 2011-07-18 at 20:15
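The reading of CPUStreams confirmed above (the length of the sieve queue) can be pictured as a small ring of buffers that the CPU sieve fills and the GPU side drains. The sketch below is only a model of that idea: `sieve_queue`, `queue_init`, `queue_push`, `queue_pop`, and `MAX_STREAMS` are hypothetical names, and nothing here is taken from the mfaktc source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a bounded sieve queue; not mfaktc code. */
#define MAX_STREAMS 8

typedef struct {
    int    slots[MAX_STREAMS]; /* id of the sieved block held in each slot  */
    size_t capacity;           /* plays the role of CPUStreams              */
    size_t head;               /* next slot the GPU side will consume       */
    size_t count;              /* slots currently filled by the CPU sieve   */
} sieve_queue;

void queue_init(sieve_queue *q, size_t cpu_streams)
{
    assert(cpu_streams >= 1 && cpu_streams <= MAX_STREAMS);
    q->capacity = cpu_streams;
    q->head = 0;
    q->count = 0;
}

/* CPU side: returns 1 if the block was queued, 0 if the queue is full
 * (in which case the sieve would have to wait for the GPU). */
int queue_push(sieve_queue *q, int block_id)
{
    if (q->count == q->capacity)
        return 0;
    q->slots[(q->head + q->count) % q->capacity] = block_id;
    q->count++;
    return 1;
}

/* GPU side: returns 1 and the oldest queued block, or 0 if nothing is ready. */
int queue_pop(sieve_queue *q, int *block_id)
{
    if (q->count == 0)
        return 0;
    *block_id = q->slots[q->head];
    q->head = (q->head + 1) % q->capacity;
    q->count--;
    return 1;
}
```

In this model a queue initialised with 3 accepts three blocks and refuses a fourth until one has been popped. A larger value lets the CPU sieve run further ahead of the GPU, at the cost of more buffers.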
#1088
"Oliver"
Mar 2005
Germany
11×101 Posts
Hi Eric,
Quote:
Oliver
#1089
Jun 2011
10000011₂ Posts
|