#1618 | ||
|
Nov 2010
Germany
255₁₆ Posts |
Quote:
Regarding the pure question "why not lower SievePrimes": because it's not tested. As long as SievePrimes remains above SIEVE_SPLIT (250), the code will probably work correctly. However, before mfakt[co] is released to the public, some more tests are performed. You may have seen the CHECKS_MODBASECASE #define. Completing the full selftest at a certain SievePrimes value is a prerequisite, but it is not sufficient.
OK, now that this is out, I can tell you that some time ago I finished a few tests with CHECKS_MODBASECASE at SievePrimes=256, and they did not show any errors. But it was not the full test, and it was mfakto, which runs different kernels. I also briefly tested a special version of mfakto that skips sieving and memory transfer to the GPU completely, testing all candidates of a class. This one also successfully completed a few rounds of CHECKS_MODBASECASE tests. So I'm quite confident that the full tests with lower SievePrimes would pass, but so far nobody has run them.
Quote:
My first approach was so terribly slow that I could only sieve the first 256 primes to get the 30M/s (sieve output for the siever alone) on a GPU that otherwise runs ~150M/s through factoring. I was so disappointed that I gave up ... Your results, however ... |
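For readers who have not looked at the sieving step: here is a minimal Python sketch of what a SievePrimes-style parameter controls. It assumes the standard facts about Mersenne trial factoring (any factor q of 2^p - 1 has the form q = 2kp + 1 with q ≡ ±1 mod 8) and is not the mfaktc or mfakto implementation. The function names and the `sieve_primes` argument are made up for illustration. Sieving with fewer small primes costs less work up front but lets more composite candidates through to the expensive modular test.

```python
def first_odd_primes(n):
    """Return the first n odd primes (3, 5, 7, ...) by simple trial division."""
    primes = []
    candidate = 3
    while len(primes) < n:
        # candidate is prime if no smaller prime up to its square root divides it
        if all(candidate % q for q in primes if q * q <= candidate):
            primes.append(candidate)
        candidate += 2
    return primes


def sieve_candidates(p, k_start, k_count, sieve_primes):
    """Return the k values whose candidate q = 2*k*p + 1 survives sieving.

    sieve_primes is a hypothetical stand-in for the SievePrimes setting:
    the number of small odd primes used to discard composite candidates.
    """
    small = first_odd_primes(sieve_primes)
    survivors = []
    for k in range(k_start, k_start + k_count):
        q = 2 * k * p + 1
        # Factors of 2^p - 1 must be 1 or 7 modulo 8.
        if q % 8 not in (1, 7):
            continue
        # Drop q if a small prime divides it (unless q is that prime itself).
        if any(q % s == 0 and q != s for s in small):
            continue
        survivors.append(k)
    return survivors


def is_factor(p, k):
    """Check whether q = 2*k*p + 1 divides 2^p - 1 (the costly per-candidate step)."""
    q = 2 * k * p + 1
    return pow(2, p, q) == 1


if __name__ == "__main__":
    p = 11  # 2^11 - 1 = 2047 = 23 * 89
    for n in (0, 5):
        ks = sieve_candidates(p, 1, 10, n)
        hits = [k for k in ks if is_factor(p, k)]
        print(f"sieve_primes={n}: {len(ks)} candidates tested, factors at k={hits}")
```

With no sieving, more candidates reach `is_factor`; with a handful of sieve primes, fewer do, and the same factors are still found. That is the trade-off being argued about, at toy scale.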
#1619 | |
|
Nov 2010
Germany
3×199 Posts |
Quote:
Last fiddled with by Bdot on 2012-03-05 at 20:19 Reason: if I had asked |
#1620 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
Perhaps call it Eq. GHz, for Equivalent GHz (to a Core2).
@rcv: What a wonderful defense! I like where this is going. I'd volunteer to do some 460 testing, but unfortunately I cannot for the life of me upgrade my Linux Nvidia drivers past 270.xx, so I'm stuck with mfaktc 0.17.
#1621 |
|
"Jerry"
Nov 2011
Vancouver, WA
1,123 Posts |
#1622 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
7537₁₀ Posts |
#1623 | |
|
Nov 2010
Germany
3×199 Posts |
Quote:
2. Run the full selftest with that binary with fixed, low SievePrimes
3. Analyze the output for any CHECKS_MODBASECASE violations (and of course, all factors need to be found).
But before we all run and do something just for the sake of doing something: Maybe go back and think again about "Why do we want lower SievePrimes?" and "Do we really want that?". I haven't quite understood that part yet.
Last fiddled with by Bdot on 2012-03-05 at 23:28 Reason: fix SievePrimes |
#1624 | ||
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
1110000110101₂ Posts |
Quote:
Quote:
Last fiddled with by Dubslow on 2012-03-06 at 02:57 Reason: (My emphasis.) |
#1625 |
|
Jun 2005
129₁₀ Posts |
Not sure which side you meant the benefit was, but here's my take.
Considering how much more efficient GPUs are at generating GHz-days of work, trading 100% of 1 CPU for a 14% speedup in GPU production feels like a net win to me. If you choose to measure it that way, of course. For a 560ti, that's ~25 GHz-days / day extra throughput. I don't see that a CPU is going to give anywhere near that kind of throughput doing other types of work.
Last fiddled with by kjaget on 2012-03-06 at 04:21 |
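The arithmetic behind that estimate can be checked with a short Python sketch. It uses only the two figures from the post (a 14% speedup and ~25 GHz-days/day extra); the helper names are made up, and the CPU figure in the example call is a placeholder for illustration, not a measurement.

```python
def implied_gpu_baseline(extra_ghz_days_per_day, speedup_fraction):
    """Baseline GPU throughput implied by an absolute gain and a relative speedup."""
    return extra_ghz_days_per_day / speedup_fraction


def is_net_win(extra_gpu_ghz_days_per_day, cpu_core_ghz_days_per_day):
    """True if the extra GPU output exceeds what the CPU core would produce elsewhere."""
    return extra_gpu_ghz_days_per_day > cpu_core_ghz_days_per_day


if __name__ == "__main__":
    extra = 25.0     # GHz-days/day gained on the GPU (figure from the post)
    speedup = 0.14   # 14% speedup (figure from the post)
    baseline = implied_gpu_baseline(extra, speedup)
    print(f"implied GPU baseline: {baseline:.1f} GHz-days/day")

    # Placeholder only: substitute the measured output of one CPU core
    # doing other work (LL, P-1, ...) on your own machine.
    cpu_core_alternative = 3.0
    print("net win:", is_net_win(extra, cpu_core_alternative))
```

The two figures together imply a baseline of roughly 179 GHz-days/day for the card. The trade pays off whenever one CPU core's alternative output is below the ~25 GHz-days/day gained.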
#1626 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
No, it's about half a CPU core, so there'd still be significant CPU use (unless he pushes the GPU sieve thing through as well, which I'm hoping for).
#1627 | |
|
"Oliver"
Mar 2005
Germany
457₁₆ Posts |
Hi rcv,
Quote:
Oliver |
#1628 | |
|
Jun 2011
131 Posts |
Quote:
And low priority does not always help. I have not specifically checked mfaktc, but prime95 would cause hiccups in some programs even at idle priority. In fact, IIRC it has a setting to pause processing when it detects specified programs running.
Last fiddled with by apsen on 2012-03-06 at 14:23 |