[QUOTE=nucleon;243112]I'd go a AMD x6 and nvidia chipset mboard, and just throw more cores at the problem.
-- Craig[/QUOTE] But that won't work because mfaktc's "small" prime sieve is not multithreaded. Someone tried multithreading it with a library earlier, with not-so-good results; but it could be done better. Or better still would be if [thread=11900]the sieve could be put on the GPU[/thread]. |
Hi moebius,
[QUOTE=moebius;243089]I meant of course 12 M/sec for 77M exponents 2^65 to 2^66 with CPU Athlon64@2.5 GHz. And what for a Super-processor is at all worthy of a GTX 570.[/QUOTE]
I have no AMD benchmarks for the sieve part. For a GTX 570 I think you'll need at least ~6GHz i7 single core equivalent (e.g. two cores running at 2.5GHz) to keep the GPU busy with SievePrimes=5000. An 8-9GHz i7 single core equivalent would be desirable to keep SievePrimes at ~30k, and you'll need a 13-14GHz i7 single core equivalent to keep SievePrimes at 100k. In my opinion the "sweet spot" for SievePrimes is somewhere around 20k-30k.
Oliver
P.S. When running multiple instances of mfaktc (using multiple CPU cores), each instance works on its own assignment. |
At some point it must make more sense to put the remaining CPU cycles to work trial factoring (or some other type of work), rather than to sieve just a little more for the GPU, especially with such diminishing returns. Can someone run some benchmarks to measure the candidates / second when sieving to 1000, 2000, etc., 100000? I don't think the CPU type really matters as the shape of the graph would likely be similar for all CPUs. From what Judger has just posted the relationship does not appear to be simply linear (or logarithmic) with sieve depth.
If there's a graph, and it's possible to fit a reasonably accurate curve onto it, then it should be possible for anyone to predict their CPU's candidates / second output for any sieve depth after taking just one or two measurements of their own, or even better, perhaps the software could do it automatically to further optimise throughput.
As for GPU efficiency, that is readily predictable(*), so here is a table for every 5000 increase in sieveprimes with the corresponding percentage of candidates remaining and what that means for the speed of the job on the GPU (relative to sieving to just 5000 primes):
[code]
 sieve  | % candidates | % Speed  |[color=red] % candidates | % Speed[/color]
 depth  | remaining    | Increase |[color=red] remaining    | Increase[/color]
--------|--------------|----------|[color=red]--------------|----------[/color]
   5000 |        3.291 |    0     |[color=red]        2.601 |    0[/color]
  10000 |        3.044 |    8.099 |[color=red]        2.428 |    7.112[/color]
  15000 |        2.917 |   12.803 |[color=red]        2.338 |   11.250[/color]
  20000 |        2.833 |   16.155 |[color=red]        2.278 |   14.175[/color]
  25000 |        2.771 |   18.779 |[color=red]        2.233 |   16.440[/color]
  30000 |        2.722 |   20.892 |[color=red]        2.199 |   18.287[/color]
  35000 |        2.682 |   22.720 |[color=red]        2.170 |   19.847[/color]
  40000 |        2.648 |   24.273 |[color=red]        2.146 |   21.196[/color]
  45000 |        2.619 |   25.663 |[color=red]        2.125 |   22.386[/color]
  50000 |        2.594 |   26.880 |[color=red]        2.107 |   23.448[/color]
  55000 |        2.571 |   27.990 |[color=red]        2.090 |   24.409[/color]
  60000 |        2.550 |   29.035 |[color=red]        2.076 |   25.286[/color]
  65000 |        2.533 |   29.939 |[color=red]        2.062 |   26.091[/color]
  70000 |        2.516 |   30.794 |[color=red]        2.050 |   26.837[/color]
  75000 |        2.500 |   31.624 |[color=red]        2.039 |   27.531[/color]
  80000 |        2.486 |   32.380 |[color=red]        2.029 |   28.180[/color]
  85000 |        2.473 |   33.089 |[color=red]        2.019 |   28.789[/color]
  90000 |        2.460 |   33.754 |[color=red]        2.010 |   29.363[/color]
  95000 |        2.449 |   34.397 |[color=red]        2.002 |   29.906[/color]
 100000 |        2.438 |   34.999 |[color=red]        1.994 |   30.420[/color]
[/code]
(*) I'm not entirely sure what sieveprimes denotes, which is why there's a red section on the table. If sieveprimes means that the sieve runs through all primes less than some value, then look at the black columns. If sieveprimes means that the sieve runs through the first however many primes, then look at the red columns. |
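For anyone wanting to play with numbers like those in the table, here is a minimal sketch. It assumes the fraction of candidates surviving the sieve is proportional to the product of (1 - 1/p) over the primes used. The constant `SCALE = 0.5` is purely my assumption, picked because it lands near the table's scale; the speed-increase columns depend only on ratios, so they are unaffected by it. All function names are hypothetical and nothing here is taken from mfaktc itself.

```python
import math

# Assumed constant factor (not from the thread); it cancels out of the
# speed-increase figures, which only depend on ratios.
SCALE = 0.5


def primes_up_to(limit):
    """Return all primes <= limit using a plain sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            # Clear every multiple of n starting at n*n.
            is_prime[n * n :: n] = bytearray(len(range(n * n, limit + 1, n)))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def first_primes(count):
    """Return the first `count` primes (2, 3, 5, ...)."""
    if count < 6:
        limit = 15
    else:
        # Standard upper bound on the n-th prime, valid for n >= 6.
        limit = int(count * (math.log(count) + math.log(math.log(count)))) + 1
    return primes_up_to(limit)[:count]


def percent_remaining(primes):
    """Percentage of candidates left after sieving with the given primes."""
    frac = SCALE
    for p in primes:
        frac *= 1.0 - 1.0 / p
    return 100.0 * frac


def speed_increase(base_percent, deeper_percent):
    """Percent speed-up on the GPU from testing fewer candidates."""
    return 100.0 * (base_percent / deeper_percent - 1.0)


if __name__ == "__main__":
    base_below = percent_remaining(primes_up_to(5000))
    base_first = percent_remaining(first_primes(5000))
    for depth in range(5000, 100001, 5000):
        below = percent_remaining(primes_up_to(depth))  # "black" reading
        first = percent_remaining(first_primes(depth))  # "red" reading
        print(depth,
              round(below, 3), round(speed_increase(base_below, below), 3),
              round(first, 3), round(speed_increase(base_first, first), 3))
```

The two readings of sieveprimes differ only in which list of primes is passed to `percent_remaining`, so it is easy to compare them side by side.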
[QUOTE=Ken_g6;243115]But that won't work because mfaktc's "small" prime sieve is not multithreaded. Someone tried multithreading it with a library earlier, with not-so-good results; but it could be done better. Or better still would be if [thread=11900]the sieve could be put on the GPU[/thread].[/QUOTE]
I was referring to running more instances of mfaktc. -- Craig |
Hi!
[QUOTE=lavalamp;243149]At some point it must make more sense to put the remaining CPU cycles to work trial factoring (or some other type of work), rather than to sieve just a little more for the GPU, especially with such diminishing returns.[/QUOTE]
I don't think it makes much sense to do TF on the CPU in this configuration. Depending on the exponent and factor sizes, a GTX 570 can do around 200GHzd per day. If adding another 3GHz CPU core yields only 5 percent more throughput for mfaktc, this 3GHz core is worth 10GHzd per day! I don't have an exact number for TF on CPU using prime95, but I would assume that a 3GHz core can't do 10GHzd per day. :wink: Of course, don't put too many cores on TF, do LL instead!
[QUOTE=lavalamp;243149](*)I'm not entirely sure what sieveprimes denotes, which is why there's a red section on the table. If sieveprimes means that the sieve runs through all primes less than some value, then look at the black columns. If sieveprimes means that the sieve runs through the first however many primes, then look at the red columns.[/QUOTE]
The second option (actually the number of odd primes used for sieving).
Oliver |
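The arithmetic in the post above can be written out as a tiny sketch. The function name is hypothetical; the inputs are the rough figures quoted in the post (about 200GHzd per day for the GPU, a 5 percent gain from one more core), not measurements.

```python
def core_worth_ghzd_per_day(gpu_ghzd_per_day, throughput_gain_percent):
    """GHz-days per day that an extra sieving core adds to the GPU's output."""
    return gpu_ghzd_per_day * throughput_gain_percent / 100.0


# Figures quoted in the post: ~200 GHzd/day GPU, +5% from one more core.
print(core_worth_ghzd_per_day(200, 5))  # 10.0
```

If a core doing TF on its own produces less than this figure, feeding the GPU is the better use of that core.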
[QUOTE=TheJudger;241282]Hi vsuite,
...
[CODE]
2^1 to 2^64:  1m  7.489s (GPU load ~60%)
2^64 to 2^65: 1m  4.429s (GPU load ~60%)
2^65 to 2^66: 1m 44.854s (GPU load 60-70%)
2^66 to 2^67: 3m  6.242s (GPU load 70-75%)
[/CODE]
Oliver[/QUOTE]
Just to confirm, mfaktc reports that 66362159 has 3 factors in the 1-64 range, so I guess it is set to not terminate after finding factors, right? |
Hi,
[QUOTE=vsuite;243366]Just to confirm, mfaktc reports that 66362159 has 3 factors in the 1-64 range, so I guess it is set to not terminate after finding factors, right?[/QUOTE]
It depends on what you're doing. If your worktodo.txt looks like this:
[CODE]
Factor=bla,66362159,1,64
Factor=bla,66362159,64,65
Factor=bla,66362159,65,66
Factor=bla,66362159,66,67
Factor=bla,66362159,67,68
Factor=bla,66362159,68,69
Factor=bla,66362159,69,70
[/CODE]
then mfaktc will TF M66362159 from 2^1 to 2^70 no matter whether it finds one or more factors in the first step. If your worktodo.txt looks like this:
[CODE]
Factor=bla,66362159,1,70
[/CODE]
then it depends on a setting in mfaktc.ini:
[CODE]
# possible values for StopAfterFactor:
# 0: Do not stop the current assignment after a factor was found.
# 1: When a factor was found for the current assignment stop after the
#    current bitlevel. This makes only sense when Stages is enabled.
# 2: When a factor was found for the current assignment stop after the
#    current class.
StopAfterFactor=1
[/CODE]
In this case mfaktc will stop the TF attempt of M66362159 at some earlier bitlevel. The smallest bitlevels are merged together into a single run (in this case, in the default configuration of mfaktc, the first run will be M66362159 from 2^1 to 2^67).
Oliver |
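If you want the first behaviour (every bit level run regardless of factors found), the split worktodo.txt entries can be generated rather than typed. A minimal sketch, assuming only the `Factor=id,exponent,bit_lo,bit_hi` line format shown in the post; the function name and the placeholder id "bla" are hypothetical.

```python
def worktodo_lines(exponent, bit_lo, bit_hi, first_stop, assignment_id="bla"):
    """Build one Factor= line for bit_lo..first_stop, then one per bit level."""
    if not bit_lo < first_stop <= bit_hi:
        raise ValueError("need bit_lo < first_stop <= bit_hi")
    lines = [f"Factor={assignment_id},{exponent},{bit_lo},{first_stop}"]
    for bit in range(first_stop, bit_hi):
        lines.append(f"Factor={assignment_id},{exponent},{bit},{bit + 1}")
    return lines


# Reproduces the seven-line example from the post.
print("\n".join(worktodo_lines(66362159, 1, 70, 64)))
```

Because each line is a separate assignment, a factor found in one range does not stop the later ones, whatever StopAfterFactor is set to.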
My manual trial factoring assignments are in the range 3323xxxxx, 71, 72. What is the optimal range for these 71,72 or 71,73 or 71,74 or what? Will mfaktc merge any of the other bit levels? Thanks.
|
[QUOTE=vsuite;243686]My manual trial factoring assignments are in the range 3323xxxxx, 71, 72. What is the optimal range for these 71,72 or 71,73 or 71,74 or what? Will mfaktc merge any of the other bit levels? Thanks.[/QUOTE]
Depending on your level of patience, I would suggest that you use this link: [URL="http://v5www.mersenne.org/report_factoring_effort/?exp_lo=332192831&exp_hi=332249999&bits_lo=0&bits_hi=76&txt=1&exassigned=1&B1=Get+Data"]Free exponents at 76 and below[/URL] to get the first available exponent in the range and then take it to at least 78 if not higher. In the manual check out page you can restrict the exponent range that you request work from. That will allow you to get the expo that you choose. Then don't turn in your results until it is finished all the way. |
Sticky thread?
I suggest a "The latest mfaktc software" thread like [URL="http://www.mersenneforum.org/showthread.php?t=2"]this one[/URL] for Prime95. It will help finding the correct binaries.
Maybe someone can also write a sticky guideline "What can I do with my GPU for PrimeNet"?
Nvidia: Mfaktc (TF), CUDALucas (LL, only Linux?), gpuLucas (LL, not released)
ATI: nothing, goto BOINC (Collatz conjecture)
I don't want to open this discussion here but we have to think of what happens if a Mersenne prime is someday found on a GPU (new thread). All this sw is non Prime95/MPrime but uses PrimeNet infrastructure. I'm looking forward to joining when the GTX 560 is released. |
[QUOTE=Brain;244237]I suggest a "The latest mfaktc software" thread like [URL="http://www.mersenneforum.org/showthread.php?t=2"]this one[/URL] for Prime95. It will help finding the correct binaries.
Maybe someone can also write a sticky guideline "What can I do with my GPU for PrimeNet"?
Nvidia: Mfaktc (TF), CUDALucas (LL, only Linux?), gpuLucas (LL, not released)
ATI: nothing, goto BOINC (Collatz conjecture)[/QUOTE]
I think that this is the right thing to do. |