mersenneforum.org  

Old 2011-12-03, 21:34   #12
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2·11²·47 Posts
Default

Quote:
Originally Posted by Dubslow View Post
Keep in mind that those who are doing mfakt* are doing so at very little cost to what their LL throughput was before. I have four cores; nominally all four do LL. Currently, one of them does mfaktc, which (usually) finds a factor every couple of days; that eliminates exponents much more quickly than using the core for LL. I realize that we can't factor everything, but I'm using my GPU at relatively little cost to my LL throughput. mfakt* hurts CUDALucas throughput, yes, but I would argue that even if we all ran CUDALucas instead of mfakt*, the total increase in LL throughput would be very slight, and I'd wager that our impact would be less than 1% overall. Clearly mfakt* has a much bigger impact on TF and on eliminating exponents.
If I may throw in my two cents' worth... In reference to the GPU to 72 Overall System Progress report...

Even though the conversion coefficient for GPU days to "Normalized GHz Days" continues to be debated, I think it is clear that at the LL "wave-back" GPUs are still better utilized doing TF than LL in the current range and "bit" level at this point in time. This may no longer be the case at the DC "wave-back".

And I can tell you with some authority that those taking true DC and LL work from the system are mostly using CPUs to do the work.

I think we're actually arguing somewhat in a vacuum. As in, we all have our opinions, but we're (mostly) not working from hard data.

I am going to add to the G72 system the ability for individual workers to enter empirical data for their individual systems (CPUs and GPUs) for hardware ability and "wall-clock" times for the various work types and methodologies. With enough data, we should be in a better position to determine exactly where the cost/benefit curves cross.

But I would argue strongly against changing the fundamental GHzDays metric already in use by Prime95 and PrimeNet. After all, a modern CPU can do orders of magnitude more work per day than (for example) an old Pentium. Should the P1 receive the same credit as a modern CPU?

Or, should the same CPU receive different GHzDays credit for the same "wall-clock time" for different work types? Doing so would simply distort the "economic signals" as to what work type had the most benefit.
Old 2011-12-03, 23:23   #13
lycorn
 
"GIMFS"
Sep 2002
Oeiras, Portugal

2·7·113 Posts
Default

Quote:
Originally Posted by chalsall View Post
But I would argue strongly against changing the fundamental GHzDays metric already in use by Prime95 and PrimeNet. After all, a modern CPU can do orders of magnitude more work per day than (for example) an old Pentium. Should the P1 receive the same credit as a modern CPU?

Or, should the same CPU receive different GHzDays credit for the same "wall-clock time" for different work types? Doing so would simply distort the "economic signals" as to what work type had the most benefit.
I fully agree. The point is we need a meaningful unit of measure to allow us to quantify the amount of work done. For that matter, GHz-days is as good a reference as MHz-weeks or GHz-months could be.
If a modern CPU can test, say, 2 exponents in 10 days, whereas an old one can test just one, then the modern CPU has to receive twice the credit for the same "wall clock" time of work. This holds regardless of whether the improvement comes from an increase in clock frequency or cache size, from other architectural enhancements, or from new instructions not available on the old CPU. The same goes for GPUs: if a GPU can trial factor numbers 100x faster than a CPU, it obviously has to get 100 times more credit after both have been working on TF for the same amount of time. If we artificially change the metrics, we will be comparing apples and oranges, and the whole system of credit loses its meaning.
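The proportionality argued for here can be written down as a small sketch. The 80.0 GHz-days figure and the function name are placeholders chosen for illustration, not values taken from PrimeNet:

```python
# Hypothetical fixed credit for one LL test of a given exponent, in GHz-days.
GHZ_DAYS_PER_LL_TEST = 80.0


def credit_earned(tests_completed, ghz_days_per_test=GHZ_DAYS_PER_LL_TEST):
    """Credit depends only on the work completed, never on the hardware."""
    return tests_completed * ghz_days_per_test


# Same 10 days of wall-clock time on two machines.
old_cpu = credit_earned(1)   # old CPU finishes one test
new_cpu = credit_earned(2)   # modern CPU finishes two tests
print(old_cpu, new_cpu, new_cpu / old_cpu)
```

Because the credit per test is fixed, the faster machine earns exactly twice as much for the same wall-clock time, whatever the reason for its speed.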

Last fiddled with by lycorn on 2011-12-03 at 23:25
Old 2011-12-04, 02:12   #14
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

17·487 Posts
Default

Quote:
Originally Posted by PsycO View Post
So, if Prime95 reads these little complaints, I hope something will be done to ease my conscience...
I read many of the forum posts. I'd argue that there is no "correct" solution. If I change the formulas then those that are doing TF using a CPU are penalized.

Perhaps, the solution is to no longer have an overall Top Producers report which mixes CPU credits from hardware platforms that are not really comparable. Rather, we could support only the Top LL, Top Double-check, and Top TF reports.

In any event, remember the main goal is to generate useful mathematical results, not CPU credits.
Old 2011-12-04, 11:43   #15
Brain
 
Dec 2009
Peine, Germany

331 Posts
Default gpuLucas

Maybe Andrew Thall's new gpuLucas will solve the current "problem" with a super-fast-LL-implementation producing dozens of GHz-days per minute...
Old 2011-12-04, 13:51   #16
nucleon
 
Mar 2003
Melbourne

5·103 Posts
Default

I find this recalibration of the GHz-days metric to be a non-discussion. If we change things now, we may have to change it all back when, say, a next-generation GPU card comes out with LL throughput similar to its TF throughput.

Using a GTX560Ti as a benchmark:

LL Test - 11 GHz-days/day (source: GPU FAQ pdf)
TF - 175 GHz-days/day
(source: Last 10 day avg on my PC, 2x 2.9GHz i7 930 cores with sieveprimes=5000, GPU Load=99%)

Taken from:
http://mersenne-aries.sili.net

Using M46447963 as a guide. (Why this number? - it just happens to be one of the exponents I'm currently TF-ing :)

TF 41.1 GHz-days (64 to 74bits) 13.215% chance of factor
LL 80.6 GHz-days (+80.6 GHz-days for double check)

Take these resources:
2x 2.9GHz i7 930 cores
1x 560TI GPU

LL throughput: 11 + 2x3 (ish) = 17GHz-days per day
TF throughput: 175 GHz-days/day


So given enough exponents around this range, we'd need to TF approx 7.4 exponents to get one conclusive result (i.e. 1/0.13215). Put another way, 7.4 x 41.1 = 304.14 GHz-days' worth of TF work is equivalent to 2x LL tests (161.2 GHz-days).

In my mind, finding a factor of an exponent is mathematically equivalent to finding matching non-zero LL residues.

This equates to, when using the above resources:
TF: 1.74 "real days" (using 175GHz-days/day)
LL: 9.48 "real days" (using 17GHz-days/day)

So TF on GPU is 5.45x more efficient than doing LL tests with the equivalent resources in the above example. YMMV.
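The arithmetic above can be reproduced with a short sketch. The constant and function names are made up for illustration; the numbers are the ones quoted in this post. Note that the unrounded reciprocal 1/0.13215 is about 7.57 rather than 7.4, so this version gives roughly 1.78 days for TF and a ratio closer to 5.3x. The conclusion is the same.

```python
# Figures quoted in the post (GTX 560 Ti plus two i7 930 cores, M46447963).
TF_COST_GHZ_DAYS = 41.1      # TF from 64 to 74 bits
FACTOR_CHANCE = 0.13215      # chance that this TF range finds a factor
LL_COST_GHZ_DAYS = 80.6      # one LL test; the double check costs the same again
TF_RATE = 175.0              # GHz-days/day doing TF on the GPU
LL_RATE = 11.0 + 2 * 3.0     # GHz-days/day doing LL on the GPU plus two CPU cores


def days_per_exponent_cleared(factor_chance=FACTOR_CHANCE):
    """Return (tf_days, ll_days, ratio) for clearing one exponent either way."""
    exponents_per_factor = 1.0 / factor_chance          # expected TF attempts per factor
    tf_work = exponents_per_factor * TF_COST_GHZ_DAYS   # GHz-days of TF per factor found
    ll_work = 2 * LL_COST_GHZ_DAYS                      # LL test plus double check
    tf_days = tf_work / TF_RATE
    ll_days = ll_work / LL_RATE
    return tf_days, ll_days, ll_days / tf_days


tf_days, ll_days, ratio = days_per_exponent_cleared()
print(f"TF: {tf_days:.2f} days, LL+DC: {ll_days:.2f} days, ratio: {ratio:.2f}x")
```

Swapping in your own throughput figures for `TF_RATE` and `LL_RATE` shows where the break-even point sits for other hardware.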

To maximize your resources:

If you have sufficient CPU resources to match your available GPU resources - run mfaktc.

On your remaining GPU resources - run LL tests.


-- Craig
Old 2011-12-04, 22:16   #17
PsycO
 
Nov 2011
Quebec, Canada

32 Posts
Default

Quote:
Originally Posted by nucleon View Post
...
To maximize your resources:

If you have sufficient CPU resources to match your available GPU resources - run mfaktc.

On your remaining GPU resources - run LL tests.


-- Craig
This is exactly what I do now... I run one mfaktc instance on my GT 430 bound to one CPU core, and another one on my GTS 450 bound to the second core. As I only have 2 cores, I can't keep my GTS 450 fully loaded, even with sieveprimes at only 5000... So I run CUDALucas on it to keep it busy.

As for my previous comments about changing the GHz-days ratio for LL, I was wrong... I had not taken into account the multiple variables of this problem...

Quote:
Originally Posted by Prime95 View Post
...
Perhaps, the solution is to no longer have an overall Top Producers report which mixes CPU credits from hardware platforms that are not really comparable. Rather, we could support only the Top LL, Top Double-check, and Top TF reports.
The Top Producers report stimulates competition, so putting CPU with CPU and GPU with GPU on two pages is the best option! The report already knows which software each result came from, so it's just a matter of adding a page and splitting them into their respective places! But if you don't want to do the job of splitting things up and adding a page, maybe it's simply better to leave it this way...
Old 2011-12-05, 02:11   #18
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

17·487 Posts
Default

Quote:
Originally Posted by PsycO View Post
The Top Producers report stimulates competition, so putting CPU with CPU and GPU with GPU on two pages is the best option! The report already knows which software each result came from, so it's just a matter of adding a page and splitting them into their respective places! But if you don't want to do the job of splitting things up and adding a page, maybe it's simply better to leave it this way...
The LL, double-check, etc reports already exist. The TF report shows TF credits from both GPU and CPU.
Old 2011-12-05, 02:55   #19
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1110000110101₂ Posts
Default

I think he meant putting CPU TF on one page, and GPU TF on another. Which results are which should be easy to determine, even retroactively; anything done on a GPU is tagged with mfakt* somewhere in the results line.
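A rough sketch of that split, assuming only that GPU results carry the string "mfakt" somewhere in the line. The sample lines and the second exponent are invented for illustration and are not real PrimeNet output:

```python
def split_tf_results(lines):
    """Partition TF result lines into (gpu, cpu) by looking for an mfakt* tag."""
    gpu, cpu = [], []
    for line in lines:
        # mfaktc and mfakto both contain "mfakt"; anything else is treated as CPU.
        if "mfakt" in line.lower():
            gpu.append(line)
        else:
            cpu.append(line)
    return gpu, cpu


# Hypothetical result lines, made up for this example.
sample = [
    "M46447963 no factor from 2^70 to 2^71 [mfaktc 0.18]",
    "M46447999 no factor to 2^68",
]

gpu_results, cpu_results = split_tf_results(sample)
print(len(gpu_results), "GPU result(s),", len(cpu_results), "CPU result(s)")
```

Since the tag is already in the stored result text, the same pass would work retroactively on old results.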
Old 2011-12-05, 05:12   #20
Xyzzy
 
Aug 2002

2×3²×13×37 Posts
Default

The current system seems to work well enough.

Who wouldn't trade all of their accumulated stats for a Mersenne prime?
Old 2011-12-05, 07:12   #21
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts
Default

Quote:
Originally Posted by Xyzzy View Post
The current system seems to work well enough.

Who wouldn't trade all of their accumulated stats for a Mersenne prime?
Well said.
Old 2011-12-05, 11:34   #22
TheJudger
 
"Oliver"
Mar 2005
Germany

2133₈ Posts
Default

Quote:
Originally Posted by PsycO View Post
The Top Producers report stimulates competition, so putting CPU with CPU and GPU with GPU on two pages is the best option!
Well, we should distinguish CC 1.x and CC 2.x GPU stats, too, because CC 2.x GPUs are so much faster than CC 1.x GPUs... that is not fair, so we need additional stats! And for ATI/AMD GPUs we need another page, too!</irony>

My 2 cents: Same job => same credit, same stats! (or no credits/stats at all...) No matter which hardware was used and how much real time was needed for computation.

Oliver
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
