You are asking too many questions :razz:. Install Misfit. (it works for win64 too, very well). Configure user names in Misfit (very easy!). You don't need it configured in mfaktc. Generally, go without user name, dates, times, etc. in mfaktc: it makes report files shorter and easier to pass on to the server. Misfit takes care of the rest.
About printing every class: Use PrintMode=1 in mfaktc.ini to overprint the lines [edit: it needs restarting mfaktc] (yes, it prints every class, and for a 70-71 bit assignment this could mean every second; you cannot change this). If still bothersome, play with the ProgressFormat line (like make it void) or redirect to a >nul device. But usually PrintMode=1 will do the trick. When you work @74 bits, each class takes (tens of) minutes to print. Congrats for the card! It does the same job as one of my (water cooled) 580s, at a fraction of the wattage. Good choice! |
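LaurV's tips boil down to a few lines in mfaktc.ini. A sketch of the relevant settings (option names as discussed in this thread; verify against the ini shipped with your mfaktc version, and restart mfaktc after editing):

```ini
# mfaktc.ini fragment (mfaktc re-reads this only at startup)
PrintMode=1            # overprint the per-class progress line instead of scrolling
V5UserID=              # leave blank: Misfit handles user/computer IDs
ComputerID=            # leave blank, keeps results.txt server-friendly
```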
NEVER too many questions....hahaha
[QUOTE=LaurV;388458]You are asking too many questions :razz:. Install Misfit. (it works for win64 too, very well). Configure user names in Misfit (very easy!). You don't need it configured in mfaktc. Generaly, go without user name, dates, times, etc in mfaktc: it makes report files shorter and easier to pass on to the server. Misfit takes care of the rest.
About printing every class: Use PrintMode=1 in the mfaktc.ini to overprint the lines [edit: it needs restarting mfaktc] (yes, it prints every class, and for a 70-71 bit assignment this could mean every second, you can not change this). If still bothersome, play with ProgressFormat line (like make it void) or redirect in a >nul device. But usually PrintMode=1 will do the trick. When you work @74 bits, each class takes (tens of) minutes to print. Congrats for the card! It does the same job as one of my (water cooled) 580s, at a fraction of the wattage. Good choice![/QUOTE] Thanks for the answers so far....less than MP44 questions to go.... It seems MISFIT IS NOT the same as the Submission Spider Chris talks about? Do they do the same things? Compete? Complement each other? Is this capitalism/competition at its finest? Should I read more and ask fewer questions????? I vote "YES". |
They do the same thing, but Misfit does it much better; unfortunately it only works on windoze (it uses some .net stuff or so). New developments I am not aware of may include *nix support, but I don't believe so. If you run windoze, take Misfit. You can easily compile it if you are skeptical, sources are available. Submission spider is (are some) python/perl/bash/batch files which run on alien stuff like Linux :wink:. You don't need that for windows.
|
Chris' submission spider on the GPU72 site is a perl script. It does the same job as Misfit, but works on all platforms. There is also mfloop.py, a Python script which I use (and hack on). mfloop.py is probably the best tool for non-Windows systems.
But in your case, just use Misfit :) |
1 Attachment(s)
oh, c'mon boyar!
[ATTACH]12016[/ATTACH] p.s. as seen in the header, the visualization site is working, so it is not the server (I assume they are physically the same computer?) |
500 Internal Server Error
[QUOTE][B]Internal Server Error[/B] The server encountered an internal error or misconfiguration and was unable to complete your request. Please contact the server administrator, [email]chalsall@ideas4lease.com[/email] and inform them of the time the error occurred, and anything you might have done that may have caused the error. More information about this error may be available in the server error log. [/QUOTE] This is from the Overall individual statistics page. |
The server still responds, just slowly.
|
[QUOTE=petrw1;388442]Benchmark submitted[/QUOTE]Thanks. And it's in line with expectations: 388GHd/d is at the stock clock of 1050MHz, yours is running at 1392MHz, so 388 * (1392 / 1050) = 514GHd/d
|
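James's estimate above is just a linear scaling of the benchmark by the clock ratio. A rough sketch of the same arithmetic (numbers from the posts above; mfaktc throughput is treated here as purely clock-bound, which is an approximation):

```python
def scale_ghzd(bench_ghzd_per_day: float, stock_mhz: float, actual_mhz: float) -> float:
    """Scale a TF benchmark linearly with core clock.

    Approximation: assumes throughput is clock-bound; memory clock
    and thermal throttling effects are ignored.
    """
    return bench_ghzd_per_day * (actual_mhz / stock_mhz)

# Figures from the thread: 388 GHz-d/d at the 1050 MHz stock clock,
# card observed running at 1392 MHz.
print(round(scale_ghzd(388, 1050, 1392)))  # -> 514
```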
[QUOTE=James Heinrich;388489]Thanks. And it's in line with expected. 388GHd/d is at stock clock of 1050MHz, your is running at 1392MHz, so 388 * (1392 / 1050) = 514GHd/d[/QUOTE]
Interesting. 1392 MHz is well above any card's base clock, so it must be the boost clock. And if the boost clock can be sustained like that, the GTX 970 is an even better value. |
[QUOTE=James Heinrich;388489]Thanks. And it's in line with expected. 388GHd/d is at stock clock of 1050MHz, your is running at 1392MHz, so 388 * (1392 / 1050) = 514GHd/d[/QUOTE]
Good to know, but I made no changes nor did I ask for any. I had the card added by a local shop. Now I did happen to mention I use this PC and GPU for extreme number crunching only...no games. So he may have boosted it for me, but if so he did not ask or tell. I believe it is this card: [url]http://www.newegg.ca/Product/Product.aspx?Item=N82E16814125684[/url] GIGABYTE GV-N970G1 GAMING-4GD GeForce GTX 970 4GB 256-Bit GDDR5 PCI Express 3.0 HDCP Ready G-SYNC Support Video Card But it only has a BOOST clock of 1329. I suspect I typed it in wrong (transposed the last 2 digits). 514 * 1329 / 1392 = 490. Though with the window open overnight and the room dropping to 55F it got as high as 509. |
[QUOTE=petrw1;388496]Good to know but I made no changes nor did I ask for any. I had the card added by a local shop. Now I did happen to mention I use this PC and GPU for extreme number crunching only...no games. So he may have boosted it for me but if so he did not ask or tell.[/QUOTE]
The boost is automatic. The card will boost if it's within the power and heat limitations. |
[QUOTE=Mark Rose;388497]The boost is automatic. The card will boost if it's within the power and heat limitations.[/QUOTE]
Cool...see a couple new notes I made above. |
[QUOTE=Prime95;388420]I changed the recycling SQL to expire DCs after 40 days of no contact if the computer has never returned an LL result.[/QUOTE]
Will this affect two tests I have running in the 70 M range? They are running on a laptop that has never returned results and it seems to have a lot of trouble contacting the server to update its progress. Currently PrimeNet indicates it is 4.6% done but the test is actually about 13% done. The apartment has working Internet and the laptop has contacted the server in the past so I'm not sure what the issue is... |
[QUOTE=petrw1;388496]But it only has a BOOST clock of 1329. I suspect I typed it in wrong (transposed the last 2 digits).[/QUOTE]You didn't transpose anything, I got it from your screenshots:
[img]http://i.imgur.com/MIT8W7f.gif[/img] [img]http://i.imgur.com/iSFsu4e.gif[/img] |
[QUOTE=Primeinator;388499]Will this affect two tests I have running in the 70 M range? They are running on a laptop that has never returned results and it seems to have a lot of trouble contacting the server to update its progress. Currently PrimeNet indicates it is 4.6% done but the test is actually about 13% done. The apartment has working Internet and the laptop has contacted the server in the past so I'm not sure what the issue is...[/QUOTE]
I only made the change to DC expirations. |
petrw1, can you send James a CudaLucas benchmark? I'd like to know the TF breakeven points for this card.
|
[QUOTE=James Heinrich;388504]You didn't transpose anything, I got it from your screenshots:
[img]http://i.imgur.com/MIT8W7f.gif[/img] [img]http://i.imgur.com/iSFsu4e.gif[/img][/QUOTE] Doing some reading it seems like many cards will exceed the boost clock if power budget and temperature allow for it. Neat! |
4 Attachment(s)
[QUOTE=Mark Rose;388507]Doing some reading it seems like many cards will exceed the boost clock if power budget and temperature allow for it. Neat![/QUOTE]There is quite a bit of headroom in that one picture. It isn't using close to 100% TDP and the temp is only 75°C. A program like [URL="http://www.evga.com/precision/"]Precision X[/URL] will allow additional tuning.
Here are samples from one of our cards: |
[QUOTE]
I believe it is this card: [URL="http://www.newegg.ca/Product/Product.aspx?Item=N82E16814125684"]http://www.newegg.ca/Product/Product...82E16814125684[/URL] [/QUOTE] I like the 3-fan Gigabyte coolers. My GTX 570 with the fans cranked is not as loud as the Asus 580 at medium speed. |
MISFIT
1 Attachment(s)
So I took a leap of faith and downloaded, set up, and ran MISFIT.
I got this error: [QUOTE]27/11/2014 12:23:05 AM UID: detected in results. UNSUPPORTED FEATURE[/QUOTE] See attachment....seems to suggest it uploaded....or tried to upload??? |
UID = User ID
V5UserID and ComputerID have to be unspecified in mfaktc.ini EDIT: Additional information: It tried to send it, but did not because of the error. Everything's still there in your results.txt. All you need to do is delete all the instances of "UID: User/Computer, " such that the first word on the line is "no factor". You can easily do this with Ctrl+H in Notepad. |
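Jayder's Ctrl+H fix can also be scripted. A minimal sketch; the `UID: user/computer, ` prefix shape is inferred from petrw1's report, so treat the regex as an assumption about how mfaktc formats that tag:

```python
import re

# Sample lines in the shape petrw1 posted (exponent digits redacted):
lines = [
    "UID: user/computer, no factor for M37761234 from 2^70 to 2^71 [mfaktc 0.20 barrett76_mul32_gs]",
    "no factor for M37761235 from 2^70 to 2^71 [mfaktc 0.20 barrett76_mul32_gs]",
]

def strip_uid_prefix(line: str) -> str:
    """Drop a leading 'UID: user/computer, ' tag so the line starts with
    'no factor' / 'found' / 'M... has a factor', which the server accepts.
    Lines without the tag pass through unchanged."""
    return re.sub(r"^UID:\s*\S+,\s*", "", line)

cleaned = [strip_uid_prefix(l) for l in lines]
for l in cleaned:
    print(l)
```

Run it over a copy of results.txt rather than the original, in case the pattern assumption is off for your version.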
[QUOTE=Jayder;388537]UID = User ID
V5UserID and ComputerID have to be unspecified in mfaktc.ini EDIT: Additional information: It tried to send it, but did not because of the error. Everything's still there in your results.txt. All you need to do is delete all the instances of "UID: User/Computer, " such that the first word on the line is "no factor". You can easily do this with Ctrl+H in Notepad.[/QUOTE] Thanks That would have been what I tried next if it wasn't so late.... However I was hoping I could keep in the ComputerID so that the results include it on PrimeNet instead of "Manual Testing" |
[QUOTE=Mark Rose;388387]Down to 1,068 yesterday :([/QUOTE]
Looks like we are pretty close according to this graph [URL]http://www.gpu72.com/graphs/dctf/week/[/URL] :smile: |
I just dumped in 1300GHd of work so that'll skew something today or tomorrow
|
[QUOTE=NickOfTime;388563]Looks like we are pretty close according to this graph [URL]http://www.gpu72.com/graphs/dctf/week/[/URL] :smile:[/QUOTE]
Chris was saying we need [url=http://www.mersenneforum.org/showpost.php?p=388330&postcount=3176]about 1400 per day[/url]. We're still burning through our buffer about 400 per day. Our buffer is now approximately 5674 exponents. James' 200ish exponents he just dumped may or may not be counted in that. George adjusted the newbie DCs to expire after [url=http://www.mersenneforum.org/showpost.php?p=388420&postcount=3188]40 days[/url], so that leaves us about 26 days before the recycling starts, which means we can be short about 200 exponents per day. So we still need another 1.8 THz-d/d, if we're giving out 1400 DC assignments per day for the next 26 days. |
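The buffer arithmetic in that post works out roughly like this (all figures are taken from the posts linked above; the 26-day window is the poster's estimate, not an exact schedule):

```python
buffer_exponents = 5674      # current DCTF buffer (approximate, per the post)
needed_per_day = 1400        # DC assignments handed out per day
burn_per_day = 400           # net buffer drain per day (1400 out - 1000 completed)
days_until_recycle = 26      # ~40-day expiry minus days already elapsed

# How big a daily shortfall the buffer can absorb before recycling starts:
tolerable_shortfall = buffer_exponents / days_until_recycle
print(round(tolerable_shortfall))  # ~218/day, i.e. "about 200" as posted
```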
[QUOTE=James Heinrich;388564]I just dumped in 1300GHd of work so that'll skew something today or tomorrow[/QUOTE]
Tonight. The stats at mersenne.info rollover at midnight UTC. :) |
[QUOTE=Mark Rose;388567]Chris was saying we need [url=http://www.mersenneforum.org/showpost.php?p=388330&postcount=3176]about 1400 per day[/url]. We're still burning through our buffer about 400 per day.
Our buffer is now approximately 5674 exponents. James' 200ish exponents he just dumped may or may not be counted in that. George adjusted the newbie DCs to expire after [url=http://www.mersenneforum.org/showpost.php?p=388420&postcount=3188]40 days[/url], so that leaves us about 26 days before the recycling starts, which means we can be short about 200 exponents per day. So we still need another 1.8 THz-d/d, if we're giving out 1400 DC assignments per day for the next 26 days.[/QUOTE] I have 0.5 of that started Tuesday evening....as soon as I fix MisFit they will be dumped. |
[QUOTE=Mark Rose;388567]Chris was saying we need [URL="http://www.mersenneforum.org/showpost.php?p=388330&postcount=3176"]about 1400 per day[/URL]. We're still burning through our buffer about 400 per day.
Our buffer is now approximately 5674 exponents. James' 200ish exponents he just dumped may or may not be counted in that. George adjusted the newbie DCs to expire after [URL="http://www.mersenneforum.org/showpost.php?p=388420&postcount=3188"]40 days[/URL], so that leaves us about 26 days before the recycling starts, which means we can be short about 200 exponents per day. So we still need another 1.8 THz-d/d, if we're giving out 1400 DC assignments per day for the next 26 days.[/QUOTE] Hmm, and I just added another 800ghz/d/d last night :-) |
[QUOTE=Mark Rose;388568]Tonight. The stats at mersenne.info rollover at midnight UTC. :)[/QUOTE]
Not quite correct, but I appreciate your observations. For the record, the mersenne.info spidering starts at midnight Barbados local time. Barbados doesn't do DST; thus, that's UTC-4 year round. Takes a little bit of time to fetch, and then some more time to process. Mersenne.info displays the actual updated date in the upper left-hand corner. |
[QUOTE=chalsall;388578]Not quite correct, but I appreciate your observations.
For the record, the mersenne.info spidering starts at midnight Barbados local time. Barbados doesn't do DST; thus, that's UTC-4 year round. Takes a little bit of time to fetch, and then some more time to process. Mersenne.info displays the actual updated date in the upper left-hand corner.[/QUOTE] And thank you for your corrections :) |
[QUOTE=NickOfTime;388570]Hmm, and I just added another 800ghz/d/d last night :-)[/QUOTE]
And I just removed the 260X and installed the 690 in its place, so -170GHz, +620GHz = +450GHz/d. Though I will now have to monitor GPU temps in that box since the 660 Ti is now shrouded, and I'll let the CPUs idle for now since I am not sure how close I am to the 850W PSU limit; I'll gradually spin them up and see what happens... |
Do you have a Kill-a-watt meter or similar? If you know the efficiency of your power supply, you can calculate how much you're drawing from it.
The box I'm typing this on is pulling 760 watts at the wall, but that's only ~625 watts from the 750 watt power supply. |
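Mark's wall-vs-DC figures imply the PSU efficiency directly. A quick sketch of the conversion (numbers from his post; in reality efficiency varies with the load point, see the 80 Plus curves):

```python
def dc_load_watts(wall_watts: float, efficiency: float) -> float:
    """DC power drawn from the PSU, given AC draw measured at the wall
    and the PSU's efficiency at that load point."""
    return wall_watts * efficiency

wall = 760.0
implied_eff = 625.0 / 760.0        # ~0.82, from the figures in the post
print(round(dc_load_watts(wall, implied_eff)))  # -> 625 W on a 750 W PSU
```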
[QUOTE=Mark Rose;388581]Do you have a Kill-a-watt meter or similar? If you know the efficiency of your power supply, you can calculate how much you're drawing from it.
The box I'm typing this on is pulling 760 watts at the wall, but that's only ~625 watts from the 750 watt power supply.[/QUOTE] Flipped the AC on my inverter, 51-53A @ 12V = 625W, including routers and screen in power saving mode... PSU Corsair HX850 90% (ASUS KGPE-D16 2x Opteron 6234 / NH-U12DO, 2x HDD) ... hmm, so about 290 watts left... and TDP on the CPUs is 115W each so probably 60W to spare... Edit: Hmm, my inverter efficiency is also peak 89%.. not sure if I need to apply a correction to the first 625W... Edit: TDP on the CPUs, hmm, I also usually use turion power control to bump it to always use all-CPU boost mode, so I probably use more than the 115... |
[QUOTE=NickOfTime;388583]Flipped the AC on my inverter, 51-53A @ 12V = 625w, including routers and screen in power saving mode...
PSU Corsair HX850 90% (ASUS KGPE-D16 2x Opteron 6234 / NH-U12DO, 2x HDD) ... hmm, so about 290 watts left... and TDP on the CPUs is 115W each so probably 60W to spare... Edit: Hmm, my inverter efficiency is also peak 89%.. not sure if I need to apply a correction to the first 625W... Edit: TDP on the CPUs, hmm, I also usually use turion power control to bump it to always use all-CPU boost mode, so I probably use more than the 115...[/QUOTE] Keep in mind the efficiency at the various loads of your [url=http://www.corsair.com/en-us/hx-series-hx850-power-supply-850-watt-80-plus-gold-certified-modular-psu]HX850[/url]. It looks like max 91%, so I would use that figure for calculating spare room. My power supply has overheated a couple of times when the air filter in the case below it got semi-clogged with dust, causing crashes, so I'd keep an eye on that, too. |
[QUOTE=Mark Rose;388567]
So we still need another 1.8 THz-d/d, if we're giving out 1400 DC assignments per day for the next 26 days.[/QUOTE] My GPUs have finished their queued LLTF and have started on DCTF at a total of 1.3 THz-d/d |
[QUOTE=Chuck;388598]My GPUs have finished their queued LLTF and have started on DCTF at a total of 1.3 THz-d/d[/QUOTE]
Excellent! |
[QUOTE=Chuck;388598]My GPUs have finished their queued LLTF and have started on DCTF at a total of 1.3 THz-d/d[/QUOTE]
Thanks for all the efforts guys! :smile: So everyone knows, I'm moving house over this weekend, so I may be out of contact for a day or so. But will keep an eye on things from time-to-time by way of my Nexus. |
[QUOTE=chalsall;388628]Thanks for all the efforts guys! :smile:
So everyone knows, I'm moving house over this weekend, so I may be out of contact for a day or so. But will keep an eye on things from time-to-time by way of my Nexus.[/QUOTE] One with extreme power and A/C capabilities? :big grin: |
MISFIT is not UPLOADING my results
1 Attachment(s)
I assume it is an ID-10-T error.
In the MISFIT 2.9.2 config parms:
Lookin: C:\GIMPS\GPU\MFAKTC-0.20 <== this is where my results.txt file is
Here is a sample line or two:
no factor for M3776xxxx from 2^70 to 2^71 [mfaktc 0.20 barrett76_mul32_gs]
M3777xxxx has a factor: xxxxxx [TF:70:71:mfaktc 0.20 barrett76_mul32_gs]
found 1 factor for M3777xxxx from 2^70 to 2^71 [mfaktc 0.20 barrett76_mul32_gs]
I know the links to PrimeNet are working because I was able to fetch more work. I added the extra schedule line to see it happen as I watch...I even stopped/started MISFIT after I added the schedule. When the time came and went (and I did UPDATE STATS for good measure) nothing happened; no indication it is trying; no errors; no extra results in PrimeNet; no extra working/save/error files in the MISFIT folder ???? |
[QUOTE=Prime95;388506]petrw1, can you send James a CudaLucas benchmark? I'd like to know the TF breakeven points for this card.[/QUOTE]
CUDALucas up and running: 3.8 ms/iter for a 35.88M DC, FFT 2048K. Results once an assignment is complete. Hmmm, looking at current benchmarks this is NOT very good. I see a 780, for example, at about 2 ms for this size DC |
[QUOTE=petrw1;388690]Results once an assignment is complete[/QUOTE]
An assignment is not necessary: See [url]http://www.mersenne.ca/cudalucas.php[/url] for benchmarking details. |
[QUOTE=Prime95;388692]An assignment is not necessary: See [url]http://www.mersenne.ca/cudalucas.php[/url] for benchmarking details.[/QUOTE]
Thanks found it....but man does it take a long time to run.....hours. |
You don't have to let it finish; run a million iterations in bunches of 20k or 100k (use the keyboard to adjust, or modify the .ini) and send a copy of the screen and/or logfiles. James is wise enough to extract the right data from it.
|
[QUOTE=petrw1;388634]One with extreme power and A/C capabilities? :big grin:[/QUOTE]
Let's just say I have provisioned an ICT wiring closet (star configuration, of course), all power and telephony runs are underground, as is the server room... (I'm not kidding). :smile: Going to be offline for about 72 hours.... |
[QUOTE]all power and telephony runs are underground, [B][U]as is the server room[/U][/B][/QUOTE]
Got sump pumps and generator? |
[QUOTE=petrw1;388703]Thanks found it....but man does it take a long time to run.....hours.[/QUOTE]
SENT..... |
Sorry people for not contributing for a few weeks. I lost a box a while back.
I think most of its assignments are completed so I didn’t want to release “completed” work. After ordering the wrong part and being out of town I noticed the assignments were going to expire, so I extended them. By having “expired” assignments, I can’t get new work for the other boxes. So I’m doing work around 100M directly from PrimeNet. Hopefully, in the coming days or so, I can contribute to GPU72. |
[QUOTE=Prime95;388506]petrw1, can you send James a CudaLucas benchmark? I'd like to know the TF breakeven points for this card.[/QUOTE]Based on only a single benchmark for Compute 5.2 (from [i]petrw1[/i]'s GTX 970) so I'm only mildly confident in the numbers, but as a rough guide:
[url]http://www.mersenne.ca/cudalucas.php?model=567[/url] |
It looks kinda realistic to me; the 9xx series chips have a 1/32 DP/SP ratio, and it cannot be changed, like for Titans, to have it doing more SP or more DP (or can it? I didn't have one in my hands, all my information is from web articles). These cards are kicking ass at factoring, but they are bad for TF.
Edit: well... :redface: for the energy they consume they are not bad, actually; they just don't get a Titan's performance, but they are certainly competing against a 580! |
[QUOTE=James Heinrich;388751]Based on only a single benchmark for Compute 5.2 (from [i]petrw1[/i]'s GTX 970) so I'm only mildly confident in the numbers, but as a rough guide:
[url]http://www.mersenne.ca/cudalucas.php?model=567[/url][/QUOTE] It looks like this card has even lower breakevens than the 570, though not by much. Some cards have TF/LL crossovers a full bit lower.

I'm contemplating moving GPU TF assignments to PrimeNet with the goal of having a system that needs less babysitting and is always making sure GPU TF is directed to where it is most profitable. This consists of these steps:

1) Calculate the "high water mark" for DC and LL and 100M LLs each night. From that make an educated guess as to where the high water mark will be in 90 days. This defines 3 areas for handing out GPU TF assignments -- all DC exponents up to the "expected high water mark in 90 days", all LL exponents up to the "expected high water mark in 90 days", and the same for 100M LL.

2) Hand out TF assignments that will save the most LL work. I'll try to keep as many of the current GPU72 options as possible, like the ability to specify DCTF only, LLTF only, 100M TF only, specific exponent ranges, specific bit levels, etc. However, "let PrimeNet decide" should satisfy most users.

3) Real DC and LL assignments for CPUs will still give out the smallest exponent in their category, which is highly likely to be the exponent least likely to benefit from more TF.

4) Make sure we don't hand out TF assignments above the crossovers in James' tables. I don't think we are near the crossovers right now.

5) Expire TF assignments in 60(?) days. Does that sound reasonable?

6) Create a PrimeNet web page to get these GPU TF assignments.

7) Have GPU72 return the assignments it hasn't handed out. Have GPU72 forward TF requests to PrimeNet.

Comments? |
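Step 1 of the plan (projecting the "high water mark" 90 days out) could be as simple as a linear extrapolation. A hypothetical sketch; the function name and the example figures are made up for illustration and are not PrimeNet's actual query or data:

```python
def projected_high_water_mark(current_hwm: int,
                              recent_daily_advance: float,
                              horizon_days: int = 90) -> int:
    """Educated guess at where the leading edge of assignments will be
    after `horizon_days`, assuming the recent advance rate holds."""
    return int(current_hwm + recent_daily_advance * horizon_days)

# Hypothetical figures: DC edge at exponent 35.5M, advancing ~8000/day.
print(projected_high_water_mark(35_500_000, 8_000))  # -> 36220000
```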
[QUOTE]5) Expire TF assignments in 60(?) days. Does that sound reasonable? [/QUOTE]
Thirty days should be more than enough. At least, that is the current GPU 72 period. I personally start checking what the status of a factoring job is if it gets to be 8-10 days old. I expect them to be gone before then. I know some people keep more work in the hopper than I do. Perhaps a longer time allowance would suit them better. |
[QUOTE]6) Create a Primenet web page to get these GPU TF assignments. [/QUOTE]
Make sure MISFIT can fetch from there and that the MISFIT fetch options are compatible with the new page |
[QUOTE=LaurV;388753] These cards are kicking ass at factoring, but they a bad for TF.[/QUOTE]
I know(?) you meant to say LL at the end of this quote.
[QUOTE=petrw1;388792]I know(?) You meant to say LL at the end of this quote.
I guess I might as well let it do what it does best. TF. :)[/QUOTE] Grr man, LL, sorry. Been in a hurry. |
[QUOTE=James Heinrich;388751]Based on only a single benchmark for Compute 5.2 (from [i]petrw1[/i]'s GTX 970) so I'm only mildly confident in the numbers, but as a rough guide:
[url]http://www.mersenne.ca/cudalucas.php?model=567[/url][/QUOTE] Is there enough info in [url=http://mersenneforum.org/showpost.php?p=383456&postcount=2364]this post[/url] to add to the benchmarks? If not, perhaps TheJudger could send you what's needed. |
[QUOTE=Mark Rose;388798]Is there enough info in [url=http://mersenneforum.org/showpost.php?p=383456&postcount=2364]this post[/url] to add to the benchmarks? If not, perhaps TheJudger could send you what's needed.[/QUOTE]I have 3 benchmarks (GTX 970/980) for mfaktc and they all line up pretty good, it's on CUDALucas that I only have a single benchmark.
|
[QUOTE=James Heinrich;388802]I have 3 benchmarks (GTX 970/980) for mfaktc and they all line up pretty good, it's on CUDALucas that I only have a single benchmark.[/QUOTE]
Ahh, gotcha. Perhaps you could ask TheJudger for another? |
[QUOTE=Prime95;388755]It looks like this card has even lower breakevens than the 570, though not by much. Some cards have TF/LL crossovers a full bit lower.[/QUOTE]
[quote]4) Make sure we don't hand out TF assignments above the crossovers in James' tables. I don't think we are near the crossovers right now.[/quote]For many cards it would be beneficial to go to 75 bits for LL candidates 70M and higher and 72 bits for DC candidates >40M, but we currently don't have the firepower to do so.
[quote]5) Expire TF assignments in 60(?) days. Does that sound reasonable?[/quote]30 days with the possibility to extend them has worked great for GPU72 in the past, but I don't see any problem with 60 days.
[quote]7) Have GPU72 return the assignments it hasn't handed out. Have GPU72 forward TF requests to PrimeNet.[/quote]It would be nice to see the Work Distribution page provide the 'complete picture' again. Now almost the entire 70-79M range is reserved by GPU72, while only 3,000 out of the 167,000 exponents are actively worked on. |
[QUOTE=kladner;388758]Thirty days should be more than enough. At least, that is the current GPU 72 period. I personally start checking what the status of a factoring job is if it gets to be 8-10 days old. I expect them to be gone before then. I know some people keep more work in the hopper than I do. Perhaps a longer time allowance would suit them better.[/QUOTE]
I agree. Even on an old, slow card, such as a GT 520 or GT 430, a 70->75 75M LLTF assignment could be completed in 2 to 4 days. With an automated system even two weeks is more than enough. Might be a nuisance for people who do things manually though. |
Would it be possible to assign DC's that have not been appropriately TF'ed (say, one bit level short of optimal) to people who have never once reported a result?
Every single person would do [B][I]at most[/I][/B] one less-than-optimally-TFed job in their entire life but it would would divert all properly factored exponents to the people way more likely to complete them. |
[QUOTE=TheMawn;388865]Would it be possible to assign DC's that have not been appropriately TF'ed (say, one bit level short of optimal) to people who have never once reported a result?
Every single person would do [B][I]at most[/I][/B] one less-than-optimally-TFed job in their entire life but it would would divert all properly factored exponents to the people way more likely to complete them.[/QUOTE] Well, we shouldn't need to, we are ahead of the cat 4 churn now 1800 vs 1400 assigned. [URL="http://www.gpu72.com/graphs/dctf/week/"]http://www.gpu72.com/graphs/dctf/week/[/URL] |
I'm working on the Primenet GPU TF assignment page. It works much like the GPU72 page.
Two questions:

1) The system is designed to work one bit-level at a time. Will this create some work units that are just too short? Should I modify the assignments to do multiple bit levels if the current bit level is somewhat low (or should I leave this under user control by filling out the "optional will factor to" field)? If automatic, what are the recommended minimum bit levels to factor to for a DC, LL, and 100M?

2) What are the current TF/LL crossovers for GPUs on 100M digit numbers? |
Factoring to bit levels at or below 69 should probably be assigned as a single assignment. At 80M, that's about 1.5 GHz-days, and even an ancient card can do several of those per day. A card like a GTX 580 or GTX 970 can do about 300 of those a day.
I would let users pick a maximum bit-level, minimum 69. If we start factoring to 75, a 71->75 at 75M would take about 95 GHz-days. It could discourage new users if they have slow cards that can't finish a single assignment in a day.

I think doing the same thing with LL/DC categories and assigning high work to users who have never returned TF work might be a good idea. GPU72 currently limits the amount of work given out to new users and PrimeNet should do the same.

Until recently, AMD cards had a penalty factoring beyond 73 bits (I don't know if the version of mfakto with the newer kernel has been released). It might be useful to add a suggested bit-level field (hidden?) for automated clients to pass that indicates at what bit level performance drops. Those clients could still work on higher bit levels if that's what the system needs, but ideally would work on lower bit levels for overall system throughput. That's one feature "Let GPU72 decide" lacks.

According to mersenne.ca, the 100M cross-over level for a GTX 970 is 77 bits LLTF and 76 bits DCTF. For a GTX 780 it is 76 and 75. I doubt that takes into account the severe performance hit mfaktc has above 76 bits. This is where that suggested bit-level field would come in handy. |
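The effort figures quoted in this thread follow roughly from the fact that each extra bit level doubles the number of factor candidates, while the candidate count at a given depth falls off inversely with the exponent. A naive model of that scaling, calibrated to the ~1.5 GHz-days for one level to 2^69 at 80M mentioned in the previous post; real GHz-day credit uses a more refined formula, so treat these outputs as ballpark only:

```python
# Calibration (assumption from the post above): one bit level ending at
# 2^69 for an 80M exponent costs about 1.5 GHz-days.
K = 1.5 * 80e6 / 2**69

def tf_ghz_days(exponent: int, from_bits: int, to_bits: int) -> float:
    """Approximate TF effort: work doubles per bit level and scales
    inversely with the exponent (naive model, not PrimeNet's formula)."""
    return sum(K * 2**b / exponent for b in range(from_bits + 1, to_bits + 1))

print(round(tf_ghz_days(80_000_000, 68, 69), 1))  # ~1.5 by construction
print(round(tf_ghz_days(75_000_000, 74, 75)))     # the last level dominates
```

The model makes the qualitative point plainly: going one bit deeper costs as much as all previous levels combined, which is why the thread worries about slow cards getting multi-level assignments.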
[QUOTE=Mark Rose;389033]
According to mersenne.ca, the 100M cross-over level for a GTX 970 is 77 bits LLTF and 76 bits DCTF. For a GTX 780 is 76 and 75. I doubt that takes into account the severe performance hit mfaktc has above 76 bits. This is where that suggested bit-level field would come in handy.[/QUOTE] I think you might be mixing up 100M 'exponent' (2^100,000,000-1) and 100M digits (2^332,200,000-1). For 100M [B]exponent[/B] the cross-over is indeed +/- 77bits. For 100M [B]digits[/B] candidates the 'normal/CPU' TF bitlevel is already 77bits, so with GPUs you could probably do 4-5bits more, so 81bits, maybe even 82bits if there are enough resources. |
[QUOTE=Mark Rose;389033]
According to mersenne.ca, the 100M cross-over level for a GTX 970 is 77 bits LLTF and 76 bits DCTF. For a GTX 780 is 76 and 75. I doubt that takes into account the severe performance hit mfaktc has above 76 bits. This is where that suggested bit-level field would come in handy.[/QUOTE] I'm asking about 100M digits or M332192000. I think the chart you refer to is for M100000000. A prototype of the PrimeNet web page is: [url]http://mersenne.org/manual_gpu_assignment/[/url] Feel free to click on getting assignments, it will display work without making any real reservations. Note that the assignments returned is not what you'll eventually get since GPU72 has nearly all the relevant DC and LL exponents reserved. |
[QUOTE=Mark Rose;389033]According to mersenne.ca ... I doubt that takes into account the severe performance hit mfaktc has above 76 bits.[/QUOTE]Correct. My performance charts are based on a simple 1-dimensional measurement; they do not account for the non-linear scaling where different bit levels or kernels are invoked. Something I should probably look at in the future, I guess.
|
[QUOTE=Prime95;389040]A prototype of the PrimeNet web page is: [URL]http://mersenne.org/manual_gpu_assignment/[/URL]
[/QUOTE] Suggestion for change: Preferred work range: [I]"[/I]TF for 100M[B] digits [/B]exponents". |
[QUOTE=Mark Rose;389033]
Until recently, AMD cards had a penalty factoring beyond 73 bits (I don't know if the version of mfakto has been released with the newer kernel). [/QUOTE] I wanted to release this version before LaurV passes me in the overall GPU72 stats, but I lost that race :down: AMD GCN cards will still have some slowdown from 73 to 74, and another one from 74 to 75 bits. And there's one from 70 to 71. And from 82 to 83. And ... It's a bit like LL-tests and different FFT lengths: bigger problems require bigger effort. The "penalty" comes from the GHz-days calculation that does not take that into account for TF - and it would be hard to do that, given the hardware differences. [QUOTE=Prime95;389040] A prototype of the PrimeNet web page is: [URL]http://mersenne.org/manual_gpu_assignment/[/URL] Feel free to click on getting assignments, it will display work without making any real reservations. Note that the assignments returned is not what you'll eventually get since GPU72 has nearly all the relevant DC and LL exponents reserved.[/QUOTE] Seeing that the AMD cards are not available in the "GPU info" selection makes me think if it is really meaningful what card is running the TF. If so, then some (imaginary) hardware that cannot run any type of LL-tests would need to do TF up to bitlevel = exponent -1? Or put differently, why care about the GPU-specific crossover point if LL is done on different hardware anyway? Even if all TF was done on a single type of GPU, the crossover point is meaningless compared to the ratio of LL-power vs. TF-power that the folks are willing to spend. At least, until primenet also balances the worktype by hardware (TF vs. LL, or even P-1, for that matter), which some contributors may dislike. I think, the [B]average [/B]crossover point can only be taken as a general guidance when suggesting to switch cards between TF and LL. Preferrably, the cards with a lower crossover point switch to LL first, but it's a user decision. 
The resulting firepower should decide how far to factor - quite independently of the GPU type. Using the "GPU info" to account for the above-mentioned steps in TF efficiency could be useful, but hard to implement and maintain. |
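Bdot's firepower argument can be made concrete with the classic GIMPS stopping rule for trial factoring. The sketch below is only that heuristic (the function name and the caller-supplied cost figures are mine, not any project's actual assignment logic):

```python
def tf_one_more_level_pays(bit_level, ll_cost, tf_level_cost):
    """A Mersenne number has a factor between 2^b and 2^(b+1) with
    probability roughly 1/b. Going one TF level deeper pays off when that
    chance, times the LL work it would save (a first test plus a
    double-check, ~2 * ll_cost), exceeds the cost of the extra level.
    All three arguments are in the same unit, e.g. GHz-days."""
    expected_ll_saving = (1.0 / bit_level) * 2.0 * ll_cost
    return expected_ll_saving > tf_level_cost
```

Note that the rule contains no GPU model at all: only the relative costs matter, which is exactly the point about combined TF vs. LL firepower.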
[QUOTE=Bdot;389056]
Seeing that the AMD cards are not available in the "GPU info" selection makes me wonder whether it really matters which card is running the TF. If so, would some (imaginary) hardware that cannot run any type of LL test need to do TF up to bit level = exponent - 1? Or put differently, why care about the GPU-specific crossover point if LL is done on different hardware anyway?[/QUOTE] The only time the GPU info comes into play now is if you select "smallest exponent". Without the GPU info, my GTX 570 would have been assigned an M54,xxx,xxx to 2^75. My GPU would be better off doing LLs in the 54M range. With the GPU info, I now get 60M exponents; a GTX 770 would get a 65M exponent.

Since we are nowhere near having enough GPU firepower, the "what makes sense" choice does not care about the GPU info. If the AMD (or Nvidia) barrett kernels are significantly worse going up one particular bit level, then I could modify the SQL query to prefer the faster bit levels for that GPU. |
[QUOTE=Bdot;389056]
Or put differently, why care about the GPU-specific crossover point if LL is done on different hardware anyway? Even if all TF were done on a single type of GPU, the crossover point is meaningless compared to the ratio of LL power to TF power that the folks are willing to spend. At least, until Primenet also balances the work type by hardware (TF vs. LL, or even P-1, for that matter), which some contributors may dislike. I think the [B]average[/B] crossover point can only be taken as general guidance when suggesting to switch cards between TF and LL. Preferably, the cards with a lower crossover point switch to LL first, but it's a user decision. The resulting firepower should decide how far to factor - quite independently of the GPU type. [/QUOTE] +1 to this, especially to the last part. |
[QUOTE=Bdot;389056]I wanted to release this version before LaurV passes me in the overall GPU72 stats, but I lost that race :down:
AMD GCN cards will still have some slowdown from 73 to 74, and another one from 74 to 75 bits. And there's one from 70 to 71. And from 82 to 83. And ... It's a bit like LL-tests and different FFT lengths: bigger problems require bigger effort. The "penalty" comes from the GHz-days calculation that does not take that into account for TF - and it would be hard to do that, given the hardware differences. [/quote]

How much of it is the GHz-d calculation, and how much of it is extra math? I haven't looked very much at mfakto's code. I'm curious. mfaktc's barrett76 kernel needs only 5 32-bit ints and 9 multiplies for a 76 bit x 76 bit product, but the barrett77 kernel requires 6 ints and 12 multiplies for a 77 bit x 77 bit product. There's about a 20% drop in performance going from 76 to 77 bits, before taking into account the GHz-d formula penalty for higher bit levels. |
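For readers following along: all these kernels evaluate the same test, namely that a candidate f divides M(p) = 2^p - 1 exactly when 2^p ≡ 1 (mod f); the Barrett variants differ only in how cheaply they reduce the repeated squarings mod f. A bignum sketch of the test (Python's built-in `pow` stands in for the multiword kernel math; function names are mine):

```python
def divides_mersenne(p, f):
    # f divides M_p = 2^p - 1  <=>  2^p ≡ 1 (mod f)
    return pow(2, p, f) == 1

def candidates(p, k_max):
    """Candidate factors of M_p (p an odd prime) have the form
    f = 2*k*p + 1 and must be ≡ 1 or 7 (mod 8); the sieve in
    mfaktc/mfakto removes most non-candidates before the GPU sees them."""
    for k in range(1, k_max + 1):
        f = 2 * k * p + 1
        if f % 8 in (1, 7):
            yield f
```

For example, `divides_mersenne(29, 233)` is True: 233 = 2*4*29 + 1 is a known factor of M29.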
[QUOTE=Prime95;389063]The only time the GPU info comes into play now is if you select "smallest exponent". Without the GPU info my GTX 570 would have been assigned a M54,xxx,xxx to 2^75. My GPU would be better off doing LLs in the 54M range. With the GPU info, I now get 60M exponents, a GTX770 would get a 65M exponent.
[/quote] I think the current menu may actually confuse users whose cards aren't listed. For instance, I have two GT 520s and two GT 430s crunching DCTF (I got started with these cards, and others will too). Also, all the 4xx and 5xx hardware (except the GT 405), and [i]some[/i] of the GT 6xx and GT 7xxM models, have the Fermi architecture and will have a similar crossover point. The current list also breaks down for the 7xx series: the GTX 750 (Maxwell) is much worse at LL than the GTX 760/770/780 (Kepler).

Would it make more sense to have a list of architectures? Fermi, Kepler, Maxwell, and "Don't know" (which aliases to the lowest crossover point), or something like that? I actually still have a Tesla-architecture card, but it's truly not worth bothering with at this point: it gets about 3 GHz-d/d and steals CPU from mprime, since it can't do GPU sieving.

[quote]Since we are nowhere near having enough GPU firepower, the "what makes sense" choice does not care about the GPU info. If the AMD (or Nvidia) barrett kernels are significantly worse going up one particular bit level then I could modify the SQL query to prefer the faster bit levels for that GPU.[/QUOTE] There's a big ~20% performance hit beyond 76 bits for all Nvidia cards. |
Whatever you make please remember the users who need to reserve and submit work manually.
:spot: |
[QUOTE=Mark Rose;389079]I think the current menu may actually confuse users whose cards aren't listed.
Would it make more sense to have a list of architectures? Fermi, Kepler, Maxwell, and "Don't know" which aliases to the lowest crossover point or something like that? [/quote] That might be a good idea -- or I could delete the GPU info completely and assume the lowest crossovers, since 99% of the time the information will not be used. Or see below for where I might use this info more often...

[quote]There's a big ~20% performance hit beyond 76 bits for all Nvidia cards.[/QUOTE] 20% may well be worth taking into consideration. This would delay when Primenet starts handing out 76-bit factoring.

Am I right in thinking I need 3 different tables: mfaktc, mfakto on GCN, mfakto on non-GCN? For each I need to know the bit levels at which the program uses a slower kernel and how much slower that kernel is than the previous one. I need this data for bit levels 69 to ~84. Right now I have one datapoint: mfaktc, 76 bits, 20% slower |
[QUOTE=Prime95;389040]I'm asking about 100M digits or M332192000. I think the chart you refer to is for M100000000.
[/QUOTE] [QUOTE=James Heinrich;389042]Correct. My performance charts are based on a simple one-dimensional measurement; they do not scale appropriately where different bit levels or kernels are invoked. Something I should probably look at in the future, I guess.[/QUOTE] This post contains a chart that may be helpful: [url]http://mersenneforum.org/showpost.php?p=324680&postcount=413[/url]
[QUOTE=Uncwilly;389094]This post contains a chart that may be helpful[/QUOTE]That has essentially the same information as is shown (different presentation now) on [url=http://www.mersenne.ca/cudalucas.php?model=9]this page[/url], but it's still based on the flawed assumption that all GPUs of any particular compute level perform identically across all bitlevels. I should take into account things like the aforementioned performance drop above 2[sup]76[/sup] but I don't.
Once we gather the stats for various architectures at various bitlevels for George, I'll see about incorporating that new information into my charts and graphs as well. |
[QUOTE=Prime95;389091]That might be a good idea -- or I could delete the gpu info completely and assume the lowest crossovers since 99% of the time the information will not be used. Or see below for where I might use this info more often...
20% may well be worth taking into consideration. This would delay when Primenet starts handing out 76 bit factoring. Am I right in thinking I need 3 different tables: mfaktc, mfakto on GCN, mfakto on non-GCN. For each I need to know the bit levels at which the program uses a slower kernel and how much slower the kernel is than the previous kernel. I need this data for bit levels 69 to ~84. Right now I have one datapoint: mfaktc, 76 bits, 20% slower[/QUOTE]

Trial factoring anything up to 76 bits is fast with mfaktc. Trial factoring 77 bits is slower. Here are some GHz-d/day numbers for M39467291 on a GTX 580 (at 1544MHz):

69,70: 426.02
70,71: 426.02
71,72: 424.85
72,73: 424.52
73,74: 424.32
74,75: 424.56
75,76: 424.39
76,77: 423.28
77,78: 414.38 // okay, not as bad as I remembered!

The 20% I remembered was from [url=http://mersenneforum.org/showpost.php?p=306572&postcount=1824]this post[/url]. The new barrett76 kernel is only usable for 76 bits (77 overflows), and so a less efficient kernel must be used.

78,79: 414.24
79,80: 414.32

I don't have time to do more benchmarking at the moment. |
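Putting a number on "not as bad as I remembered": working from the figures in the post above (my arithmetic, not a new measurement), the step-to-step throughput changes are:

```python
rates = [  # (upper bit level, GHz-d/day) from the GTX 580 run above
    (76, 424.39), (77, 423.28), (78, 414.38), (79, 414.24), (80, 414.32),
]
# Percent change in throughput at each bit-level transition
for (b0, r0), (b1, r1) in zip(rates, rates[1:]):
    change = 100.0 * (r1 - r0) / r0
    print(f"bit level {b0} -> {b1}: {change:+.2f}%")
```

The big one, at the 77 -> 78 transition, works out to about a 2.1% drop in GHz-d/day, far from the raw 20% kernel figure.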
[QUOTE=Mark Rose;389073]How much of it is the GHz-d calculation and how much of it is extra math? I haven't looked very much at mfakto's code. I'm curious. mfaktc's barrett76 kernel needs only 5 32-bit ints and 9 multiplies for a 76 bit x 76 bit product, but the barrett77 kernel requires 6 ints and 12 multiplies for 77 bit x 77 bit product. There's about a 20% drop in performance going from 76 to 77 bits, before taking into account the GHz-d formula penalty for higher bit levels.[/QUOTE]
[I]Barrett[/I] is more than just one square operation (for which you counted the ops).

[QUOTE=Mark Rose;389079]There's a big ~20% performance hit beyond 76 bits for all Nvidia cards.[/QUOTE] [B]Not true[/B], but to be fair, you've noticed that yourself. See below.

[QUOTE=Mark Rose;389148]Trial factoring anything up to 76 bits is fast with mfaktc. Trial factoring 77 bits is slower. Here are some GHz-d/day numbers for M39467291 on a GTX 580 (at 1544MHz):
69,70: 426.02
70,71: 426.02
71,72: 424.85
72,73: 424.52
73,74: 424.32
74,75: 424.56
75,76: 424.39
76,77: 423.28
77,78: 414.38 // okay, not as bad as I remembered!
The 20% I remembered was from [url=http://mersenneforum.org/showpost.php?p=306572&postcount=1824]this post[/url]. The new barrett76 kernel is only usable for 76 bits (77 overflows), and so a less efficient kernel must be used.
78,79: 414.24
79,80: 414.32
I don't have time to do more benchmarking at the moment.[/QUOTE] You'll see the same performance up to 2[SUP]87[/SUP] and a very minor performance drop to 2[SUP]88[/SUP]. Above 2[SUP]88[/SUP] there will be a bigger drop. 
[B][U]RAW[/U][/B] kernel benchmarks (million factor candidates per second, without sieve): [CODE]GeForce GT 440 (CC 2.1)
mfaktc 0.21-pre4 // 319.60 + CUDA 5.5
./mfaktc.exe -tf 66362159 66 67
71bit_mul24       29.23M/s
75bit_mul32       42.22M/s
95bit_mul32       33.16M/s
barrett76_mul32   79.23M/s
barrett77_mul32   74.94M/s
barrett79_mul32   64.18M/s
barrett87_mul32   75.51M/s
barrett88_mul32   75.46M/s
barrett92_mul32   61.93M/s
-------------------------------------
Tesla K20m (CC 3.5)
mfaktc 0.21-pre4 // 331.20 + CUDA 5.5
./mfaktc.exe -tf 66362159 68 69
71bit_mul24      160.51M/s
75bit_mul32      200.32M/s
95bit_mul32      155.13M/s
barrett76_mul32  392.31M/s
barrett77_mul32  367.17M/s
barrett79_mul32  314.82M/s
barrett87_mul32  368.01M/s (without funnel-shift 357.09M/s)
barrett88_mul32  367.45M/s (without funnel-shift 347.80M/s)
barrett92_mul32  306.60M/s (without funnel-shift 293.69M/s)
-------------------------------------
GeForce GTX 275 (CC 1.3)
mfaktc 0.21-pre5 // 319.37 + CUDA 5.5
./mfaktc.exe -tf 66362159 66 67
71bit_mul24       77.64M/s
75bit_mul32       62.59M/s
95bit_mul32       50.34M/s
barrett76_mul32   85.83M/s
barrett77_mul32   82.48M/s
barrett79_mul32   73.56M/s
barrett87_mul32   75.93M/s
barrett88_mul32   75.41M/s
barrett92_mul32   65.80M/s[/CODE]

With sieving, a constant penalty is added to each kernel, so the relative performance difference between those kernels will be a little smaller than the [B][U]RAW[/U][/B] speeds suggest. barrett76, 77 and 79 can do 2[SUP]64[/SUP] to 2[SUP]<upper limit for the kernel>[/SUP] in ONE step, while barrett87, 88 and 92 can only do one bit level at once. But above 2[SUP]76[/SUP] I think this is not a real concern.

Oliver |
So if I read the GPU72 report correctly...and if hypothetically we maintain the current DC-TF rate...then DC-TF would be a thing of the past before the end of 2015...
|
[QUOTE=Mark Rose;389148]Here are some GHz-d/day numbers for M39467291 on a GTX 580 (at 1544MHz):[/QUOTE]Same idea I had, I just did the same kind of benchmark on M99999989 on a GTX 570:[code]66,67 = 274.5
67,68 = 274.5
68,69 = 278.8
69,70 = 280.2
70,71 = 279.9
71,72 = 280.0
72,73 = 280.0
73,74 = 276.9
74,75 = 276.9
75,76 = 276.9
76,77 = 268.0
77,78 = 270.7
78,79 = 270.6
79,80 = 270.6[/code] |
[QUOTE=TheJudger;389158][I]Barrett[/I] is more than just one square operation (for which you counted the ops).[/quote]
When I was looking at the code, which was a while ago, it seemed to me that the square operation was where the most operations were saved with the barrett76 kernel versus the others. Is there a significant difference in the number of operations elsewhere in the barrett76 kernel? I ask only to understand better! [quote] [B]Not true[/B], but to be fair you've noticed yourself. See below. You'll see the same performance up to 2[SUP]87[/SUP] and a very minor performance drop to 2[SUP]88[/SUP]. Above 2[SUP]88[/SUP] there will be a bigger drop. With Sieving there will be a constant penalty added to each kernel so the relative performance difference between those kernels will be a little bit smaller than the [B][U]RAW[/U][/B] speeds suggests. barrett76,77 and 79 can do 2[SUP]64[/SUP] to 2[SUP]<upper limit for the kernel>[/SUP] in ONE step while barrett87, 88 and 92 can only do one bitlevel at once. But above 2[SUP]76[/SUP] I think this is not a real concern. [/QUOTE] Thanks for the corrections! |
I'm anxious to see similar numbers for AMD VLIW and GCN cards using the soon-to-be-released mfakto.
Looking at Mark's numbers, I'm leaning toward removing GPU info from the web form. The 3% speed difference between the lowest and highest bit levels isn't worth worrying about. As for LL/TF crossovers, which only come into play if one chooses the lowest-exponent preference, I'll assume a GTX 770, which does the least TF before LL becomes a more profitable use of the card. For those looking to maximize their GHz-days/day, the optional bit level to factor to and the exponent range can always be used to get suitable work. |
Hi,
[QUOTE=Mark Rose;389163]When I was looking at the code, which was a while ago, it seemed to me that the square operation was where the most operations were saved with the barrett76 kernel versus the others. Is there a significant difference in the number of operations elsewhere in the barrett76 kernel? I ask only to understand better![/QUOTE] mfaktc evolution // basic ideas:[LIST=1][*]barrett_92 is "basic barrett" with full 96/192 bit integers[*]barrett_79 is like barrett_92 with a [B]fixed[/B] (2[SUP]160[/SUP]) value for the scaled integer inverse, this[LIST][*]saves some multiword shifts (compare both kernels) because 2[SUP]160[/SUP] = 2[SUP]5*32[/SUP] is easy to shift on 32-bit words[*]allows multiple bit levels at once[*]reduces the upper limit for fixed-size integers (compared to barrett_92)[/LIST][*]reduced accuracy in interim steps[SUP]*1[/SUP][LIST][*]barrett_87/88 are barrett_92 with less accuracy in interim steps[*]barrett_76/77 are barrett_79 with less accuracy in interim steps[/LIST][/LIST] [SUP]*1[/SUP]Less accuracy as in: "n mod f == x + <small integer> * f", e.g. 1234 mod 10 = 24 (instead of 4) Oliver |
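A plain-Python sketch of the scheme Oliver outlines (names and structure are mine; the real kernels split every value into 32-bit words, which is where the kernel zoo comes from):

```python
SHIFT = 160  # fixed scale: 2^160 = 2^(5*32) shifts cheaply on 32-bit words

def barrett_reduce(n, f, exact=True):
    mu = (1 << SHIFT) // f      # scaled integer inverse, precomputed per f
    q = (n * mu) >> SHIFT       # estimate of n // f, low by at most a little
    r = n - q * f               # r == (n mod f) + k*f for a small k >= 0
    if exact:
        while r >= f:           # full-accuracy (barrett_92-style) correction
            r -= f
    return r
```

The reduced-accuracy kernels essentially skip part of that final correction and carry r = x + k*f through the interim steps (Oliver's "1234 mod 10 = 24" example); only the very last comparison has to account for the leftover multiples of f.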
Thank you!
That gives me the insight needed to study the code further :) |
[QUOTE=Prime95;389091]That might be a good idea -- or I could delete the gpu info completely and assume the lowest crossovers since 99% of the time the information will not be used. Or see below for where I might use this info more often...
[/QUOTE] I think using the lowest crossovers makes sense at least until we are reasonably confident that we can factor everything to that level. We are not at that point yet. |
[QUOTE=James Heinrich;389160]Same idea I had, I just did the same kind of benchmark on M99999989 on a GTX 570:[code]66,67 = 274.5
67,68 = 274.5
68,69 = 278.8
69,70 = 280.2
70,71 = 279.9
71,72 = 280.0
72,73 = 280.0
73,74 = 276.9
74,75 = 276.9
75,76 = 276.9
76,77 = 268.0
77,78 = 270.7
78,79 = 270.6
79,80 = 270.6[/code][/QUOTE] Same exponent on my R9 285 (AMD GCN)... [code]66,67 = 433.9
67,68 = 433.9
68,69 = 433.9
69,70 = 407.6
70,71 = 369.1
71,72 = 369.1
72,73 = 369.1
73,74 = 357.3
74,75 = 329.1
75,76 = 327.7
76,77 = 327.7
~Not much more variation until past 82 bits~[/code] |
[QUOTE=Prime95;389174]I'm anxious to see similar numbers for AMD VLIW and GCN cards using the soon-to-be-released mfakto.
[/QUOTE] Here are a few results that I received for the test version. It shows the best kernel per bit level for a real GPU-sieve run of about 3 seconds per test.

This is VLIW5 (HD6550D from an APU): [code]Resulting speed for M66362159:
bit_min - bit_max   GHz-days/day   kernelname
60 - 64    39.993   cl_barrett15_69_gs
64 - 76    41.567   cl_barrett32_76_gs
76 - 77    39.926   cl_barrett32_77_gs
77 - 87    39.404   cl_barrett32_87_gs
87 - 88    36.753   cl_barrett32_88_gs
88 - 92    35.173   cl_barrett32_92_gs[/code]This is a first-generation GCN with 1:4 DP (HD7950): [code]Resulting speed for M66362159:
bit_min - bit_max   GHz-days/day   kernelname
60 - 69   499.674   cl_barrett15_69_gs
69 - 70   476.535   cl_barrett15_71_gs
70 - 73   427.422   cl_barrett15_73_gs
73 - 74   412.430   cl_barrett15_74_gs
74 - 82   378.749   cl_barrett15_82_gs
82 - 83   354.878   cl_barrett15_83_gs
83 - 87   330.658   cl_barrett32_87_gs
87 - 88   327.284   cl_barrett15_88_gs
88 - 92   289.456   cl_barrett32_92_gs[/code]This is a newer GCN with 1:16 DP (R285): [code]Resulting speed for M66362159:
bit_min - bit_max   GHz-days/day   kernelname
60 - 69   475.043   cl_barrett15_69_gs
69 - 70   443.636   cl_barrett15_71_gs
70 - 73   402.419   cl_barrett15_73_gs
73 - 74   389.251   cl_barrett15_74_gs
74 - 82   334.707   cl_barrett15_82_gs
82 - 83   313.099   cl_barrett15_83_gs
83 - 87   294.592   cl_barrett32_87_gs
87 - 88   291.010   cl_barrett15_88_gs
88 - 92   258.739   cl_barrett32_92_gs[/code]This is the new top-level GCN with improved int32 math (R290x): [code]Resulting speed for M66362159:
bit_min - bit_max   GHz-days/day   kernelname
60 - 69   757.628   cl_barrett15_69_gs
69 - 76   749.778   cl_barrett32_76_gs
76 - 77   720.362   cl_barrett32_77_gs
77 - 79   645.730   cl_barrett32_79_gs
79 - 87   642.553   cl_barrett32_87_gs
87 - 88   614.766   cl_barrett32_88_gs
88 - 92   565.309   cl_barrett32_92_gs[/code]And finally, this is Intel HD4600: [code]Resulting speed for M66362159:
bit_min - bit_max   GHz-days/day   kernelname
60 - 64    15.081   cl_barrett15_69_gs
64 - 76    19.707   cl_barrett32_76_gs
76 - 77    19.345   cl_barrett32_77_gs
77 - 87    17.208   cl_barrett32_87_gs
87 - 88    16.816   cl_barrett32_88_gs
88 - 92    14.507   cl_barrett32_92_gs[/code]The exponent size also has some influence on the performance, for example on the R290x: [code]M2000093:      69 - 76   942.631   cl_barrett32_76_gs
M39000037:     69 - 76   749.779   cl_barrett32_76_gs
M66362159:     69 - 76   749.778   cl_barrett32_76_gs
M74000077:     69 - 76   721.255   cl_barrett32_76_gs
M78000071:     69 - 76   720.259   cl_barrett32_76_gs
M332900047:    69 - 76   667.219   cl_barrett32_76_gs
M999900079:    69 - 76   645.028   cl_barrett32_76_gs
M2001862367:   64 - 76   621.290   cl_barrett32_76_gs
M4201971233:   69 - 76   602.682   cl_barrett32_76_gs[/code] |
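Read as a lookup table, each list above maps a target bit level to the cheapest kernel that can reach it. A sketch of that selection (data copied from the HD7950 figures above; the function is my paraphrase of the idea, not mfakto's actual code):

```python
# (highest bit level the kernel is used for, kernel name), HD7950 figures
KERNELS = [
    (69, "cl_barrett15_69_gs"),
    (70, "cl_barrett15_71_gs"),
    (73, "cl_barrett15_73_gs"),
    (74, "cl_barrett15_74_gs"),
    (82, "cl_barrett15_82_gs"),
    (83, "cl_barrett15_83_gs"),
    (87, "cl_barrett32_87_gs"),
    (88, "cl_barrett15_88_gs"),
    (92, "cl_barrett32_92_gs"),
]

def kernel_for(bit_max):
    """Pick the narrowest kernel whose range covers bit_max; wider
    kernels would work too but run slower, as the GHz-days/day column shows."""
    for limit, name in KERNELS:
        if bit_max <= limit:
            return name
    raise ValueError(f"no kernel reaches 2^{bit_max}")
```

This is also why the throughput steps land where they do: crossing 74 bits, for example, forces a switch from the 74-bit kernel to the wider (and slower) 82-bit one.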
1 Attachment(s)
[QUOTE=Bdot;389225]The exponent size also has some influence on the performance, for example R290x[/QUOTE]
I do take this into account by assuming the first 7 bits of the exponent are "free" -- multiplying the TF cost by (ceil(log2(exponent)) - 7). The full SQL code currently in use is attached.
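Stated as code, the scaling George describes looks like this (a sketch of the formula from the post only; the attached SQL is the authoritative version, and the result is a relative cost, not GHz-days credit):

```python
import math

def relative_tf_cost(exponent, bit_level):
    """Relative cost of TF from 2^(bit_level-1) to 2^bit_level.
    The candidate count scales as 2^bit_level / exponent (factors have
    the form 2kp+1), and each candidate's powmod costs about one squaring
    per exponent bit, with the first 7 bits taken as "free"."""
    candidates = 2.0 ** bit_level / exponent
    return candidates * (math.ceil(math.log2(exponent)) - 7)
```

Each extra bit level exactly doubles the candidate count, so the per-level cost doubles too; the (ceil(log2(exponent)) - 7) factor is what makes each candidate of a 100M-digit exponent individually more expensive.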
[QUOTE=Bdot;389225]Here are a few results that I received for the test version.[/QUOTE]
Thanks. Some of the GCN cards do show significant drops as bit levels increase. |
[QUOTE=Prime95;389025]I'm working on the Primenet GPU TF assignment page. It works much like the GPU72 page.
Two questions:

1) The system is designed to work one bit level at a time. Will this create some work units that are just too short? Should I modify the assignments to do multiple bit levels if the current bit level is somewhat low (or should I leave this under user control by filling out the "optional will factor to" field)? If automatic, what are the recommended minimum bit levels to factor to for a DC, LL, and 100M?

2) What are the current TF/LL crossovers for GPUs on 100M digit numbers?[/QUOTE] Great, does that mean we are finally going to get rid of the GPU work showing up as status "U", with each card properly recognised as a processor in its own right (the CPUs table on the My Account page), and we'll be able to drill down into it as you can for CPUs? |
[QUOTE=Gordon;389367]Great, does that mean we are finally going to get rid of the GPU work showing up as status "U", with each card properly recognised as a processor in its own right (the CPUs table on the My Account page), and we'll be able to drill down into it as you can for CPUs?[/QUOTE]
No plans along those lines yet. |
Nah. If we have GPU's recognizable as hardware on Primenet then I won't be working at 1900% capacity anymore :sad:
|
[QUOTE=TheMawn;389381]Nah. If we have GPU's recognizable as hardware on Primenet then I won't be working at 1900% capacity anymore :sad:[/QUOTE]
My last 24 hours - 2720. Must be some on here who are up in the 10,000% range... |
What is more valuable ATM...
....in terms of CPU time: LL DC or P-1? I am currently running 50-50 on an FX-8350. Having played around with different combinations, I have a notion that this 'shared FPU' design works better if you put the same kind of work on each pair (1-2, 3-4, 5-6, 7-8), in which every two integer cores share an FPU.
I also have 32 GB of RAM. What is in more demand? |
1) DCs are lagging first time LLs by an ever increasing amount.
2) First time LLs are the only type of work that can reveal a new prime.
3) 32 GB is great for P-1 (more than one test simultaneously).

Sounds conflicting, doesn't it? :confused: If I were you, taking into account 1) and 3) above, I would go for a mix of DCs and P-1s, also considering that P-1 will help advance first-time LL work by sometimes finding new factors and, in any case, by offloading work from first-time testers (no more need to perform a P-1 test before starting the actual LL). All in all, do as you see fit... all contributions are valuable! |
1 Attachment(s)
[QUOTE=Gordon;389390]My last 24 hours - 2720. Must be some on here who are up in the 10,000% range...[/QUOTE]:mike:
|
[QUOTE=Xyzzy;389458]:mike:[/QUOTE] :max:
|