#1838
"Oliver"
Mar 2005
Germany
11×101 Posts
Quote:
This seems to occur when the exponent has consecutive 1s in its binary representation (which triggers the "optional multiply by 2"). I'm not sure whether this is the only cause. The kernel works absolutely safely for FCs up to 2^76, so there is a very high chance that mfaktc 0.19 will feature a new kernel: barrett76_mul32. I need to check on my CC 1.3 GPU, too. I guess this will be the fastest kernel for those old GPUs as well.

George: I want to test the current code on my GTX 275 this evening; after that I'll send you the new code (which features some debugging code, too).

Oliver
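For readers unfamiliar with the "optional multiply by 2": a minimal Python sketch, assuming the usual left-to-right square-and-double way of computing 2^p mod f. It is not mfaktc's kernel code; `pow2_mod` and `longest_run_of_ones` are hypothetical names, and Python's big integers reduce fully at every step, so the sketch only shows where the doubling sits, not the overflow itself.

```python
def pow2_mod(p: int, f: int) -> int:
    """Compute 2**p mod f by scanning the bits of p from the top."""
    result = 1
    for bit in bin(p)[2:]:                 # most significant bit first
        result = (result * result) % f     # always square
        if bit == "1":
            result = (result * 2) % f      # the "optional multiply by 2"
    return result


def longest_run_of_ones(p: int) -> int:
    """Length of the longest run of consecutive 1 bits in p (p > 0)."""
    return max(len(run) for run in bin(p)[2:].split("0"))


# f divides 2**p - 1 exactly when 2**p mod f == 1.
is_factor = pow2_mod(11, 23) == 1          # 2**11 - 1 = 2047 = 23 * 89
```

An exponent with a long run of 1 bits takes the doubling branch on many steps in a row, which is the pattern described above.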
#1839
P90 years forever!
Aug 2002
Yeehaw, FL
2·3²·419 Posts
Quote:
We are multiplying a 3 word value (floor(2^160/f)) by a 5 word value and ignoring the 5 bottom words of the result. By my reckoning the big multiply is ignoring 6 partial results in the 4th word, which could generate 5 carries. Also, accounting for the error introduced by the floor function adds another possible carry. Thus, the quotient can be off by up to 6. The doubling gives us off by 12, which means we need 4 pad bits, just as Oliver observed.
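The error count above can be checked numerically. A rough Python model, assuming 32-bit words and a multiply that simply drops every partial result landing in the five bottom words; this is a sketch of the reckoning, not the actual kernel, and `words`, `truncated_quotient` and `max_quotient_error` are made-up names.

```python
import random

MASK = (1 << 32) - 1   # one 32-bit word


def words(x, n):
    """Split x into n 32-bit words, least significant first."""
    return [(x >> (32 * i)) & MASK for i in range(n)]


def truncated_quotient(a, u):
    """Estimate (a*u) >> 160, ignoring all partial results in words 0..4."""
    kept = 0
    for i, ai in enumerate(words(a, 5)):
        for j, uj in enumerate(words(u, 3)):
            prod = ai * uj
            lo, hi = prod & MASK, prod >> 32
            if i + j >= 5:                       # low half lands in word i+j
                kept += lo << (32 * (i + j - 5))
            if i + j + 1 >= 5:                   # high half lands in word i+j+1
                kept += hi << (32 * (i + j + 1 - 5))
    return kept


def max_quotient_error(trials=2000, seed=1):
    """Largest observed gap between the true quotient and the estimate."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        f = rng.randrange(1 << 75, 1 << 76) | 1  # odd 76-bit factor candidate
        a = rng.randrange(f * f)                 # value to reduce, fits in 5 words
        u = (1 << 160) // f                      # floor(2^160/f), fits in 3 words
        err = a // f - truncated_quotient(a, u)
        assert 0 <= err <= 6                     # 5 lost carries + 1 from floor
        worst = max(worst, err)
    return worst
```

Random inputs rarely reach the worst case, so `max_quotient_error()` usually reports less than 6; the assertion inside the loop is the bound being checked. Doubling an error of at most 6 gives at most 12, which is below 2^4 = 16.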
#1840
"Mike"
Aug 2002
20065₈ Posts
Due to a recent hike in our electricity rate, too much current draw, inadequate branch circuits and an inadequate central air cooling system, we are going to drop out of trial factoring with our GPUs.

This decision was made two months ago, but we were apparently in denial (not the river in Egypt!), believing that the bills we were receiving were anomalies. We plan to remove the GPUs today and put the boxes on P-1, using the resources at gpu72.com, of course. This will drop our electrical usage by about half.

So, we have four GPUs to sell, cheap. http://www.newegg.com/Product/Produc...82E16814121432

$200 each, plus shipping and insurance. They do not have a warranty, but they have run 24×7 for a long time with no issues. Be aware that each GPU takes up three slots! If you are interested in one or more of these, please PM us.
#1841
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
10010100011001₂ Posts
These are good ones! :tempted:
#1842
"Kieren"
Jul 2011
In My Own Galaxy!
27AE₁₆ Posts
I would have jumped on one of these in an instant had I not just gotten a Gigabyte 570 off eBay for about the same price.
#1843
Mar 2003
Melbourne
5×103 Posts
Noooooooooo....
Who's going to keep me company at the top? :)

BTW, I might be moving in Sept. There may be some downtime for me coming up.

-- Craig
#1844
"Kieren"
Jul 2011
In My Own Galaxy!
2×3×1,693 Posts
#1845
"Oliver"
Mar 2005
Germany
11·101 Posts
Hi all,
here is a small teaser for mfaktc 0.19.

RAW GPU performance, CUDA 4.2, stock GTX 470 (1215MHz):

mfaktc 0.18:
Code:
kernel | M66362159 above 2^64 | M3321932839 above 2^64
-------+----------------------+-----------------------
71bit  |       106.0M/s       |        81.6M/s
75bit  |       200.0M/s       |       156.2M/s
95bit  |       160.2M/s       |       124.8M/s
76bit  |         n.a.         |          n.a.
79bit  |       335.4M/s       |       262.1M/s
92bit  |       267.7M/s       |       211.2M/s

mfaktc 0.19:
Code:
kernel | M66362159 above 2^64 | M3321932839 above 2^64
-------+----------------------+-----------------------
71bit  |       106.0M/s       |        81.5M/s
75bit  |       214.7M/s       |       168.1M/s
95bit  |       169.5M/s       |       132.2M/s
76bit  |       424.7M/s       |       334.5M/s
79bit  |       343.5M/s       |       268.1M/s
92bit  |       276.4M/s       |       217.8M/s

Most of the 75bit, 95bit, 79bit and 92bit improvement is related to the optimizations of the squaring function (thank you, George!). I guess that older GPUs (CC 1.x) won't see any improvement. The new 76bit barrett kernel is nice: take the 79bit barrett kernel, (re-)move some lines of code and you're mostly done.

For the future it might be possible to add more kernels:
Release plan:
Oliver

Last fiddled with by TheJudger on 2012-08-07 at 16:36
#1846
Jul 2012
Saarland / Germany
2²·17 Posts
cool !
#1847
"Kieren"
Jul 2011
In My Own Galaxy!
2×3×1,693 Posts
#1848
"Mike"
Aug 2002
8245₁₀ Posts
Quote:
If we take the purchase price for the four GPUs (\$1320) and subtract what we sold them for (\$800), we have a net cost of about \$520. We think (?) we used them for about 400,000 GHz-days of work, so our cost, not counting the host computers (which are still happily churning away), is 0.13¢ per GHz-day. (Our math might be wrong.) That seems like a reasonable ROI.
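Since the post invites a check of the math, here is the arithmetic as a small Python snippet, using only the figures quoted above (the function name is illustrative, and host computers and electricity are left out, as in the post).

```python
def cents_per_ghz_day(purchase_usd, resale_usd, ghz_days):
    """Net hardware cost in US cents per GHz-day of work done."""
    net_usd = purchase_usd - resale_usd      # 1320 - 800 = 520
    return net_usd / ghz_days * 100          # dollars -> cents


cost = cents_per_ghz_day(1320, 800, 400_000)
print(f"{cost:.2f} cents per GHz-day")
```

The result agrees with the 0.13¢ figure in the post.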