344.65 has yet to dump on me. My suspicions about 344.75 are solidifying.
|
Our work boxes are not connected to the Internet. We have a USB key setup with a Windows 7 installer and the few files necessary to set up a barebones CUDA environment. Once a particular program or driver has proved itself to be stable we have no intention of ever replacing it. We can replicate our install from scratch in under 30 minutes. (The configuration files are already built on the USB key.) We have every step written down on a piece of paper and we do the absolute minimum "tweaking" possible to ensure a stable system.
|
[QUOTE=kladner;388258]Has anyone else noticed a change in behavior with 344.75?[/QUOTE]
I upgraded to 344.75 and haven't noticed any problems running mfaktc on my dual GTX 560s. |
[QUOTE=kladner;388258]I have a bit of suspicion about GeForce driver '344.75-desktop-win8-win7-winvista-64bit-international-whql'. Since I installed it, I have had a few instances of finding mfaktc not running. When I restart it, I find that the 580 has locked down to 400 MHz, requiring a reboot. It must be a couple of years since I saw such a problem.[/QUOTE]
Just as a heads-up, you can disable/re-enable the video card under device manager and accomplish the same thing most of the time. Saves the headache of having to restart the entire computer when you push the video card a little too far (as I often do...). |
[QUOTE=wombatman;388332]Just as a heads-up, you can disable/re-enable the video card under device manager and accomplish the same thing most of the time. Saves the headache of having to restart the entire computer when you push the video card a little too far (as I often do...).[/QUOTE]
Thanks for the tip! :smile: |
I found a "bug".
While double-checking some trial factoring assignments in another thread, mfaktc picked 75bit_mul32 for Factor=N/A,54820379,63,66 instead of the usual barrett76_mul32_gs. It ran at about 1/7th the speed of 64,66 assignments since it was sieving on the CPU. I think the default should be to always GPU-sieve, or, if the CPU sieve is to be used, to split the assignment at the crossover point. |
Apparently there are new versions approaching.
In the Green camp TheJudger, in the Red (and the rest) camp BDot. I am aware of some negative views on the Prime95 Throttle mechanism, but I do rely on having that for a notebook mounted atop a fan. Prime95 is deployed, sometimes with throttle, when using mfaktc would be unsustainable in the current AU summer conditions. I have to actively adjust load based on weather forecasts to prevent error exits from mfaktc. Would a similar style of throttle facility be feasible for mfaktx? Even better if it could track system temperature! P.S. I am aware of discussion in another thread about faulty notebook results, yet surely better if high temperature conditions can be managed. |
Since you started factoring at 2[sup]63[/sup] the CPU sieve must be used, at least for 2[sup]63[/sup]-2[sup]64[/sup]. I suspect that you may have "stages=0" set in mfaktc.ini which [i]prevents[/i] the assignment from being split at the crossover point.
|
[QUOTE=James Heinrich;393312]Since you started factoring at 2[sup]63[/sup] the CPU sieve must be used, at least for 2[sup]63[/sup]-2[sup]64[/sup]. I suspect that you may have "stages=0" set in mfaktc.ini which [i]prevents[/i] the assignment from being split at the crossover point.[/QUOTE]
Nope. mfaktc is splitting assignments just fine above 67. |
In any case it shouldn't be an issue for most people, since the whole PrimeNet range is long since factored well beyond 2[sup]64[/sup] -- even up to M2[sup]32[/sup] should be finished within 6 months or so. If you're redoing old TF for some reason, just split your assignments manually for now :smile:
|
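Splitting manually is easy to script. Here is a minimal sketch (the `split_at_crossover` helper and its default of 64 bits are purely illustrative, not part of mfaktc or any PrimeNet tooling) that breaks a worktodo `Factor=` line at the CPU/GPU-sieve boundary:

```python
def split_at_crossover(line, crossover=64):
    """Split a worktodo Factor= line at the given bit level.

    E.g. Factor=N/A,54820379,63,66 becomes two entries:
    63-64 (CPU-sieve kernels) and 64-66 (GPU-sieve kernels).
    """
    parts = line.strip().split(",")
    lo, hi = int(parts[-2]), int(parts[-1])
    if lo < crossover < hi:
        below = ",".join(parts[:-2] + [str(lo), str(crossover)])
        above = ",".join(parts[:-2] + [str(crossover), str(hi)])
        return [below, above]
    return [line.strip()]  # nothing to split

print(split_at_crossover("Factor=N/A,54820379,63,66"))
```

Feeding the two resulting lines to worktodo.txt one after the other has the same effect as the automatic split would.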
[QUOTE=James Heinrich;393316]In any case it shouldn't be an issue for most people, since the whole PrimeNet range is long since factored well beyond 2[sup]64[/sup] -- even up to M2[sup]32[/sup] should be finished within 6 months or so. If you're redoing old TF for some reason, just split your assignments manually for now :smile:[/QUOTE]
Indeed. I'm not really worried about it. It's the first time I've done work at bit levels below 68 :D |
[QUOTE=Mark Rose;393245]I found a "bug".
While double checking some trial factoring assignments in another thread, mfaktc picked 75bit_mul32 for Factor=N/A,54820379,63,66 instead of the usual barrett76_mul32_gs. It ran about 1/7th the speed that 64,66 assignments run since it was sieving on the CPU. I think the default should be to always GPU sieve, or if the CPU sieve is to be used, to split the assignment at the cross over point.[/QUOTE] Works as designed. :smile: No barrett-based kernel in mfaktc 0.20 is able to handle FCs below 2[SUP]64[/SUP], so it isn't a matter of GPU or CPU sieving. Only the "old" schoolbook-division kernels can do lower FCs, and in 0.20 none of these kernels is GPU-sieve enabled. This will change in 0.21. One could try to improve the automatic splitting of bitlevel, of course.
Oliver |
[QUOTE=TheJudger;393348]One could try to improve the automatic splitting of bitlevel, of course.[/QUOTE]That's the key. I'm not sure what the current splitting logic is but if it's relatively easy it would be great to forcibly split the assignment at [color=blue][i]CPU/GPU-sieving cutoff[/i][/color] (currently 2[sup]64[/sup]), at least if stages=1.
Certainly not a big issue considering everything (in PrimeNet at least) is long since done beyond 2[sup]64[/sup], but I don't think it would be a bad idea to add it anyways if the code change is simple. |
[QUOTE=James Heinrich;393349]
everything (in PrimeNet at least) is long since done beyond 2[sup]64[/sup][/QUOTE] Well, not quite... There are still more than 70K numbers trial factored to 64 bits or less. Granted, the exponents are all < 5M, but still some people are working on them and finding many new factors :geek:. GPU sieving would be a most welcome helping hand, especially if it could be used to TF under 1M as well. |
[QUOTE=lycorn;393366]Well, not quite... There are stiil more than 70K numbers trial factored to 64 bits or less. Granted the exponents are all < 5M...[/QUOTE]Fair point, I stand corrected.
74759 unfactored exponents as I count them, and 319 of them are over 5M (between [url=http://mersenne.ca/M5215801]5,215,801[/url] and [url=http://www.mersenne.ca/M5325181]5,325,181[/url]):[code]+----------+--------+
| count(*) | mrange |
+----------+--------+
|    16976 |      0 |
|    15886 |      1 |
|    21636 |      2 |
|    16963 |      3 |
|     2979 |      4 |
|      319 |      5 |
+----------+--------+[/code] |
[QUOTE=James Heinrich;393374]Fair point, I stand corrected.
74759 unfactored exponents as I count them, and 319 of them are over 5M (between [url=http://mersenne.ca/M5215801]5,215,801[/url] and [url=http://www.mersenne.ca/M5325181]5,325,181[/url]):[/QUOTE] [URL="http://www.mersenne.org/report_factoring_effort/?exp_lo=5215801&exp_hi=5325181&bits_lo=1&bits_hi=64&tftobits=72"]Those have been factored up to 2^65.[/URL] I guess it didn't get passed to mersenne.ca. |
Jayder is right.
If you query Primenet directly, via Reports -> Detailed Reports -> Factoring Limits, you'll get (as of 09:48 UTC):
[code]61 bits - 13250
62 bits -  3257
63 bits - 53708
64 bits -   188
Total   - 70403[/code]The highest is 4699963, TFed to 63 bits. |
Since this falls under the hardware heading, I thought I'd drop it here, with a shout out to LaurV.
[URL="http://www.ebay.com/itm/2-X-EVGA-GTX-580-FTW-HYDRO-1-with-additional-Koolance-liquid-cooling-block-/291363624541?pt=LH_DefaultDomain_0&hash=item43d6a0165d"]Here[/URL], and [URL="http://www.ebay.com/itm/EVGA-GTX-580-FTW-HYDRO-with-Koolance-liquid-cooling-block-installed-/301502794514?pt=LH_DefaultDomain_0&hash=item4632f78b12"]here[/URL] are two pairs of EVGA water cooled 580s. :razz: Just saying..... |
[QUOTE=kladner;393482]with a shout out to LaurV. [URL="http://www.ebay.com/itm/2-X-EVGA-GTX-580-FTW-HYDRO-1-with-additional-Koolance-liquid-cooling-block-/291363624541?pt=LH_DefaultDomain_0&hash=item43d6a0165d"]Here[/URL], and [URL="http://www.ebay.com/itm/EVGA-GTX-580-FTW-HYDRO-with-Koolance-liquid-cooling-block-installed-/301502794514?pt=LH_DefaultDomain_0&hash=item4632f78b12"]here[/URL] are two pairs of EVGA water cooled 580s. :razz:
Just saying.....[/QUOTE] There is only one pair, sold by the same guy, once piece by piece (at one link) and then both together (second link). But the same pair. OTOH, I still have one pair of 580's full water cooled in the drawers, under the desk, for which no mobo/cpu/box is yet available (have the 1.2KW power supply, new, tho). If I buy one more mobo/box I risk that mrs LaurV moves me into the garden... I am seriously thinking to give them away, if someone can find a good home for them. By good home I mean cheap electricity, and TF or LL production for at least a while.... |
[QUOTE=LaurV;393486]There is only one pair, sold by the same guy, once piece by piece (at one link) and then both together (second link). But the same pair.
OTOH, I still have one pair of 580's full water cooled in the drawers, under the desk, for which no mobo/cpu/box is yet available (have the 1.2KW power supply, new, tho). If I buy one more mobo/box I risk that mrs LaurV moves me into the garden... I am seriously thinking to give them away, if someone can find a good home for them. By good home I mean cheap electricity, and TF or LL production for at least a while....[/QUOTE] I had a 580 which died; I haven't replaced it yet, and when I do it won't be another 580.
580 - 244W - 433 GHz-days/day
980 - 165W - 523 GHz-days/day
I could even run two 960's for the electricity of one 580 and get an extra 80 GHz-days/day. Oh yes, forgot to mention the 580 sounds like a jet aircraft taking off :geek: and they are all quite old now... |
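For comparison's sake, the throughput-per-watt gap Gordon describes (reading his "days/day" as GHz-days/day, and using only the figures from his post) works out like this in a quick sketch:

```python
# Throughput figures quoted in the post above: (GHz-days/day, watts)
cards = {
    "GTX 580": (433, 244),
    "GTX 980": (523, 165),
}

for name, (ghzd, watts) in cards.items():
    # Efficiency: trial-factoring credit per watt of card power
    print(f"{name}: {ghzd / watts:.2f} GHz-d/d per watt")
```

By that measure the 980 does roughly 1.8x the work per watt of the 580, before even considering noise or age.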
[QUOTE=Gordon;393505]980 - 165w - 523 days/day[/QUOTE]You should be able to get ~550 to 625 GHz-days per day with a 980.
We have posted wattage and performance figures here: [URL]http://www.mersenneforum.org/showthread.php?t=19910[/URL] |
[QUOTE=LaurV;393486]There is only one pair, sold by the same guy, once piece by piece (at one link) and then both together (second link). But the same pair.
OTOH, I still have one pair of 580's full water cooled in the drawers, under the desk, for which no mobo/cpu/box is yet available (have the 1.2KW power supply, new, tho). If I buy one more mobo/box I risk that mrs LaurV moves me into the garden... I am seriously thinking to give them away, if someone can find a good home for them. By good home I mean cheap electricity, and TF or LL production for at least a while....[/QUOTE] I've always wanted to try water cooling... |
Compute Capability 3.[B][COLOR="Red"]7[/COLOR][/B] anyone? Seems I was right, Tesla K80 is 3.7. So K80 is not Titan Z?!
K80 has two chips with 2496 cores each (13 * 192); AFAIK the chip doesn't have more, but I'm not sure. At least it shouldn't be a GK110(b) like Tesla K20/K40, GTX 780 (Ti), GTX Titan (Black).
[CODE]CUDA device info
  name                      Tesla K40m
  compute capability        3.5
  max threads per block     1024
  max shared memory per MP  49152 byte
  number of multiprocessors 15
  CUDA cores per MP         192
  CUDA cores - total        2880[/CODE]
vs.
[CODE]CUDA device info
  name                      Tesla K80
  compute capability        3.7
  max threads per block     1024
  max shared memory per MP  114688 byte
  number of multiprocessors 13
  CUDA cores per MP         192
  CUDA cores - total        2496[/CODE]
More shared memory and (not visible in mfaktc output) double the registers per multiprocessor. Shared memory is 64kiB - 16kiB = 48kiB for 3.5 and 128kiB - 16kiB = 112kiB for 3.7. So either they had those extra registers and shared memory sitting unused all this time in other products, or they really built a new chip just for the Tesla K80. Since chipmakers tend to build as few chips as possible, this makes me wonder... AFAIK nvidia sells[LIST][*]lots of Geforces[*]some Quadros[*]very few Teslas[/LIST](relative numbers) And the new chip is a Kepler architecture, not Maxwell... I'm really curious!
Oliver |
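The arithmetic behind Oliver's comparison is easy to sanity-check; a quick sketch using only the figures from the mfaktc output quoted above (the 16 KiB deduction is the one Oliver describes):

```python
KIB = 1024

# Per-chip figures from the mfaktc device info quoted above
k40 = {"sm_count": 15, "cores_per_sm": 192, "shmem_bytes": 49152}
k80 = {"sm_count": 13, "cores_per_sm": 192, "shmem_bytes": 114688}

for name, dev in (("Tesla K40m", k40), ("Tesla K80 (one of two chips)", k80)):
    total_cores = dev["sm_count"] * dev["cores_per_sm"]
    shmem_kib = dev["shmem_bytes"] // KIB
    print(f"{name}: {total_cores} cores, {shmem_kib} KiB shared memory per SM")
```

So the K80 chip trades two multiprocessors against more than double the usable shared memory per SM (48 KiB vs 112 KiB), which is indeed hard to explain as a cut-down GK110.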
I extracted mfaktc 0.20 to a directory. As far as I can tell there is no need to recompile any source. If I enter
[code]./mfaktc.exe -h[/code]it returns
[code]error while loading shared libraries: libcudart.so.4: cannot open shared object file: No such file or directory[/code]That file is in the subdirectory 'lib'. What is the advised way to proceed? |
[QUOTE=tha;393855]I extracted mfaktc 0.20 to a directory. As far as I can tell there is no need to recompile any source. If I enter
./mfaktc.exe -h it returns error while loading shared libraries: libcudart.so.4: cannot open shared object file: No such file or directory That file is in the subdirectory 'lib'. What is the advised way to proceed?[/QUOTE] Try running
[code]sudo ldconfig[/code]If that fails, make sure that libcudart.so.4 is found in one of the directories listed in the files under /etc/ld.so.conf.d; if not, add the missing directory and run sudo ldconfig again. |
tha:
[CODE]LD_LIBRARY_PATH="./lib/" ./mfaktc.exe -h[/CODE] |
works like a charm.
|
- What does the -v(erbose) option do, I did not notice a difference between 0 and 1?
- How do I limit the output to say once per 20% progress instead of 0.1%? |
[QUOTE=TheJudger;393851]Compute Capability 3.[B][COLOR="Red"]7[/COLOR][/B] anyone? Seems I was right, Tesla K80 is 3.7. So K80 is not Titan Z?![/QUOTE]I've added some Teslas to my [url=http://www.mersenne.ca/mfaktc.php]mfaktc performance chart[/url]. Assuming mfaktc performance is the same between Compute 3.5 and 3.7, the K80 comes out slightly ahead of the Titan-Z for only 300% of the price :smile:
Since you seem curious about it, [url=http://www.anandtech.com/show/8729]this article[/url] covers the K80 specifically. |
[QUOTE=tha;393867]- What does the -v(erbose) option do, I did not notice a difference between 0 and 1?[/QUOTE]
Changes the verbosity (NOT the per-class status line). [QUOTE=tha;393867]- How do I limit the output to say once per 20% progress instead of 0,1%?[/QUOTE] Actually it is every ~0.1041667% (exactly 1/960), but that's not the point. You can't change it without code changes.
Oliver |
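The 1/960 interval comes from mfaktc's class scheme: with "more classes" it partitions factor candidates by k mod 4620 (4620 = 4·3·5·7·11) and skips every class where f = 2kp+1 would always be divisible by 3, 5, 7, or 11, or could not be ±1 mod 8 (a requirement for Mersenne factors). Exactly 960 classes survive, hence one status line per 1/960 of the bit level. A short sketch can confirm the count (the exponent is the one from earlier in this thread; any prime exponent > 11 gives the same result):

```python
p = 54820379  # exponent from earlier in the thread

def is_live_class(k, p):
    """A class k (mod 4620) survives mfaktc's class sieve if its factor
    candidates f = 2kp+1 can be prime Mersenne factors:
    f = +/-1 (mod 8) and f not divisible by 3, 5, 7 or 11."""
    f = 2 * k * p + 1
    return f % 8 in (1, 7) and all(f % q != 0 for q in (3, 5, 7, 11))

live = sum(is_live_class(k, p) for k in range(4620))
print(live)  # 960 classes -> one status line per ~0.1041667% of progress
```

The closed form is 4620 · (1/2) · (2/3) · (4/5) · (6/7) · (10/11) = 960.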
1 Attachment(s)
Now that I am once again running a GTX 580 and a 570, I am reminded of an odd-seeming phenomenon. In the attached shot, one can see that the 570 actually starts a bit faster than the 580 (it is clocked higher). In the first few minutes it loses about 20 GHz-d/d, while the 580 holds fairly steady. This is without P95 running, which takes about 5 GHz-d/d off both processors. That I can live with, though it is puzzling, as mfaktc instances seem to use no more than 0.06% CPU time.
The more substantial reduction on the secondary card (570) bothers me more. Both cards are reported running at 16x PCIe. The 570 runs ~10 C cooler than the 580, (70 vs. 80 C), so it doesn't seem that heat should be involved. Is this a known behavior of which I have missed the discussion? Could it be specific to the GTX 570 architecture? I'd love to get that 20 GHz-d/d full time, if I can. |
1 Attachment(s)
Something similar happens with the 570 running solo. The GHz-d/d stabilized at ~434 with Sieve Primes at both 82485 (the default) and 75061.
|
Damn... I gotta look into overclocking my 580's.
|
[QUOTE=kladner;394286]Is this a known behavior of which I have missed the discussion? Could it be specific to the GTX 570 architecture? I'd love to get that 20 GHz-d/d full time, if I can.[/QUOTE]
I have never met that phenomenon. I "cycled" many 580's through my hands, and a few 570's too, and currently have 4 pieces of 580's, all factory overclocked at 781MHz, all water cooled, running very stable at 428-433 GHzD/D with P95 running. Unfortunately they are not mfaktc-ing all the time; sometimes they do real-life work... |
[QUOTE=Mark Rose;394295]Damn... I gotta look into overclocking my 580's.[/QUOTE]
This one is factory OC at 797 MHz. It can run at 844 solo, though as the top card in the case, it starts creeping past 85 C with both running. :no: EDIT: That is all at stock voltage. As the only card where heat was less of a problem, I had an Asus 580 which would run at 861 @ stock V, and 872 @ the smallest available voltage increment up. Unfortunately, it suffered some kind of damage during a major case reorganization. It will still boot into Windows, but falls apart if I try to run mfaktc on it. I'm now running an EVGA card with an aftermarket cooler. |
[QUOTE=kladner;394298]This one is factory OC at 797 MHz. It can run at 844 solo, though as the top card in the case, it starts creeping past 85 C with both running. :no:
EDIT: That is all at stock voltage. As the only card where heat was less of a problem, I had an Asus 580 which would run at 861 @ stock V, and 872 @ the smallest available voltage increment up. Unfortunately, it suffered some kind of damage during a major case reorganization. It will still boot into Windows, but falls apart if I try to run mfaktc on it. I'm now running an EVGA card with an aftermarket cooler.[/QUOTE] How do you get them so cold? :D My top GTX 580 at 772 MHz is running at 89C. My bottom GTX 580 at 797 MHz is running at 92C (factory OC). Both are shrouded EVGA devices. |
[QUOTE=Mark Rose;394299]How do you get them so cold? :D
My top GTX 580 at 772 MHz is running at 89C. My bottom GTX 580 at 797 MHz is running at 92C (factory OC). Both are shrouded EVGA devices.[/QUOTE] I've never had a shrouded card (assuming that means "single blower, rear exhaust"). I am a little surprised at those numbers. I have always assumed that, with less GPU heat dumped in the case, such cards have an advantage. (I also thought that Canadian room temperatures would keep those bad boys cool as cucumbers. :razz:)

I currently have two 140mm front intakes, and a 140mm side intake. I try to maintain positive pressure in the case, so a bit of hot air gets forced out directly from the cards. I have two 80mm Antec "spot fans" in the case. For a long time, both of those were set to blow air between the cards. Right now, I am experimenting with the spot fan which is right in line with the side intake blowing in, and the one nearer the front of the case drawing air out from between and throwing it upward. The exhausts are a 140mm top rear, two 120mm radiator fans blowing out the top, and the PSU. Consequently, I'm only dealing with GPU heat and some minor sources in the case. My present setup is the result of obsessive tweaking, especially of the spot fans. Positioning relative to the GPU fans makes a big difference.

Without knowing what your arrangement is, I would venture that forcing more air into the case, either from the front or the side panel, might help. Water cooling the CPU with the radiator fan(s) exhausting would reduce internal heat load, but you have to have enough intake flow to feed all those "suckers." Then too, water cooling plates for the GPUs turn up fairly often on Ebay. Even doing one of the cards could make a difference.

It is weird, to me, that the bottom card is hotter. That makes it seem that it may be starved for fresh air. I have tried for a long time to imagine an air source whose output matches the critical "between the cards" space. 
Given the limited space between the tops of the cards and the side panel, this would almost have to be a centrifugal blower with a customized duct shaped to the narrow opening. If I went that far, it would make the most sense to arrange the blower to draw outside air. |
[QUOTE=kladner;394317]I also thought that Canadian room temperatures would keep those bad boys cool as cucumbers. :razz:[/QUOTE]My pair of GTX 670s are running at 73°C / 75°C (the hotter one is factory-clocked a bit faster), if that makes you feel any better.
I'm also sitting here in a coat and toque, with a portable heater pointed at my keyboard to keep my fingers defrosted. :rajula: |
2 Attachment(s)
[QUOTE=kladner;394317]I've never had a shrouded card (assuming that means "single blower, rear exhaust"). I am a little surprised at those numbers. I have always assumed that, with less GPU heat dumped in the case, such cards have an advantage. (I also thought that Canadian room temperatures would keep those bad boys cool as cucumbers. :razz:)[/quote]
This is the office computer. The room temperature is about 22°C. Shrouded coolers make sense in theory, in that they dump air outside of the case, but the amount of air they move is less than that of two or three fan coolers. Plus they are noisy (centrifugal fans are kind of shaped like air raid sirens). [quote] I currently have two 140mm front intakes, and a 140mm side intake. I try to maintain positive pressure in the case, so a bit of hot air gets forced out directly from the cards. I have two 80mm Antec "spot fans" in the case. For a long time, both of those were set to blow air between the cards. Right now, I am experimenting with the spot fan which is right in line with the side intake blowing in, and the one nearer the front of the case drawing air out from between and throwing it upward. The exhausts are a 140mm top rear, two 120mm radiator fans blowing out the top, and the PSU. Consequently, I'm only dealing with GPU heat and some minor sources in the case. My present setup is the result of obsessive tweaking, especially of the spot fans. Positioning relative to the GPU fans makes a big difference. Without knowing what your arrangement is, I would venture that forcing more air into the case, either from the front or the side panel, might help. Water cooling the CPU with the radiator fan(s) exhausting would reduce internal heat load, but you have to have enough intake flow to feed all those "suckers." [/quote] I have a 140 mm exhaust fan at the top back, a 140 mm intake in front of the power supply, a 140 mm intake in front of the upper 3.5" bays, and a 140 mm intake on the door that blows approximately between the two cards (perhaps a little more on the top one). I've removed all the extra PCI covers to let air blow over the cards and to increase airflow through the case. The CPU, a 4770, runs 3 cores of mprime. 
The power supply vents out the back, but it also sheds about 150 watts of waste heat while delivering the 600 or so watts consumed by the system, and some of that radiates inside the case. Its top side is also cooled somewhat by the airflow of the door fan. The air coming out of the cards feels approximately the same speed as the air coming out of the PCI slots. [quote] Then too, water cooling plates for the GPUs turn up fairly often on Ebay. Even doing one of the cards could make a difference. [/quote] I looked into it when LaurV was threatening to give away two water cooled GTX 580's. The cost of the additional water cooling components would be almost as much as buying two new GTX 580's. [quote] It is weird, to me, that the bottom card is hotter. That makes it seem that it may be starved for fresh air. [/quote] It's clocked slightly higher and has hot components on both sides. The top card only has the bottom card to contend with. The CPU heatsink is relatively cool in comparison. Oh, and the bottom card is a PNY. I thought it was EVGA. I bought it and plugged it in and forgot about it :D |
Wow! That's a Fractal Design R4! That's what I have. I used to have the bottom intake until I got a monster PSU that was longer than I expected. I am hoping that I can turn my large drive bay back the way you have yours. I had to turn it sideways while I had the Asus 580 in residence. I think I would get more air through, even with all the drives in there.
|
[QUOTE=James Heinrich;394328]My pair of GTX 670s are running at 73°C / 75°C (the hotter one is factory-clocked a bit faster), if that makes you feel any better.
I'm also sitting here in a coat and toque, with a portable heater pointed at my keyboard to keep my fingers defrosted. :rajula:[/QUOTE] OW! This Texas Gulf Coast native would suffer under those conditions. Chicago stresses me enough as it is. |
[QUOTE=kladner;394391]Wow! That's a Fractal Design R4! That's what I have. I used to have the bottom intake until I got a monster PSU that was longer than I expected. I am hoping that I can turn my large drive bay back the way you have yours. I had to turn it sideways while I had the Asus 580 in residence. I think I would get more air through, even with all the drives in there.[/QUOTE]
Yep! And my desktop at home is encased in a white R4. Good case. Turning the drive cage and putting the front fan in the upper position helped tremendously with air flow. More than I expected it would. I had to move the rubber grommets in the sleds to the other position to get them to sit deep enough to not collide with the 580's. I'm not sure how well it would work if there were actually drives in the cage. At home I have the cage populated and turned in the "normal" position, but I've also only got a third of the heat to deal with. |
1 Attachment(s)
I think I have pretty much solved the issue of the GTX 570 slowing down 20-22 GHz-d/d from its starting speed. First, I experimented extensively with GPU Sieve Primes. While the 580 seems to do best at the default of 82485, the 570 proved to like 55605 better. 45365 is too low.
I have long run GPUSieveSize=128, and that turns out to be the best for both cards. The change which really reduced the speed drop was changing GPUSieveProcessSize from 8 to 16. The top speed fell off at 24. 16 also benefited the 580, but only by a few GHz-d/d. However, though throughput was improved, the temperature actually dropped a degree C or two. The upshot is that the cards are running such that their total throughput is now about 915 GHz-d/d instead of the ~860 I was getting before, while the temps are slightly better. The 850 is now running at 823 MHz at 81 C, and producing 545 GHz-d/d. The 570 at 872 MHz is putting out ~452 GHz-d/d and holding at 70 C. While the 570 is the bottom card, it is interesting that it is running 11 C cooler for essentially the same throughput. The attachment shows the 570 at startup with P95 fully engaged with two DCs and six P-1s, which include two or three doing Stage 2, with the assigned memory fully occupied. As I said before, P95 takes about 5-7 GHz-d/d off each GPU, but this is the best-tuned I have ever had the combination of GPUs and CPU. |
[QUOTE=kladner;394499]The change which really reduced the speed drop was changing GPUSieveProcessSize from 8 to 16. The top speed fell off at 24. 16 also benefited the 580, but only by a few GHz-d/d. However, though throughput was improved, the temperature actually dropped a degree C or two. The upshot is that the cards are running such that their total throughput is now about 915 GHz-d/d instead of the ~860 I was getting before, while the temps are slightly better. The 850 is now running at 823 MHz at 81 C, and producing 545 GHz-d/d. The 570 at 872 MHz is putting out ~452 GHz-d/d and holding at 70 C. While the 570 is the bottom card, it is interesting that it is running 11 C cooler for essentially the same throughput.[/QUOTE]
Have you had any success in lowering the temperatures by reducing the memory clock? I've not played around with the clocks in my graphics cards, but I'm getting tempted to install the updated Nvidia drivers for Linux that enable clock changing. I'm only getting 424 and 428 or so out of my GTX 580's. Seems I am leaving a lot on the table! |
[QUOTE=kladner;394499]The change which really reduced the speed drop was changing GPUSieveProcessSize from 8 to 16. The top speed fell off at 24. 16 also benefited the 580, but only by a few GHz-d/d. However, though throughput was improved, the temperature actually dropped a degree C or two.[/QUOTE]
Changing GPUSieveProcessSize from 8 to 16 slowed each half of my 690 down by 30 GHzD/D, for a total loss of 60 GHzD/D. I wish we had a configuration utility that would work through all the various settings and come up with the best combination. |
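No such tuner exists, but the sweep itself is trivial to script; the hard part is the timing run. A hypothetical sketch of the search loop (GPUSievePrimes values are ones mentioned in this thread, the GPUSieveSize values other than 128 are illustrative, and the benchmark step is only a placeholder — in practice you would rewrite mfaktc.ini with each combination and time a fixed test assignment):

```python
import itertools

# Candidate values: GPUSievePrimes figures come from this thread;
# GPUSieveSize 64/96 are illustrative additions alongside the 128 used here.
GPU_SIEVE_PRIMES = [45365, 55605, 82485]
GPU_SIEVE_SIZE = [64, 96, 128]
GPU_SIEVE_PROCESS_SIZE = [8, 16, 24]

def candidate_settings():
    """All combinations to benchmark: one mfaktc.ini rewrite + timed run each."""
    return list(itertools.product(GPU_SIEVE_PRIMES,
                                  GPU_SIEVE_SIZE,
                                  GPU_SIEVE_PROCESS_SIZE))

for primes, size, procsize in candidate_settings():
    # Placeholder: write the three values into mfaktc.ini, run a fixed
    # test assignment, record the GHz-d/d mfaktc reports, and keep the
    # fastest combination.
    pass
```

Since card behaviour differs (as the 580-vs-570 and 690 results above show), the sweep would have to be repeated per card.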
[QUOTE=Mark Rose;394506]Have you had any success in lowering the temperatures by reducing the memory clock? I've not played around with the clocks in my graphics cards, but I'm getting tempted to install the updated Nvidia drivers for Linux that enable clock changing. I'm only getting 424 and 428 or so out of my GTX 580's. Seems I am leaving a lot on the table![/QUOTE]
The memory is at 1598 MHz for both cards. I would have to experiment to see how much effect that is having. I have had that setting since hot weather. |
[QUOTE=kladner;394499][B]The 850 is now running at 823 MHz at 81 C, and producing [U]545[/U] GHz-d/d.[/B].[/QUOTE]
Oops. I think I transposed numbers. I have the.....wait a minute. I don't think I would have committed [I]two transpositions [/I]in the same sentence. The [B][U]580[/U][/B] is indeed running at 823 MHz ATM, and putting out [B][U]456[/U][/B] GHz-d/d. It is at 80 C. Until I changed it for this test, it was running at 844 MHz, mfaktc reporting 469 GHz-d/d, at 81 C. :smile: |
[QUOTE=Mark Rose;394299][B]How do you get them so cold? :D[/B]
My top GTX 580 at 772 MHz is running at 89C. My bottom GTX 580 at 797 MHz is running at 92C (factory OC). Both are shrouded EVGA devices.[/QUOTE] Looking at your case photos and reading your descriptions, I can say that one big factor is that I have more powerful case fans. I have both front fan slots filled with [URL="http://www.newegg.com/Product/Product.aspx?Item=N82E16835553007"][B]Cougar[/B] [/URL]1200 rpm fans, which are relatively subdued but put out a lot. They have a lot of the aerodynamic bells and whistles, and built-in anti-vibration mounts. These are kind of my basic minimum intake fans. I have them hooked up to the built-in fan controller. But before that, I would say to put something even stronger in the side port. By this I mean something like [URL="http://www.newegg.com/Product/Product.aspx?Item=N82E16835608044"]Noctua[/URL] or [URL="http://www.newegg.com/Product/Product.aspx?Item=N82E16835132023"]BGears[/URL].[URL="http://www.newegg.com/Product/Product.aspx?Item=N82E16835608044"] [/URL] You may think this will add a lot of noise. I would argue that it will be less offensive noise than your centrifugal sirens. Also, this (side) fan would not be running flat out, unless you wanted it to. Adding more power to the bottom intake couldn't hurt, either. At this point you would have so much air coming in that the case would be pressurized, and perhaps feed the GPUs more forcefully. |
[QUOTE=kladner;394681]Looking at your case photos and reading your descriptions, I can say that one big factor is that I have more powerful case fans. I have both front fan slots filled with [URL="http://www.newegg.com/Product/Product.aspx?Item=N82E16835553007"][B]Cougar[/B] [/URL]1200 rpm fans, which are relatively subdued but put out a lot. They have a lot of the aerodynamic bells and whistles, and built-in anti-vibration mounts. These are kind of my basic minimum intake fans. I have them hooked up to the built-in fan controller.
But before that, I would say to put something even stronger in the side port. By this I mean something like [URL="http://www.newegg.com/Product/Product.aspx?Item=N82E16835608044"]Noctua[/URL] or [URL="http://www.newegg.com/Product/Product.aspx?Item=N82E16835132023"]BGears[/URL].[URL="http://www.newegg.com/Product/Product.aspx?Item=N82E16835608044"] [/URL] You may think this will add a lot of noise. I would argue that it will be less offensive noise than your centrifugal sirens. Also, this (side) fan would not be running flat out, unless you wanted it to. Adding more power to the bottom intake couldn't hurt, either. At this point you would have so much air coming in that the case would be pressurized, and perhaps feed the GPUs more forcefully.[/QUOTE] The four case fans are [url=http://www.fractal-design.com/home/product/case-fans/silent-series-r2-140mm]Fractal Design Silent Series R2 140mm[/url], pushing up to 66 CFM, at under 19 dB. With three intakes, that's about 200 CFM. Come to think of it, I could get more air flow by removing the dust filters, which actually capture quite a bit. The R4 case shields most of the centrifugal sirens' noise. I would say over half the subjective sound is the rushing of air in general. The most noticeable fan now is the power supply fan, which buzzes a little. |
Okay can the title changes *please* stay in the Soap Box or otherwise near the bottom of the front page list? This is a bit ridiculous.
|
[QUOTE=Dubslow;394769]Okay can the title changes *please* stay in the Soap Box or otherwise near the bottom of the front page list? This is a bit ridiculous.[/QUOTE]
+1 ! |
[QUOTE=Mark Rose;394740]The four case fans are [URL="http://www.fractal-design.com/home/product/case-fans/silent-series-r2-140mm"]Fractal Design Silent Series R2 140mm[/URL], pushing up to 66 CFM, at under 19 dB. With three intakes, that's about 200 CFM. Come to think of it, I could get more air flow by removing the dust filters, which actually capture quite a bit.
The R4 case shields most of the centrifugal sirens' noise. I would say over half the subjective sound is the rushing of air in general. The most noticeable fan now is the power supply fan, which buzzes a little.[/QUOTE] OK. I was putting too much emphasis on the blower noise. It may be that the shrouded design just runs hotter than the dump-it-inside kind, though I have to push a lot of air to keep the kind of cards I have happy. I am also fairly certain that my decreased hearing is shielding me from some of that rushing sound. My partner says I'm not happy with a computer unless I can hear the fans. However, I do keep the GPU temps down. |
[QUOTE=Dubslow;394769]Okay can the title changes *please* stay in the Soap Box or otherwise near the bottom of the front page list? This is a bit ridiculous.[/QUOTE]
+1. To be fair, I WAS secretly hoping to see trivial division, but I agree that this stuff is meant to be serious. Some people legitimately looking for information might not know to look here when the title is "trivial derision with ZULU (yes, in fact)". |
I guess I have become inured to the shifting tides. I had not even noticed the last couple of twists. I guess I just recognize the general shape or rhythm of the title. All that aside, I [I]strongly[/I] agree that it would be better to leave the more technical sections and threads with truly descriptive titles.
|
I'd be careful about demanding stability in the thread title -- we're liable to end up with something like, "Troll division." (AFAIK, that could have happened already somewhere along the way...)
As a prior victim of the forum gods in this respect, I know that they can become particularly, uh, [I]playful[/I] and [I]mischievous[/I] if you ask them to behave otherwise. :smile: Rodrigo |
[QUOTE=Rodrigo;394881]I'd be careful about demanding stability in the thread title -- we're liable to end up with something like, "Troll division." (AFAIK, that could have happened already somewhere along the way...)
As a prior victim of the forum gods in this respect, I know that they can become particularly, uh, [I]playful[/I] and [I]mischievous[/I] if you ask them to behave otherwise. :smile: Rodrigo[/QUOTE] Yes. Sometimes we suppose that [URL="http://en.wikipedia.org/wiki/The_Gods_Must_Be_Crazy"]"The Gods Must Be Crazy."[/URL] |
...continued: A Tale of Two GPUs
I am carrying this over from its misplaced location [URL="http://www.mersenneforum.org/showpost.php?p=395017&postcount=50"]here.[/URL] Well, the saga of running two GPUs takes another turn. Yesterday, I extensively rerouted power cables to clear obstacles from the airflow. In the course of this, I disconnected the PCIe cables from the GTX 570, the one with power connector troubles. Unfortunately, the bare pins, without the plastic surround, are tricky to line up with the cable connector. This translates to the unavoidable application of some force to the pins on the board, and predictably, things did not go well for some of the solder joints.
When I finally got things back together and started the machine, only the MSI GTX 580 appeared in BIOS or Windows. This necessitated pulling the 570, and installing the EVGA-based 580, which has an aftermarket Zalman cooler. I had been avoiding this card because it is triple slot, and because the cooler is marginal, at best. At least, the MSI has always been the coolest thing in the case, regardless of which position it is in. The EVGA has always been the hottest. ATM, I am running at the highest throughput I have ever achieved: 990 combined GHz-d/d. This comes from the EVGA at 891 MHz and 85 C, with the MSI at 900 MHz and [I]73[/I] C. I don't think I could have gotten the EVGA to that level and stayed in the mid 80s, if I had not realized that it defaults to the highest voltage I have ever seen on any GTX card: 1.088 V. The next step up is 1.100 V, which did not bode well for controlling temps. However, it turned out to run fine at 900 and default voltage. I may still try trimming the voltage to see how it does. When I noticed that the EVGA was running a bit higher throughput than the MSI, I slowed the EVGA to 891 MHz. This, plus some creative fan arranging, got it pegged at 85 C, or maybe a hair less if the heat is down in the apartment. I am still not really happy with the Zalman cooler. My last hope is that whoever installed it did a lousy job of heatsink compound application. I now aim to pull the cooler, deep clean the surfaces, and apply my own HSC. I have Arctic Silver cleaners, 91% isopropyl alcohol, and acetone for removal, though the first two should really take care of it. On the application side, I have a choice between Arctic Silver 5, ShinEtsu X23-7783D, and Tuniq TX-4. I am leaning toward using the last of these. It will be interesting and gratifying if I can improve the performance of this supposedly decent cooler. It is pretty disappointing as things stand. :sad: |
[url]https://www.youtube.com/watch?v=Iegpwo9SqSg[/url]
|
[QUOTE=blip;395365][URL]https://www.youtube.com/watch?v=Iegpwo9SqSg[/URL][/QUOTE]
Wow! Just WOW! I thought I was into improvisation. I am utterly humbled by this guy. The cooler seems to be a tad bit of overkill, but WTH. Quite an interesting exercise. |
It would work better if the heatsink weren't upside down, too.
|
[QUOTE]My last hope is that whoever installed it did a lousy job of heatsink compound application.[/QUOTE]This turned out to be true. All cleaned up, and with a proper thin layer of TX-4, the EVGA has stabilized at 76 C, versus 73 C for the MSI. w00t! That is really decent for the card on top. :smile:
EDIT: I dropped the voltage from 1.088 to the next step of 1.075. It remains to be seen if this is stable, but it did produce an immediate change from 76 to 75 C. :grin: EDIT2: It does seem that Zalman could have sprung for PWM fans and the connectors to plug them into the board's fan controller. Slightly more powerful fans would not have hurt, either. However, things are much better than first appeared. |
1 Attachment(s)
The attached shows the latest condition of the two cards. GPU1 is the EVGA, GPU2 is the MSI. Since the last post I have lowered the voltage on GPU1 another 'tick' to 1.063. Besides that, the room temperature is down from 22-23 C during the day, to about 18 C now. (It is currently a balmy 13 F outside, forecast to go to 9 F. This 1920s apartment building is not terribly well insulated. You can feel cold air washing down the walls which are on the periphery.)
Consequently, the cooler temps are partly from cooler air feeding in. However, it is more telling that the upper and lower cards are running at almost the same clocks, and are holding the same temperature. I'm surprised, after its initially disappointing cooling performance, that I've gotten the EVGA/Zalman combination down this far, especially considering the relatively high voltage it is still running at. So the Zalman turns out to be a fairly decent cooler, if it is properly interfaced to the chip. I still wish that it had automated fan control instead of the cheesy cable running out the back of the case to a little manual controller. I may try inserting a temperature sensor in the hottest part of the heatsink, so I can put it under motherboard control. |
Could a kinder mod please restore the title of this thread to "Trial division with CUDA (mfaktc)", and would all mods please stop changing thread titles of threads whose forum is displayed above the Soap Box on the default front page?
|
[QUOTE=Dubslow;395397]Could a kinder mod please restore the title of this thread to "Trial division with CUDA (mfaktc)", and would all mods please stop changing thread titles of threads whose forum is displayed above the Soap Box on the default front page?[/QUOTE]
Seconding the request. I understand it's part of the forum culture, but important threads should not be renamed without a good reason (such as making the title more descriptive). Otherwise, it could easily cause problems for people searching for serious threads. At the very least, the original title should be part of the new one to avoid confusion: [QUOTE]ChemTrails division with CUDA (mfaktc) [was: Trial division with CUDA (mfaktc)][/QUOTE] To be honest, the frivolous title changes stopped being funny for me a long time ago. |
Some of them, especially "down south", can be quite funny. :smile: Anyways, I guess the current title is "good enough". It's more accurate but less historical.
|
[QUOTE=ixfd64;395445]ChemTrails division with CUDA (mfaktc) [was: Trial division with CUDA (mfaktc)].[/QUOTE]
It wasn't. It was "Trail division with CUDA (mfaktc)" for many months, if not years. |
good news, everyone!
So, a new topic for this old thread: here is a new version of mfaktc. Evolution, not revolution!
Highlights (for full changes check Changelog.txt):[LIST][*]added support for Wagstaff numbers: (2[SUP]p[/SUP] + 1)/3[*]added support for "worktodo.add"[*]enabled GPU sieving on CC 1.x GPUs[*]dropped lower limit for exponents from 1,000,000 to 100,000[/LIST] Special thanks go to Jerry Hallett, who spent a lot of time building and testing Windows binaries, as well as providing some hints for Windows compatibility. [B]Thank you Jerry![/B] I guess it is a little bit faster than 0.20 for the current Primenet wavefront on most, if not all, cards. [B]Please post your mileage.[/B] As usual: finish your current assignment with your current version of mfaktc; mfaktc will refuse checkpoint files from versions other than itself... yes I know - I'm paranoid in this case! So I know what you want now: here are the files:[LIST][*]source code: [url]http://www.mersenneforum.org/mfaktc/mfaktc-0.21/mfaktc-0.21.tar.gz[/url][*]basic Windows package ([I]regular[/I] 32- and 64-bit mfaktc) - CUDA 6.5: [url]http://www.mersenneforum.org/mfaktc/mfaktc-0.21/mfaktc-0.21.win.cuda65.zip[/url][*]for those who (think they) need them - additional binaries for Windows - CUDA 6.5: [url]http://www.mersenneforum.org/mfaktc/mfaktc-0.21/mfaktc-0.21.win.cuda65.extra-versions.zip[/url] Download the basic package and then ADD these binaries to the folder. Includes [I]LessClasses[/I] builds and binaries for [I]Wagstaff[/I] numbers[*]Linux binaries - CUDA 6.5: [url]http://www.mersenneforum.org/mfaktc/mfaktc-0.21/mfaktc-0.21.linux64.cuda65.tar.gz[/url] compiled on openSUSE 12.2[/LIST] What's next (read: mfaktc 0.22)?[LIST][*]remove support for CC 1.x GPUs; Nvidia drops support for CC 1.x in CUDA 7.0, so I'll do so, too! Will remove lots of [CODE]#if (__CUDA_ARCH__ >= FERMI)[/CODE] in the code.[*]remove the 71-bit kernel, which was only chosen for CC 1.x GPUs; less code to manage![*]remove support for older CUDA versions; will remove lots of [CODE]#if (CUDART_VERSION >= ????)[/CODE] in the code. 
I'm not sure which version I'll require as the baseline; perhaps I'll go as high as CUDA 6.0 or even 6.5... unsure here.[/LIST] So 0.22 will help coders, not users (directly)... sorry! Oliver |
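[Editor's note] The "worktodo.add" feature listed above follows the convention familiar from other GIMPS clients: new assignments can be dropped into a worktodo.add file while the program is running, and they get appended to worktodo.txt. mfaktc itself is written in C; purely as an illustration of the idea (the helper name and details below are mine, not mfaktc's), the merge step amounts to:

```python
import os

def merge_worktodo_add(worktodo="worktodo.txt", add_file="worktodo.add"):
    """Append pending assignments from the .add file to the main
    worktodo file, then delete the .add file. Sketch only; the real
    mfaktc implementation may behave differently in detail."""
    if not os.path.exists(add_file):
        return 0
    with open(add_file) as f:
        lines = [ln for ln in f if ln.strip()]  # skip blank lines
    with open(worktodo, "a") as f:
        f.writelines(lines)
    os.remove(add_file)
    return len(lines)
```

The point of the separate file is that external scripts only ever append to the .add file and never touch the file the client is actively reading.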
Fantastic! Great work.
|
[QUOTE=TheJudger;395689]for those who (think they) need them - additional binaries for Windows - CUDA 6.5... Includes [I]LessClasses[/I] builds[/QUOTE]Thank you! :max::bow::w00t::banana:
I can now run 1 instance per GPU instead of 4 per GPU (on >1000M TF to 64) Actually on my GTX 580 I'm running 2 instances for now since each assignment takes 0.5s, but in truth the throughput isn't much different despite filling my GPU to 100% instead of 70%. I'll probably go back to 1/GPU when I plug a new 570 into that system later this week. [QUOTE=TheJudger;395689]remove support for CC 1.x GPUs remove support for older CUDA versions[/QUOTE]I'm totally fine with that. I'd be perfectly satisfied if you called 0.21 the last release for old CUDA and GPUs, and made 0.22+ for CUDA 7.0 only. Would probably greatly simplify your code and it's not like those who are stuck on old GPU and/or CUDA can't use mfaktc anymore, just no higher than 0.21 |
Does this mean we can expect mfaktc 1.0 in the not-so-distant future?
|
For anyone compiling on Ubuntu 14.04 with the packaged CUDA 5.5 provided by Canonical, I had to make these changes:
[CODE]CUDA_DIR = /usr/
CUDA_LIB = -L$(CUDA_DIR)/lib/x86_64-linux-gnu/[/CODE] And because I don't have a Maxwell, and this was failing with CUDA 5.5, I commented out: [CODE]#NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code[/CODE] Anyway, the new version sped up my cards by 1%:
GTX 580: 424 -> 430
GTX 580: 428 -> 433
GTX 760: 259 -> 261
on DCTF 41M 70->71 work. I haven't checked the difference on the GT 430's and GT 520's. Excellent! |
[QUOTE=ixfd64;395699]Does this mean we can expect mfaktc 1.0 in the not-so-distant future?[/QUOTE]
*hmmmm* [B]NO![/B] What's so special about version 1.0? Oliver P.S. check README.txt, chapter "7 FAQ" for version numbers :smile: |
Got a 1% boost on a GT 430, too.
I tried tweaking the GPU* configuration parameters to see if their optimal values changed, but they seem to be the same. Also, I noticed this is still in the mfaktc.ini file, at least in the Linux source download: [CODE]# GPU sieving is supported on GPUs with compute capability 2.0 or higher.
# (e.g. Geforce 400 series or newer)[/CODE] |
Thank you!
Installed on Win7 64-bit with a GTX 760; both selftests passed. ~2% more output (from 266 -> 271). (I changed 2 settings in the ini file: GPUSieveSize to the max of 128 and GPUSieveProcessSize to the min of 8, for ~10% more performance with my GTX 760.) greetings Matthias |
[QUOTE=TheJudger;395689]Special thanks go to Jerry Hallett, who spent a lot of time building and testing Windows binaries, as well as providing some hints for Windows compatibility. [B]Thank you Jerry![/B][/QUOTE]
You are welcome, my pleasure! Awesome work Oliver! |
[QUOTE=MatWur-S530113;395705]Thank you!
Installed on Win7 64-bit with a GTX 760; both selftests passed. ~2% more output (from 266 -> 271). (I changed 2 settings in the ini file: GPUSieveSize to the max of 128 and GPUSieveProcessSize to the min of 8, for ~10% more performance with my GTX 760.) greetings Matthias[/QUOTE] I've found that my GTX 760 also benefits from a slightly higher GPUSievePrimes value than default. I have mine set to 100000. |
[QUOTE=Mark Rose;395707]I've found that my GTX 760 also benefits from a slightly higher GPUSievePrimes value than default. I have mine set to 100000.[/QUOTE]
I quickly tested some values for GPUSievePrimes (k*10000 for k=5..13, only a few classes per test), but it seems that for my card the default value of 82486 is indeed the best. Between 70000 and 100000 there is no measurable difference; larger deviations from the default value result in worse performance (though still not remarkable, maybe 1%). Very large and very low values for GPUSievePrimes slow down my card by up to 50%. |
[QUOTE=MatWur-S530113;395712]I quickly tested some values for GPUSievePrimes (k*10000 for k=5..13, only a few classes per test),
but it seems that for my card the default value of 82486 is indeed the best. Between 70000 and 100000 there is no measurable difference; larger deviations from the default value result in worse performance (though still not remarkable, maybe 1%). Very large and very low values for GPUSievePrimes slow down my card by up to 50%.[/QUOTE] I'll fiddle with mine again to verify it's better than the default. |
[QUOTE=Mark Rose;395707]I've found that my GTX 760 also benefits from a slightly higher GPUSievePrimes value than default. I have mine set to 100000.[/QUOTE]
I've never really fiddled with the options but I believe you've done a bit more. Which options do I want to be messing around with to try to increase throughput? |
[QUOTE=TheMawn;395717]I've never really fiddled with the options but I believe you've done a bit more.
Which options do I want to be messing around with to try to increase throughput?[/QUOTE] GPUSieveProcessSize GPUSieveSize GPUSievePrimes I would do them in that order. |
[QUOTE=Mark Rose;395719]GPUSieveProcessSize
GPUSieveSize GPUSievePrimes I would do them in that order.[/QUOTE] Do they need to be bigger or smaller, or does it tend to be a bit of both? |
[QUOTE=TheMawn;395721]Do they need to be bigger or smaller, or does it tend to be a bit of both?[/QUOTE]
Depends on the card. |
[QUOTE=TheMawn;395721]Do they need to be bigger or smaller, or does it tend to be a bit of both?[/QUOTE]
(for maximum performance, in my experience) GPUSieveProcessSize: (probably) bigger GPUSieveSize: (probably) bigger GPUSievePrimes: just right Experiment to find the best values for your GPU, Mersenne number size, and bit depth. If you use your GPU to drive a display, responsiveness becomes a concern, too, and dialing back the first two values may be necessary. I use [URL="https://github.com/Mini-Geek/mfaktx-controller"]a tool I wrote[/URL] to automatically switch the GPUSieveSize between a few values to trade off responsiveness and speed, based on what I'm doing (when idle, 128; common usage, 32; most games, 8; some games, stopped). |
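[Editor's note] The switching tool described above relies on the fact that mfaktc reads its settings from mfaktc.ini. As a rough, hypothetical sketch in Python (GPUSieveSize, GPUSieveProcessSize, and GPUSievePrimes are real mfaktc.ini keys, but this helper is mine, and the running instance must still be restarted or otherwise made to pick up the change), the ini-rewriting step of such a tuning or switching script could look like:

```python
import re

def set_ini_value(text, key, value):
    """Return the ini file text with `key=value` replaced in place
    (or appended if the key is missing). Sketch for experimenting
    with mfaktc.ini options such as GPUSieveSize."""
    pattern = re.compile(r"^\s*%s\s*=.*$" % re.escape(key), re.MULTILINE)
    replacement = "%s=%s" % (key, value)
    if pattern.search(text):
        return pattern.sub(replacement, text)
    return text.rstrip("\n") + "\n" + replacement + "\n"
```

A wrapper can then time a fixed amount of work for each candidate value and keep the fastest setting.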
1 Attachment(s)
Thanks Oliver; I'm getting a welcome 1% increase in performance.
GTX 690:
GPUSieveProcessSize=8
GPUSieveSize=128
GPUSievePrimes=107500 |
Now, looks like mfaktc has been immortalized by Andy Warhol! ;-)
|
[QUOTE=Mark Rose;395704]Also, I noticed this is still in the mfaktc.ini file at least in the Linux source download:
# GPU sieving is supported on GPUs with compute capability 2.0 or higher. # (e.g. Geforce 400 series or newer)[/QUOTE] Good catch, this is a bug somehow! I would say a minor bug so don't expect a 0.21p1 just because of this. Oliver |
Renaming of the worktodo.txt file disappeared from the ini file. Is it gone from the features too? Something to do with the new ".add" feature? It would be a pity, because some of us use different worktodo files, changing the ini (possibly with a batch file) when we want the card only partially busy, etc.
|
Hi,
yes, worktodo.txt is now static, just like all the other files mfaktc.exe accesses. The last part of the first sentence is the reason why I did so. Can you simply rename the different worktodo files with the logic which renames the mfaktc.ini file? Oliver |
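[Editor's note] The renaming approach Oliver suggests amounts to swapping the desired assignment file into place as worktodo.txt before (re)starting mfaktc. A hypothetical sketch in Python (filenames are illustrative; a Windows batch file using ren would do the same job):

```python
import os

def activate_worktodo(variant, worktodo="worktodo.txt"):
    """Swap a variant assignment file (e.g. a 'partial load' list)
    into place as worktodo.txt, keeping the previous one as a .bak.
    Hypothetical helper; run only while mfaktc is stopped."""
    backup = worktodo + ".bak"
    if os.path.exists(worktodo):
        os.replace(worktodo, backup)
    os.replace(variant, worktodo)
```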
Yes, we can live with it.
Just installed it and it is faster for me too: about 1% faster on a Titan and on the 580 which paints the screens, and about 2% faster on the other 580's, all with the default GPUSievePrimes (the 82k and some). Didn't tune yet, 2:00 AM here... maybe during the weekend. Good job! :tu: |
1 Attachment(s)
Uh-oh: Symantec's Norton Power Eraser thinks that MFAKTC is bad:
I'm not worried, but I wonder what might be leading NPE to consider it "untrustworthy." Rodrigo |
[QUOTE=Rodrigo;395918]Uh-oh: Symantec's Norton Power Eraser thinks that MFAKTC is bad:
I'm not worried, but I wonder what might be leading NPE to consider it "untrustworthy." Rodrigo[/QUOTE] Norton Internet Security gave the same complaint. I believe that such flags are based on the application being unknown in the Norton Community database. There are no direct heuristics indicating malware aside from the file having very restricted distribution. |
I just tried 0.21 on a small exponent (~500K) from 61 to 62 bits. To my surprise, the usage of the CPU was virtually 0%, as if the sieving was being entirely performed on the GPU. The usage of the GPU was consistently at 99%, which is also in line with sieving being performed on it. According to the post by Oliver presenting 0.21 to the community, I thought that feature (GPU sieving for bit levels lower than 64 bits) was to be implemented in version 0.22 only. Is that so, or did something change the plans?
[I]EDIT: I just went back and couldn't find the reference to implementing sieving for < 64 bits in ver 0.22 anymore. So probably it was already foreseen as a 0.21 feature and the reference was edited out of the post by the OP or some watchful moderator... [/I] All in all, another great improvement. Thanks a lot! |
Hi,
there is GPU sieving below 2[SUP]64[/SUP] in mfaktc 0.21, BUT this is not really a [U]fast[/U] kernel. It should be the [I]75bit_mul32_gs[/I] kernel. I guess you'll see a nice speed increase when going above 2[SUP]64[/SUP], when it switches to a Barrett-based kernel. If it is a modern GPU, I *guess* you'll see 80-95% more throughput when going above 2[SUP]64[/SUP]. To be honest: I won't focus much on performance below 2[SUP]64[/SUP]. Oliver |
[QUOTE=TheJudger;395930]
To be honest: I won't focus much on performance below 2[SUP]64[/SUP]. Oliver[/QUOTE] Fair enough, I appreciate that. It´s already a good thing 0.21 allows <1M exponents and the sieving for lower bit levels, even if the performance is not really tuned. In fact, I have also used 0.21 for higher bit levels, and the throughput is way better. To give an idea: (GTX 560Ti, GPU@900MHz) 500 k , 61-62 bits: 130 GHz-d/d 500 k, 62-63 bits: 205 GHZ-d/d 500 k, 64-65 bits: 357 GHz d/d I´ll probably run some more benchies and keep you posted. |
[QUOTE=lycorn;395932]Fair enough, I appreciate that. It´s already a good thing 0.21 allows <1M exponents and the sieving for lower bit levels, even if the performance is not really tuned. In fact, I have also used 0.21 for higher bit levels, and the throughput is way better.
To give an idea (GTX 560 Ti, GPU @ 900 MHz):
500k, 61-62 bits: 130 GHz-d/d
500k, 62-63 bits: 205 GHz-d/d
500k, 64-65 bits: 357 GHz-d/d
I'll probably run some more benchies and keep you posted.[/QUOTE] I would go for[LIST][*]2[SUP]?[/SUP] to 2[SUP]64[/SUP] in a single step[*]2[SUP]64[/SUP] to 2[SUP]something above 64[/SUP] in the second step.[*]single bit levels if the time per class is at least 1 second or so.[/LIST]For below 2[SUP]64[/SUP] it might be worth trying the LessClasses version. Oliver |
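[Editor's note] Oliver's suggested split can be turned into worktodo entries mechanically. A hypothetical sketch (the Factor= line format is the one quoted earlier in this thread; the helper itself is mine, and it takes a simplified reading of the advice: one entry up to 2^64, then one bit level at a time above it):

```python
def plan_tf_entries(exponent, start_bits, target_bits):
    """Generate Factor= worktodo lines: a single entry from the
    starting depth up to 2^64 (where the slower sub-64-bit kernel
    runs), then single-bit entries above 2^64. Sketch only."""
    entries = []
    lo = start_bits
    if lo < 64 < target_bits:
        entries.append("Factor=N/A,%d,%d,64" % (exponent, lo))
        lo = 64
    while lo < target_bits:
        entries.append("Factor=N/A,%d,%d,%d" % (exponent, lo, lo + 1))
        lo += 1
    return entries
```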
Does someone have a clever workaround for [URL="https://connect.microsoft.com/VisualStudio/feedback/details/794991/c-error-directive-and-unix-line-endings-leads-to-an-unexpected-end-of-file"]this[/URL]?
As I do development on Linux I'll stay with UNIX line endings! Oliver |
[QUOTE=TheJudger;395936]Does someone have a clever workaround for [URL="https://connect.microsoft.com/VisualStudio/feedback/details/794991/c-error-directive-and-unix-line-endings-leads-to-an-unexpected-end-of-file"]this[/URL]?
As I do development on Linux I'll stay with UNIX line endings! [/QUOTE] Source control systems (such as GIT) often have an option to fix/change line endings on check-in, check-out automatically. |