mersenneforum.org  

Old 2011-05-02, 15:32   #815
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

3·5·227 Posts
Default

Quote:
Originally Posted by mdettweiler View Post
might adjusting SievePrimes help reduce the bottleneck?
Just let SievePrimes self-adjust to achieve minimal bottleneck.

If you can't get >90% GPU usage with a single instance, start up a second instance of mfaktc and you should get close to full GPU usage, with CPU load spread across multiple cores (and with the additional CPU power, SievePrimes should self-adjust higher than with a single instance, again giving overall better throughput).
Old 2011-05-02, 15:40   #816
Ungelovende
 
May 2008
Åsane, Bergen, Norway

3·5 Posts
Default

Quote:
Originally Posted by Xyzzy View Post
Is it difficult (heat, physical space, power, 16x slots) to get them in the case?
This is my setup:
Antec Performance One Series P193
MSI P67A-GD55 (B3) LGA 1155 Intel P67
Cheap cpu-cooler
ZOTAC ZT-50201-10P GeForce GTX 570
ASUS ENGTX570/2DI/1280MD5 GeForce GTX 570
Chieftec Nitro Series BPS-1200 1200W PSU
+ cheap RAM
+ Sandy 2600k

The temperatures:
http://img339.imageshack.us/i/tempsandy.png/

Difficult? I needed a beer and a screwdriver

Quote:
Is running them SLI an option?
Not sure. This computer is not used for gaming - crunching only. Never tested SLI.

Quote:
The CPU or the GPU?
The CPU (i7-920) is too slow to feed two GTX 570s.
Attached thumbnail: sandy.JPG
Old 2011-05-02, 15:51   #817
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

3×2,083 Posts
Default

Quote:
Originally Posted by James Heinrich View Post
Just let SievePrimes self-adjust to achieve minimal bottleneck.

If you can't get >90% GPU usage with a single instance, start up a second instance of mfaktc and you should get close to full GPU usage, with CPU load spread across multiple cores (and with the additional CPU power, SievePrimes should self-adjust higher than with a single instance, again giving overall better throughput).
Aha, so that's how it works. That would explain why it takes a little while upon initial startup each time for M/s to peak.

Looking at the mfaktc console output, I see it is actually using SievePrimes=5000 at optimal speed. I wasn't paying attention to that figure on the console earlier, but presumably it was trying different values when the speed was initially lower.

Thanks!
Old 2011-05-02, 15:55   #818
TheJudger
 
"Oliver"
Mar 2005
Germany

2127₈ Posts
Default

Hi all!

Quote:
Originally Posted by nucleon View Post
Agreed - in time mfaktc needs to get automated submit/get work. Or better yet - integrated into prime95. Prime95 handles the submit/get work, and has a generic option to run gpu code as separate process.

But Oliver is but one person. I think we can handle the inconvenience of manual submit/get work for the moment while Oliver works on his priorities. Getting version 0.17 out and getting the code optimal/valid.
My priorities? Currently mfaktc has a low priority...
0.17 coding should be finished but I want to edit the README a little bit.


Quote:
Originally Posted by Xyzzy View Post
Random, late night questions:
2 - Given most of us run 2 instances, would an i3 be enough? One of them has the clockspeed and HT. Does HT make a difference? Does reduced L3 cache make a difference?
i3 should be OK for a single 570.

Quote:
Originally Posted by Xyzzy View Post
3 - Can you run 2 GPU cards and use them both? Is it SLI or separate?
Yep, multiple GPUs have been supported for a long time. But you can't use multiple GPUs with one instance. Each GPU needs at least one instance of mfaktc.
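
A minimal sketch of the one-instance-per-GPU rule, assuming that `-d <device number>` is mfaktc's device-selection switch and that the binary is called `mfaktc.exe` (check the README of your version; both are assumptions here, as is the helper name `per_gpu_commands`). It only builds the command lines and does not launch anything. Each instance would presumably also need its own working directory so the work files don't collide.

```python
from typing import List


def per_gpu_commands(num_gpus: int, exe: str = "./mfaktc.exe") -> List[List[str]]:
    """Build one command line per GPU; nothing is executed here."""
    # "-d <device number>" is assumed to be mfaktc's device-selection switch.
    return [[exe, "-d", str(device)] for device in range(num_gpus)]


# Two cards -> two separate instances, one bound to each device number.
for cmd in per_gpu_commands(2):
    print(" ".join(cmd))
```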

Quote:
Originally Posted by Xyzzy View Post
4 - How much more productive would a 580 be over a 570, noting it is ~$150 more?
Comparing 2 GPUs of the same compute capability is very easy. Relative raw GPU performance: (512 cores * 1544 MHz) / (480 cores * 1464 MHz) = ~1.125
So the raw performance of a GTX 580 is equivalent to about 1.125 GTX 570s. Usually this means a little bit less sieving on the CPU, so I would expect something like 10% more throughput.
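
The same arithmetic as a small Python sketch, using only the core counts and shader clocks quoted above. The function name `raw_throughput` is made up for illustration, and "cores times clock" is just the rough proxy used in this post, not a benchmark.

```python
def raw_throughput(cores: int, shader_mhz: float) -> float:
    """Rough proxy for raw GPU throughput: core count times shader clock (MHz)."""
    return cores * shader_mhz


# Figures quoted in the post: GTX 580 = 512 cores @ 1544 MHz, GTX 570 = 480 cores @ 1464 MHz.
gtx580 = raw_throughput(512, 1544)
gtx570 = raw_throughput(480, 1464)
ratio = gtx580 / gtx570

print(f"GTX 580 / GTX 570 raw ratio: {ratio:.3f}")  # prints 1.125
```

This only holds for cards of the same compute capability, as noted above; the expected real-world gain (about 10%) is lower because the CPU sieving does not scale with the GPU.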

Quote:
Originally Posted by Xyzzy View Post
Our 4 non-GPU quads will be retired within the next month.
Are they worth running LL and/or P-1 on CPU?


Quote:
Originally Posted by mdettweiler View Post
Hi guys,

Recently, I've been playing around a bit with mfaktc on a GTX 460 (768MB) running on a system with a stock Q6600 CPU. I'm currently taking M332228447 from 75 to 81 bits (79>80 in progress right now). With the default SievePrimes=25000, I'm getting speeds of about 109M/s. The CPU appears to be the bottleneck, with mfaktc using 100% of one core and 82% of the GPU. Does this sound like what I should be optimally getting on this GPU/CPU/exponent combination, or might adjusting SievePrimes help reduce the bottleneck?

Also: I've noticed that whenever I have to stop and restart mfaktc, when it first comes back it starts at only ~75M/s, and takes about 1-2 hours to work its way back up to the usual ~109M/s. Is this normal behavior? Does anyone know why it does this?
Yep, a single core of your CPU is the bottleneck.
The behavior you've noticed is the adjustment of SievePrimes. Each time you restart mfaktc, SievePrimes starts at 25000 (default configuration). On your setup I assume it goes down to 5000 after a while. You can either add another core (start a second instance of mfaktc working on another exponent) or set SievePrimes to 5000 in mfaktc.ini to avoid this behavior (but you will still be CPU limited).
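
For anyone who wants to script that mfaktc.ini change, here is a minimal Python sketch. It assumes the file uses plain `Key=Value` lines with `#` comments; the helper name `set_ini_value` and the sample text are hypothetical and not part of mfaktc.

```python
import re


def set_ini_value(ini_text: str, key: str, value: str) -> str:
    """Return ini_text with `key=value` set, replacing the first existing line or appending one."""
    # Match an uncommented "key = ..." line anywhere in the text.
    pattern = re.compile(rf"^{re.escape(key)}\s*=.*$", flags=re.MULTILINE)
    new_line = f"{key}={value}"
    if pattern.search(ini_text):
        return pattern.sub(lambda _match: new_line, ini_text, count=1)
    # Key not present: append it, adding a newline separator only if needed.
    sep = "" if ini_text.endswith("\n") or not ini_text else "\n"
    return f"{ini_text}{sep}{new_line}\n"


# Hypothetical excerpt of mfaktc.ini (assumed Key=Value format).
example = "# sieve settings\nSievePrimes=25000\n"
print(set_ini_value(example, "SievePrimes", "5000"))
```

Starting at 5000 only skips the slow ramp-down after a restart; it does not remove the CPU bottleneck itself.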

Btw. the raw GPU speed is not the perfect measurement for performance; take a look at the per-class runtime.


Oliver
Old 2011-05-02, 18:19   #819
Karl M Johnson
 
Mar 2010

19B₁₆ Posts
Default

Quote:
Originally Posted by Xyzzy View Post
Is running them SLI an option?
From what I've heard, having two GPUs connected in SLI will not hurt compute applications.
Old 2011-05-02, 19:37   #820
Xyzzy
 
"Mike"
Aug 2002

3×2,741 Posts
Default

Quote:
Yep, multiple GPUs have been supported for a long time. But you can't use multiple GPUs with one instance. Each GPU needs at least one instance of mfaktc.
Do you have to assign each instance to a particular GPU or do the cards act like one big GPU bucket?

Quote:
Are they worth running LL and/or P-1 on CPU?
They are doing P-1 right now. Two of the boxes have 16GiB of RAM, one has 8GiB and one has 4GiB. They use almost the same amount of electricity as our two GPU boxes, so if we get rid of them we can maybe justify two more. They are OEM HP boxes so putting a GPU in them is pushing it thermally, physically and electrically.
Old 2011-05-02, 19:37   #821
Xyzzy
 
"Mike"
Aug 2002

3×2,741 Posts
Default

Quote:
This is my setup:
Thanks! The Newegg links are useful too!

We are trying to figure out if there is a motherboard (maybe an ASUS) that will take two of the 3-slot ASUS 570 cards like we have now. We are happy with them so we want to stick with them.
Old 2011-05-02, 20:41   #822
steinrar
 
Sep 2008

2 Posts
Default

Xyzzy; I own the type of GPU you linked to, but I don't like it. It heats up the other components inside unless you have enough fans pushing air into the computer to give positive pressure inside. I have a setup similar to Ungelovende's, with 2x570 cards that push most of the air out of the PC, which is much better I think. My 5 cents...
Old 2011-05-02, 21:01   #823
Christenson
 
Dec 2010
Monticello

5·359 Posts
Default

Quote:
Originally Posted by TheJudger View Post
Hi all!

My priorities? Currently mfaktc has a low priority...
0.17 coding should be finished but I want to edit the README a little bit.

Oliver
I have just asked if I should be Oliver's assistant and add the automatic assignments part to the code, copying as much as possible from Prime95/mprime.

I hope to be the second body, and learn enough to contribute to the core code and/or algorithms at some point.

Eric Christenson
Old 2011-05-02, 21:31   #824
Xyzzy
 
"Mike"
Aug 2002

3·2,741 Posts
Default

Quote:
Xyzzy; I own the type of GPU you linked to, but I don't like it. It heats up the other components inside unless you have enough fans pushing air into the computer to give positive pressure inside.
We have a big fan in the front of the case and a big fan on the ceiling of the case, so the air crosses the card, moves up to the CPU and out the top. The PSU heat is pushed out the bottom of the back. Our GPU temps under load rarely exceed 65°C, but we are not overclocking. (Even the "stock" overclock is mild.)

We would prefer a slimmer GPU but running one 24×7 under full load has to cause higher temps and lower life, right? We doubt these GPU cards were designed for a 100% duty cycle.
Old 2011-05-03, 00:00   #825
Christenson
 
Dec 2010
Monticello

703₁₆ Posts
Default

30 Hour validation test result:
no factor for M53953421 from 2^50 to 2^66 [mfaktc 0.16p1 75bit_mul32]
no factor for M53953421 from 2^66 to 2^67 [mfaktc 0.16p1 barrett79_mul32]
no factor for M53953421 from 2^67 to 2^68 [mfaktc 0.16p1 barrett79_mul32]
no factor for M53953421 from 2^68 to 2^69 [mfaktc 0.16p1 barrett79_mul32]
no factor for M53953421 from 2^69 to 2^70 [mfaktc 0.16p1 barrett79_mul32]
no factor for M53953421 from 2^70 to 2^71 [mfaktc 0.16p1 barrett79_mul32]
no factor for M53953421 from 2^71 to 2^72 [mfaktc 0.16p1 barrett79_mul32]
no factor for M53953421 from 2^72 to 2^73 [mfaktc 0.16p1 barrett79_mul32]
M53953421 has a factor: 16867347823849190640239
found 1 factor(s) for M53953421 from 2^73 to 2^74 [mfaktc 0.16p1 barrett79_mul32]
(This factor was known from my P-1 effort. Is it worth telling PrimeNet I duplicated the work?)
no factor for M3321934241 from 2^76 to 2^77 [mfaktc 0.16p1 barrett79_mul32] (took 3 hours or so)
(Operation Billion Digits...I'll take that up to 2^82 or so in the next two weeks)
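
The reported factor is easy to double-check independently. A minimal Python sketch, using only the exponent and factor from the log above (the function name `divides_mersenne` is made up for illustration): f divides 2^p - 1 exactly when 2^p mod f equals 1, and any factor of M_p has the form 2kp + 1.

```python
def divides_mersenne(p: int, f: int) -> bool:
    """True if f divides the Mersenne number 2**p - 1."""
    # Modular exponentiation keeps the numbers small; 2**p is never built in full.
    return pow(2, p, f) == 1


p = 53953421
f = 16867347823849190640239  # factor reported by mfaktc in the log above

print(divides_mersenne(p, f))   # does f divide M53953421?
print((f - 1) % (2 * p) == 0)   # is f of the form 2kp + 1?
print(f.bit_length())           # 74, i.e. f lies between 2^73 and 2^74
```

The bit length matches the 2^73 to 2^74 range in which mfaktc reported the factor.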

Noting that there are stock fan kits for GPUs -- in my personal experience, the fans die first if the chips are kept cool. Hard drives die. But at only $100 for the GTX440, I can't complain.... and with the temperature monitoring and throttling, they are designed to support hours at a time of full-throttle gaming.
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.