mersenneforum.org  

Old 2015-09-17, 18:59   #23
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013

2·5·293 Posts

Quote:
Originally Posted by flashjh View Post
Did you ever figure out the answer to your question? I just installed a Titan Z in a PCIe 3.0 slot and I can no longer keep the card at 100%, even with several mfaktc instances. I'm wondering whether I need a faster MB/CPU combo or whether it's a limit of the PCIe 3.0 bus at this point.
Is your card running at 16x? Or are the PCIe lanes split with other cards?
Old 2015-09-17, 19:08   #24
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

23·419 Posts

Man, don't waste the Z on factoring. Give it an LL and a DC for the same exponent (to check that the residues match), one on each GPU, and let it go.
OTOH, what does the "PerfCap Reason" say? You may not be able to max out that card because of power, thermal, or voltage limits, or whatever other reason. Also, disable DP in the Nvidia Control Panel if you insist on doing TF with it (it almost doubles the speed).

Last fiddled with by LaurV on 2015-09-17 at 19:12
Old 2015-09-17, 19:21   #25
TheJudger
 
 
"Oliver"
Mar 2005
Germany

11×101 Posts

With GPU sieving enabled, even PCIe x1 Gen 1 should be more than enough, but I don't know for sure.

Oliver
Old 2015-09-17, 19:35   #26
flashjh
 
 
"Jerry"
Nov 2011
Vancouver, WA

2143₈ Posts

It's running at 16x and hovers at about 97% on both sides. No biggie, I'm just wondering why it won't go to 100% (99%). I have plenty of power to keep the card at full load. I thought the same about the GPU sieve, so I'll just let it go. Each side is already significantly faster than a 580.

I can run LL/DC on the card, but I had to remove my last TF 580 to put this card in, so if I move to LL, I won't be doing any TF anymore. Do we have the running TF capacity to lose the 450 GHzDays/Day?
Old 2015-09-17, 19:43   #27
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

3³×19² Posts

Quote:
Originally Posted by flashjh View Post
Do we have the running TF capacity to lose the 450 GHzDays/Day?
No. Please.

We should be better off next week, but not today -- we need a bit more of a TF'ed buffer for both LL (various categories) and LLP-1.

I /really/ need to get out more....
Old 2015-09-17, 20:58   #28
flashjh
 
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

Quote:
Originally Posted by chalsall View Post
No. Please.

We should be better off next week, but not today -- we need a bit more of a TF'ed buffer for both LL (various categories) and LLP-1.

I /really/ need to get out more....
Yes, agreed! OK, no problem. I'll leave the 'Z' on TF. It's doing ~1150 GHzDays/Day as of right now. I hope to get a little more out of it, but I need to finish testing.
Old 2015-09-18, 01:47   #29
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

23·419 Posts

Quote:
Originally Posted by flashjh View Post
Yes, agreed! OK, no problem. I'll leave the 'Z' on TF. It's doing ~1150 GHzDays/Day as of right now. I hope to get a little more out of it, but I need to finish testing.
Now ye talking! I was going to say that 450 is a bit on the low side for that card...
Old 2015-09-18, 11:32   #30
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

5,881 Posts

Quote:
Originally Posted by flashjh View Post
Did you ever figure out the answer to your question? I just installed a Titan Z in a PCIe 3.0 slot and I can no longer keep the card at 100%, even with several mfaktc instances. I'm wondering whether I need a faster MB/CPU combo or whether it's a limit of the PCIe 3.0 bus at this point.
No one got back to me on this. I hope to get a new, much faster PC in around a year's time. I will do before-and-after tests for the GPU.
Old 2015-09-18, 12:02   #31
airsquirrels
 
 
"David"
Jul 2015
Ohio

11·47 Posts

I know mfakto and mfaktc deal with different architectures even though they are based on similar code. That said, PCIe width seems to make a huge difference on fast AMD cards, even with GPU sieving in mfakto. I've been looking into exactly why that is, but I haven't had much time lately. Is there any reason the number of classes per exponent is set to what it is? From the mfakto code, it looks like we may end up doing a lot of data transfer to/from the card, even between GPU sieving and the TF step. Results checking also reads back a bit or so for each k checked, which adds up. In my case I'm losing about 30% of my potential capacity due to PCIe lane saturation in my hosts.

How similar are the two programs in how they handle scheduling work on the cards?
Old 2015-09-18, 17:32   #32
TheJudger
 
 
"Oliver"
Mar 2005
Germany

11×101 Posts

Hi,

not sure how accurately nvidia-smi measures bandwidth, but:
Code:
# nvidia-smi -l 1 -a | grep Throughput
        Tx Throughput               : 2000 KB/s
        Rx Throughput               : 2000 KB/s
        Tx Throughput               : 24000 KB/s
        Rx Throughput               : 1000 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 189000 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
        Tx Throughput               : 0 KB/s
[...]
I started this at the same time as mfaktc on a (factory OCed) GTX 970. Rx/Tx values are shown every second; the first 3 pairs (3 seconds) are from the initial selftest, which uses both CPU-sieve and GPU-sieve kernels. After that it is doing regular work on M73.xxx.xxx with GPU sieving.

Oliver
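
For anyone who wants to compare readings like these against what a PCIe link can carry, here is a minimal Python sketch. It only parses the "Tx/Rx Throughput" lines in the format shown above and divides the peak by the theoretical per-direction link bandwidth. The names (peak_throughput_kb, link_fraction, SAMPLE) are made up for this example, the per-lane figures are the usual theoretical values after encoding overhead, and treating 1 KB as 1000 bytes is an assumption about how nvidia-smi reports.

```python
import re

# Theoretical per-direction bandwidth per lane in MB/s, after encoding overhead
# (Gen 1/2 use 8b/10b, Gen 3 uses 128b/130b). Real links deliver somewhat less.
PCIE_MB_PER_LANE = {1: 250.0, 2: 500.0, 3: 984.6}

# First few readings copied from the nvidia-smi output in the post above.
SAMPLE = """\
        Tx Throughput               : 2000 KB/s
        Rx Throughput               : 2000 KB/s
        Tx Throughput               : 24000 KB/s
        Rx Throughput               : 1000 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 189000 KB/s
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
"""

LINE_RE = re.compile(r"^\s*(Tx|Rx) Throughput\s*:\s*(\d+) KB/s\s*$")


def peak_throughput_kb(text):
    """Return the largest Tx and Rx readings (in KB/s) found in the text."""
    peaks = {"Tx": 0, "Rx": 0}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            direction, value = match.group(1), int(match.group(2))
            peaks[direction] = max(peaks[direction], value)
    return peaks


def link_fraction(kb_per_s, gen, lanes):
    """Fraction of the theoretical link bandwidth used (assumes 1 KB = 1000 bytes)."""
    capacity_kb = PCIE_MB_PER_LANE[gen] * lanes * 1000
    return kb_per_s / capacity_kb


peaks = peak_throughput_kb(SAMPLE)
for direction, kb in peaks.items():
    print(f"{direction} peak: {kb} KB/s, "
          f"{link_fraction(kb, 1, 1):.1%} of Gen 1 x1, "
          f"{link_fraction(kb, 3, 16):.1%} of Gen 3 x16")
```

On the sample above, the 189000 KB/s Rx peak from the selftest is roughly three quarters of a Gen 1 x1 link and about 1% of Gen 3 x16, while the steady-state readings with GPU sieving are 0 KB/s.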
Old 2015-09-18, 17:38   #33
TheJudger
 
 
"Oliver"
Mar 2005
Germany

10001010111₂ Posts

Quote:
Originally Posted by airsquirrels View Post
Is there any reason the number of classes per exponent is set to what it is?
yes

There is nothing special about 420/4620; it would work with any other natural number >=1, too, but some numbers allow more efficient sieving than others.

Oliver
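
One way to see why some class counts sieve better than others is the sketch below. It is my reading, not something stated in the post, and not mfaktc's actual code. It relies on two standard facts: any factor of 2^p - 1 has the form 2kp + 1, and it must be 1 or 7 mod 8. Since 420 = 4·3·5·7 and 4620 = 4·3·5·7·11, splitting k into classes modulo these numbers lets whole classes be thrown out up front when every candidate in them is divisible by a small prime or has the wrong residue mod 8. The function name and the example exponent (57885161, a known Mersenne prime exponent) are only for illustration.

```python
def surviving_classes(p, num_classes):
    """Return the classes c (k = c mod num_classes) that can still hold factors of 2^p - 1.

    Assumes num_classes is a multiple of 4, so that the residue of 2*k*p + 1
    modulo 8 is the same for every k in a class.
    """
    # Small primes dividing num_classes: divisibility of 2*k*p + 1 by any of
    # them depends only on the class of k.
    small_primes = [q for q in (3, 5, 7, 11) if num_classes % q == 0]
    keep = []
    for c in range(num_classes):
        candidate = 2 * c * p + 1
        if candidate % 8 not in (1, 7):
            continue  # factors of a Mersenne number are 1 or 7 mod 8
        if any(candidate % q == 0 for q in small_primes):
            continue  # every candidate in this class is divisible by q
        keep.append(c)
    return keep


p = 57885161  # illustrative exponent
for n in (420, 4620):
    kept = len(surviving_classes(p, n))
    print(f"{n} classes: {kept} survive ({kept / n:.1%})")
```

For this exponent, 96 of 420 classes and 960 of 4620 classes survive, so only about 23% or 21% of all k values ever need to be sieved or tested.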

All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.