mersenneforum.org  

Great Internet Mersenne Prime Search > PrimeNet > GPU to 72

Old 2019-02-02, 18:42   #4269
chalsall

Quote:
Originally Posted by chalsall
Unless and until there is a substantial percentage of new kit in the "fleet", it doesn't make much sense to DCTF to 75. Better to LLTF to 77.
Having slept on this (I actually sometimes dream in Perl and C; truly)...

I consider myself simply a facilitator, trying to help the GIMPS project by coordinating the Trial Factoring efforts of GPU owners, doing whatever they want to do while avoiding "toe stepping".

If people want to DCTF up to 75 "bits", I can easily make this work available. It only "makes sense" for owners of the newest kit, of which we currently only have two in the "fleet". But it has been observed that many don't really care all that much what is "optimal", and are willing to work a bit (joke not intended) deeper. Further, additional kit is likely to be added over time, particularly after nVidia releases less expensive cards with compute capability 7.5.

We are currently comfortably far enough ahead of the Primenet LL'ing and P-1'ing wavefronts to do this. To be clear, this (probably) won't help find the next MP, but it will (slightly) compress the distance between the DC'ing and LL'ing waves.

My question to you all: do people want to do this work?

If so, I will bring in the candidates to be worked. Further, I will add language to the DCTF'ing assignment page linking over to James' graphs, so people are able to make informed decisions based on their own individual situation.

Thoughts?
Old 2019-02-02, 19:28   #4270
Mark Rose

Quote:
Originally Posted by chalsall
Further, I will add language to the DCTF'ing assignment page linking over to James' graphs, so people are able to make informed decisions based on their own individual situation.
Excellent.

Quote:
Thoughts?
I wish there were a way to communicate the hardware I have to GPU72. Maybe the compute version? That would allow GPU72 to be more intelligent about "what makes sense" and "let gpu72 decide" decisions. I'm sure James would be willing to share the raw data behind the graphs.

GPU72 could then release work based on the availability of hardware for higher TF levels.

Or I should probably just switch my 580s to cudaLucas.
Old 2019-02-02, 19:45   #4271
chalsall

Quote:
Originally Posted by Mark Rose
I wish there were a way to communicate the hardware I have to GPU72. Maybe the compute version? That would allow GPU72 to be more intelligent about "what makes sense" and "let gpu72 decide" decisions. I'm sure James would be willing to share the raw data behind the graphs.
Please trust me on this, I have thought long and hard about that.

The issue is that most people use automatic fetching software nowadays, and it would involve adding an additional field to the HTTP(S) exchange.

Perhaps I should simply add the card's compute capability to the manual assignment form, and hope the authors of the fetching software will update their code as they have time.

At the same time, I would like to add a back-stream field for notices that the automatic fetching software would show to the user. Specifically: "You have X candidates assigned which are soon to be overdue."

Generally, people who use MISFIT et al. check their assignments via the GPU72 web pages every week or so, so this isn't really a big problem. But if we're going to update the exchange, we might as well do everything needed in one go.
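For the record, the sort of thing I have in mind is trivial on the wire. A hypothetical sketch in Python; the field and header names here are entirely made up for illustration, not anything GPU72 actually implements:

```python
from urllib.parse import urlencode

# Hypothetical fetch request with a new compute-capability field,
# so the server could pick sensible bit depths per card:
params = {
    "user": "example",   # placeholder credentials
    "num": 10,           # number of candidates requested
    "compute": "7.5",    # hypothetical new field (CUDA compute capability)
}
print(urlencode(params))  # user=example&num=10&compute=7.5

# Hypothetical back-stream notice for the fetcher to show verbatim:
response_headers = {
    "X-GPU72-Notice": "You have 3 candidates assigned which are soon to be overdue.",
}
notice = response_headers.get("X-GPU72-Notice")
if notice:
    print(notice)
```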

Last fiddled with by chalsall on 2019-02-02 at 19:49 Reason: s/bit problem/big problem/;
Old 2019-02-02, 19:46   #4272
James Heinrich

Quote:
Originally Posted by Mark Rose
I'm sure James would be willing to share the raw data behind the graphs.
Sure. It's all based on GPU GFLOPS divided by a magic constant (chosen per compute level) to get GHz-d/d. A few "fake" compute-level entries are there to account for the different behavior of some "pro" vs. consumer cards, as documented in the comments.

mfaktc:
Code:
$TF_GFLOPS_per_GHzDayPerDay = array(
	'N' => array(
		10 =>  0.00,
		11 => 14.00,
		12 => 14.00,
		13 => 14.00,
		20 =>  3.65,
		21 =>  5.35,
		30 => 10.50,
		33 => 11.40, // fake entry for 3.5 Tesla (e.g. K20, K40)
		34 => 11.40, // fake entry for Titan Black
		35 => 11.40,
		36 => 11.40, // fake entry for Titan
		37 => 11.40,
		50 =>  9.00,
		52 =>  9.00,
		60 =>  8.10, // ??? no benchmarks yet, even GFLOPS numbers are approximate
		61 =>  8.10,
		63 =>  8.10, // fake entry for Quadro Pxxx (really Compute 6.1)
		70 =>  3.58, // Titan V / Tesla V100 -- only one benchmark so far
		75 =>  3.45, // RTX 20x0    -- only three benchmarks so far
	),
	'A' => array(
		 1 => 11.3, // VLIW5
		 2 => 11.0, // VLIW4
		10 =>  9.3, // GCN 1.0
		11 =>  9.3, // GCN 1.1
		12 =>  9.3, // GCN 1.2
		13 => 10.9, // GCN 1.3
		15 => 10.0, // GCN 1.5
	),
);
cudalucas:
Code:
$GFLOPS_FFT_timing_multiplier = array(
	// magicnumber = (fftsize / gflops) / timing
	'N' => array(
		10 =>   0,
		11 =>   0,
		12 =>   0,
		13 => 230,
		20 => 385,
		21 => 280,
		30 => 165,
		33 => 280, // fake entry for 3.5 Tesla (e.g. K20, K40)
		34 => 245, // fake entry for Titan Black
		35 => 165,
		36 => 300, // fake entry for Titan
		37 => 235,
		50 => 125,
		52 => 120,
		60 => 325, // ??? only one benchmark for Tesla P100 and it's a significant deviation compared to GTX*
		61 => 140, // GTX 10__
		63 => 115, // fake entry for Quadro Pxxx (really Compute 6.1)
		70 => 265, // Tesla V100 / Titan V
		75 => 130, // RTX 2080 etc
	),
	'A' => array(
		 1 => 0,   // VLIW5
		 2 => 0,   // VLIW4
		10 => 160, // GCN 1.0
		11 => 140, // GCN 1.1
		12 => 105, // GCN 1.2
		13 => 125, // GCN 1.3
		15 => 125, // GCN 1.5
	),
);
Not sure whether it's of any use, but the data is herewith supplied.
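For what it's worth, a minimal Python sketch of how these tables might be applied. The function names and the example GFLOPS figure are mine, not from GPU72's code; the divisors and magic number are copied from the arrays above:

```python
# Divisors from the mfaktc table above, keyed by (vendor, compute level):
# 'N' = NVIDIA, 'A' = AMD.
TF_GFLOPS_PER_GHZDAY_PER_DAY = {
    ("N", 61): 8.10,   # GTX 10xx (Compute 6.1)
    ("N", 75): 3.45,   # RTX 20x0 (Compute 7.5)
    ("A", 13): 10.9,   # GCN 1.3
}

def estimate_tf_rate(gflops, vendor, compute):
    """Estimated mfaktc throughput in GHz-days/day from peak GFLOPS."""
    return gflops / TF_GFLOPS_PER_GHZDAY_PER_DAY[(vendor, compute)]

def estimate_ll_timing(fft_size, gflops, magic):
    """Per-iteration CUDALucas timing implied by
    magicnumber = (fftsize / gflops) / timing, i.e.
    timing = (fftsize / gflops) / magicnumber."""
    return (fft_size / gflops) / magic

# Illustrative only: a card rated around 8873 GFLOPS at Compute 6.1.
print(round(estimate_tf_rate(8873, "N", 61), 1))         # ~1095.4 GHz-d/d
print(round(estimate_ll_timing(5242880, 8873, 140), 2))  # 5120K FFT, magic 140
```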
Old 2019-02-02, 21:05   #4273
Mark Rose

Quote:
Originally Posted by chalsall
Please trust me on this, I have thought long and hard about that.

The issue is that most people use automatic fetching software nowadays, and it would involve adding an additional field to the HTTP(S) exchange.

Perhaps I should simply add the card's compute capability to the manual assignment form, and hope the authors of the fetching software will update their code as they have time.
I can update the mfloop.py script.

Quote:
At the same time, I would like to add a back-stream field for notices that the automatic fetching software would show to the user. Specifically: "You have X candidates assigned which are soon to be overdue."

Generally, people who use MISFIT et al. check their assignments via the GPU72 web pages every week or so, so this isn't really a big problem. But if we're going to update the exchange, we might as well do everything needed in one go.
A good idea.



Though now that I see James has "fake" compute levels, we need to decide how to handle those. I'm not opposed to using the same values and documenting them in mfloop. I wonder how the author of MISFIT feels?
Old 2019-02-02, 21:44   #4274
nomead

Quote:
Originally Posted by James Heinrich
Sure. It's all based on GPU GFLOPS divided by a magic constant (chosen per compute level) to get GHz-d/d. A few "fake" compute-level entries are there to account for the different behavior of some "pro" vs. consumer cards, as documented in the comments.

mfaktc:
Code:
75 =>  3.45, // RTX 20x0    -- only three benchmarks so far
What would a proper benchmark be, then?

For example, the RTX 2080 entry has 2548.6 GHz-days/day. This is quite close to what I get on my card at the base clock speed of 1515 MHz (M92257213, 72 to 73 bits, 5 min 49.99 sec = 2559 GHz-d/d, GPUSieveSize=128 GPUSieveProcessSize=16), but then it will only consume 114 watts at that frequency while running mfaktc. Letting the card run at its TDP limit with the same settings in mfaktc results in 1830 MHz, which is 3044 GHz-d/d (4 min 54.26 sec). Better absolute performance, but much worse performance per watt, of course. And the frequency achieved at 215 W will vary quite a bit, depending on how well the card is cooled, and ultimately on luck in the silicon lottery.
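To make the arithmetic explicit: GHz-days/day is just the assignment's GHz-days credit divided by the elapsed time in days. A sketch in Python; the ~10.37 GHz-days credit for taking M92257213 from 72 to 73 bits is inferred from the figures above, not computed independently:

```python
SECONDS_PER_DAY = 86400
CREDIT_GHZ_DAYS = 10.37  # inferred from the timings quoted above

def ghd_per_day(credit_ghz_days, elapsed_seconds):
    """Throughput in GHz-days/day for one completed TF assignment."""
    return credit_ghz_days * SECONDS_PER_DAY / elapsed_seconds

base_clock = ghd_per_day(CREDIT_GHZ_DAYS, 5 * 60 + 49.99)  # ~2560 at 1515 MHz / 114 W
tdp_limit  = ghd_per_day(CREDIT_GHZ_DAYS, 4 * 60 + 54.26)  # ~3045 at ~1830 MHz / 215 W

# Perf-per-watt drops sharply at the TDP limit:
print(round(base_clock / 114, 1))  # ~22.5 GHz-d/d per watt
print(round(tdp_limit / 215, 1))   # ~14.2 GHz-d/d per watt
```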

I could submit several timings through the submission form, for different clock speeds, and even for GPUSieveSize values larger than 128 (this seems to improve performance significantly), but would this be of any value to you? I could soon submit timings from an RTX 2060 as well.

Also, the submission form seems to want a GPU-Z screenshot. This is a bit difficult to do on Linux...
Old 2019-02-02, 22:41   #4275
chalsall

Quote:
Originally Posted by nomead
Also, the submission form seems to want a GPU-Z screenshot. This is a bit difficult to do on Linux...
Just to share, James and I have had a friendly argument for years about what the "P" stands for in a LAMP stack.

Don't even get us started with regards to a WIMP stack....
Old 2019-02-02, 22:45   #4276
James Heinrich

Submitted benchmarks vary widely, so the magic numbers I've come up with are estimated averages. Note that all expected performance numbers assume nominal stock clock speeds, which in the last few years bears little resemblance to reality: the clock speed is nearly always boosted to some degree. So you'll need to scale the posted numbers by the ratio of your actual running clock speed to the listed stock clock.

GPU-Z screenshots are nice if available to confirm what speed the GPU was actually running at during the test, but are not required (as long as you provide actual clockspeeds in the comments).
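The scaling is simple proportionality; a quick sketch (the clock figures reuse nomead's RTX 2080 numbers from earlier in the thread):

```python
def scale_to_actual_clock(listed_ghd_per_day, listed_mhz, actual_mhz):
    """Scale a posted GHz-days/day estimate by the clock-speed ratio."""
    return listed_ghd_per_day * actual_mhz / listed_mhz

# RTX 2080 table entry (2548.6 GHz-d/d at the 1515 MHz base clock),
# boosted to 1830 MHz:
print(round(scale_to_actual_clock(2548.6, 1515, 1830), 1))  # ~3078.5
```

Note that nomead measured 3044 GHz-d/d at 1830 MHz, slightly below the linear estimate, so the clock ratio is a first-order approximation rather than an exact predictor.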
Old 2019-02-02, 22:59   #4277
Mark Rose

Quote:
Originally Posted by chalsall
Just to share, James and I have had a friendly argument for years about what the "P" stands for in a LAMP stack.

Don't even get us started with regards to a WIMP stack....
Pascal, right? Make Borland proud!
Old 2019-02-02, 23:06   #4278
chalsall

Quote:
Originally Posted by James Heinrich
GPU-Z screenshots are nice if available to confirm what speed the GPU was actually running at during the test, but are not required (as long as you provide actual clockspeeds in the comments).
Under Linux, the nvidia-smi command is always available, even for unprivileged accounts:
Code:
[chalsall@backuppc mfaktc]$ nvidia-smi 
Sat Feb  2 18:59:58 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.59                 Driver Version: 390.59                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0 Off |                  N/A |
| 48%   82C    P0    N/A /  65W |     67MiB /  2000MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1050    Off  | 00000000:03:00.0 Off |                  N/A |
| 63%   70C    P0    N/A /  75W |     67MiB /  2000MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2635      C   ./mfaktc.exe                                  57MiB |
|    1      2663      C   ./mfaktc.exe                                  57MiB |
+-----------------------------------------------------------------------------+
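nvidia-smi also has a machine-readable query mode that reports the actual running clocks, which is probably closer to what the form wants. A sketch of parsing it in Python; the sample output is hardcoded so the snippet runs without a GPU, and the clock values shown are illustrative:

```python
import csv
import io

# Real nvidia-smi interface:
#   nvidia-smi --query-gpu=index,name,clocks.sm --format=csv,noheader
# Example output (illustrative values), hardcoded for demonstration:
sample = "0, GeForce GTX 1050, 1721 MHz\n1, GeForce GTX 1050, 1733 MHz\n"

def parse_clocks(text):
    """Map GPU index -> (name, SM clock in MHz)."""
    clocks = {}
    for row in csv.reader(io.StringIO(text)):
        index, name, clock = (field.strip() for field in row)
        clocks[int(index)] = (name, int(clock.split()[0]))
    return clocks

print(parse_clocks(sample))
```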
Perhaps you're after more specific data?
Old 2019-02-02, 23:24   #4279
chalsall

Quote:
Originally Posted by Mark Rose
Pascal, right? Make Borland proud!
LOL! I once had a business partner who insisted on coding in Modula-2, because "it was the future".

The rule for the company was people were allowed to code in whatever language they wanted, so long as they produced. Most of us were writing code in C and 680x0 assembly.

The dude couldn't code his way out of a wet paper bag....