
mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU to 72 (https://www.mersenneforum.org/forumdisplay.php?f=95)
-   -   GPU to 72 status... (https://www.mersenneforum.org/showthread.php?t=16263)

chalsall 2019-02-02 18:42

[QUOTE=chalsall;507356]Unless and until there is a substantial percentage of new kit in the "fleet", it doesn't make much sense to DCTF to 75. Better to LLTF to 77.[/QUOTE]

Having slept on this (I actually sometimes dream in Perl and C; truly)...

I consider myself simply a facilitator, trying to help the GIMPS project by coordinating the Trial Factoring efforts of GPU owners, doing whatever they want to do while avoiding "toe stepping".

If people want to DCTF up to 75 "bits", I can easily make this work available. It only "makes sense" for owners of the newest kit, only two of which are currently in the "fleet". But it has been observed that many don't really care all that much what is "optimal", and are willing to work a bit (pun not intended) deeper. Further, additional kit is likely to be added over time, particularly after nVidia releases less expensive cards with compute capability 7.5.

We are currently far enough ahead of the Primenet LL'ing and P-1'ing wavefronts to do this. Importantly, this (probably) won't help find the next MP, but it will (slightly) compress the distance between the DC'ing and LL'ing waves.

My question to you all: do people want to do this work?

If so, I will bring in the candidates to be worked. Further, I will add language to the DCTF'ing assignment page linking over to James' graphs, so people are able to make informed decisions based on their own individual situation.

Thoughts?

Mark Rose 2019-02-02 19:28

[QUOTE=chalsall;507474]Further, I will add language to the DCTF'ing assignment page linking over to James' graphs, so people are able to make informed decisions based on their own individual situation.[/quote]

Excellent.

[quote]Thoughts?[/QUOTE]

I wish there were a way to communicate the hardware I have to GPU72. Maybe the compute version? That would allow GPU72 to be more intelligent about "what makes sense" and "let gpu72 decide" decisions. I'm sure James would be willing to share the raw data behind the graphs.

GPU72 could then release work based on the availability of hardware for higher TF levels.

Or I should probably just switch my 580s to CUDALucas.

chalsall 2019-02-02 19:45

[QUOTE=Mark Rose;507479]I wish there were a way to communicate the hardware I have to GPU72. Maybe the compute version? That would allow GPU72 to be more intelligent about "what makes sense" and "let gpu72 decide" decisions. I'm sure James would be willing to share the raw data behind the graphs.[/QUOTE]

Please trust me on this: I have thought long and hard about that.

The issue is that most people use automatic fetching software nowadays, and it would involve adding an additional field to the HTTP(S) exchange.

Perhaps I should simply add the card's compute capability to the manual assignment form, and hope the authors of the fetching software will update their code as they have time.

At the same time, I would like to be able to add a back-stream field to give notices to the automatic fetching software which are shown to the user. Specifically, "You have X candidates assigned which are soon to be overdue."

Generally people who use MISFIT et al. check their assignments via the GPU72 web pages every week or so, so this isn't really a big problem. But if we're going to update the exchange, we might as well do everything needed in one go.

James Heinrich 2019-02-02 19:46

[QUOTE=Mark Rose;507479]I'm sure James would be willing to share the raw data behind the graphs.[/QUOTE]Sure. It's all based on GPU GFLOPS divided by a magic constant based on compute level to get GHd/d. A few "fake" Compute level entries are there to account for different behavior of some "pro" vs consumer cards, as documented.

mfaktc:[code]$TF_GFLOPS_per_GHzDayPerDay = array(
'N' => array(
10 => 0.00,
11 => 14.00,
12 => 14.00,
13 => 14.00,
20 => 3.65,
21 => 5.35,
30 => 10.50,
33 => 11.40, // fake entry for 3.5 Tesla (e.g. K20, K40)
34 => 11.40, // fake entry for Titan Black
35 => 11.40,
36 => 11.40, // fake entry for Titan
37 => 11.40,
50 => 9.00,
52 => 9.00,
60 => 8.10, // ??? no benchmarks yet, even GFLOPS numbers are approximate
61 => 8.10,
63 => 8.10, // fake entry for Quadro Pxxx (really Compute 6.1)
70 => 3.58, // Titan V100 -- only one benchmark so far
75 => 3.45, // RTX 20x0 -- only three benchmarks so far
),
'A' => array(
1 => 11.3, // VLIW5
2 => 11.0, // VLIW4
10 => 9.3, // GCN 1.0
11 => 9.3, // GCN 1.1
12 => 9.3, // GCN 1.2
13 => 10.9, // GCN 1.3
15 => 10.0, // GCN 1.5
),
);[/code]
cudalucas:[code]$GFLOPS_FFT_timing_multiplier = array(
// magicnumber = (fftsize / gflops) / timing
'N' => array(
10 => 0,
11 => 0,
12 => 0,
13 => 230,
20 => 385,
21 => 280,
30 => 165,
33 => 280, // fake entry for 3.5 Tesla (e.g. K20, K40)
34 => 245, // fake entry for Titan Black
35 => 165,
36 => 300, // fake entry for Titan
37 => 235,
50 => 125,
52 => 120,
60 => 325, // ??? only one benchmark for Tesla P100 and it's a significant deviation compared to GTX*
61 => 140, // GTX 10__
63 => 115, // fake entry for Quadro Pxxx (really Compute 6.1)
70 => 265, // Tesla V100 / Titan V
75 => 130, // RTX 2080 etc
),
'A' => array(
1 => 0, // VLIW5
2 => 0, // VLIW4
10 => 160, // GCN 1.0
11 => 140, // GCN 1.1
12 => 105, // GCN 1.2
13 => 125, // GCN 1.3
15 => 125, // GCN 1.5
),
);[/code]

Not sure if it's any use or not, but the data is herewith supplied.
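To make the mechanics concrete, here's a minimal sketch (in Python, rather than the PHP above) of how those mfaktc magic numbers turn a card's GFLOPS rating into an expected GHz-days/day figure; the ~8873 GFLOPS value is an approximate boost-clock FP32 number for a GTX 1080, for illustration only:

```python
# Sketch of the mfaktc conversion described above:
# GHz-days/day = GPU GFLOPS / magic constant for its compute level.

TF_GFLOPS_PER_GHD = {  # subset of the 'N' (nVidia) table above
    61: 8.10,  # Compute 6.1 (GTX 10xx)
    75: 3.45,  # Compute 7.5 (RTX 20x0)
}

def expected_ghd_per_day(gflops: float, compute_level: int) -> float:
    """Estimate TF throughput in GHz-days/day at stock clocks."""
    return gflops / TF_GFLOPS_PER_GHD[compute_level]

# Illustrative: ~8873 GFLOPS (approximate GTX 1080 boost-clock FP32)
print(round(expected_ghd_per_day(8873, 61)))  # ~1095 GHz-d/d
```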

Mark Rose 2019-02-02 21:05

[QUOTE=chalsall;507480]Please trust me on this: I have thought long and hard about that.

The issue is that most people use automatic fetching software nowadays, and it would involve adding an additional field to the HTTP(S) exchange.

Perhaps I should simply add the card's compute capability to the manual assignment form, and hope the authors of the fetching software will update their code as they have time.[/quote]

I can update the mfloop.py script.

[quote]At the same time, I would like to be able to add a back-stream field to give notices to the automatic fetching software which are shown to the user. Specifically, "You have X candidates assigned which are soon to be overdue."

Generally people who use MISFIT et al. check their assignments via the GPU72 web pages every week or so, so this isn't really a big problem. But if we're going to update the exchange, we might as well do everything needed in one go.[/QUOTE]

A good idea.



Though now that I see James has "fake" compute levels, we need to decide how to handle those. I'm not opposed to using the same values and documenting them in mfloop. I wonder how the author of MISFIT feels about it?

nomead 2019-02-02 21:44

[QUOTE=James Heinrich;507481]Sure. It's all based on GPU GFLOPS divided by a magic constant based on compute level to get GHd/d. A few "fake" Compute level entries are there to account for different behavior of some "pro" vs consumer cards, as documented.

mfaktc:[code]75 => 3.45, // RTX 20x0 -- only three benchmarks so far
[/code]
[/QUOTE]

What would a proper benchmark be, then?

For example, the RTX 2080 entry has 2548.6 GHz-days/day. This is quite close to what I get on my card at the base clock speed of 1515 MHz (M92257213, 72 to 73 bits, 5 min 49.99 sec = 2559 GHz-d/d, GPUSieveSize=128 GPUSieveProcessSize=16), but then it will only consume 114 watts at that frequency while running mfaktc. Letting the card run at its TDP limit with the same settings in mfaktc results in 1830 MHz, which is 3044 GHz-d/d (4 min 54.26 sec). Better absolute performance but much worse perf per watt, of course. And the frequency achieved at 215 W will vary quite a bit, depending on how well the card is cooled, and ultimately on luck in the silicon lottery.

I could submit several timings through the submission form, for different clock speeds, and even for GPUSieveSize values larger than 128 (this seems to improve performance significantly), but would this be of any value to you? I'll soon be able to run the same benchmarks on an RTX 2060.
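For what it's worth, the GHz-d/d figures above are just the assignment credit scaled by wall time. A quick sanity check (the ~10.366 GHz-days credit for this assignment is back-derived from the quoted numbers, not looked up from PrimeNet):

```python
# Back-of-envelope check of the timings quoted above:
# rate (GHz-d/d) = assignment credit (GHz-days) * 86400 / elapsed seconds.

def ghd_per_day(credit_ghd: float, elapsed_s: float) -> float:
    return credit_ghd * 86400 / elapsed_s

CREDIT = 10.366  # inferred from the quoted rates, not an official figure

print(round(ghd_per_day(CREDIT, 5 * 60 + 49.99)))  # base-clock run, ~2559
print(round(ghd_per_day(CREDIT, 4 * 60 + 54.26)))  # TDP-limited run, ~3044
```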

Also, the submission form seems to want a GPU-Z screenshot. This is a bit difficult to do on Linux :smile:

chalsall 2019-02-02 22:41

[QUOTE=nomead;507494]Also, the submission form seems to want a GPU-Z screenshot. This is a bit difficult to do on Linux :smile:[/QUOTE]

Just to share, James and I have had a friendly argument for years about what the "P" stands for in a LAMP stack.

Don't even get us started with regards to a WIMP stack.... :smile:

James Heinrich 2019-02-02 22:45

Submitted benchmarks vary widely, so the magic numbers I've come up with are an estimated appropriate average. Note that all expected performance numbers are at nominal stock clock speeds, which in the last few years bear little resemblance to reality: the clock speed is always boosted to some degree or other. You'll need to scale the posted numbers by the ratio of your running clock speed to the listed stock clock.
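In other words (a minimal sketch; the RTX 2080 figures are nomead's from earlier in the thread):

```python
# The clock-ratio scaling described above: take the listed stock-clock
# estimate and multiply by (actual clock / stock clock).

def scale_for_clock(listed_ghd: float, actual_mhz: float, stock_mhz: float) -> float:
    return listed_ghd * actual_mhz / stock_mhz

# RTX 2080: listed 2548.6 GHz-d/d at the 1515 MHz base clock,
# boosting to 1830 MHz at the TDP limit.
print(round(scale_for_clock(2548.6, 1830, 1515)))  # ~3079, vs ~3044 measured
```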

GPU-Z screenshots are nice if available to confirm what speed the GPU was actually running at during the test, but are not required (as long as you provide actual clockspeeds in the comments).

Mark Rose 2019-02-02 22:59

[QUOTE=chalsall;507500]Just to share, James and I have had a friendly argument for years about what the "P" stands for in a LAMP stack.

Don't even get us started with regards to a WIMP stack.... :smile:[/QUOTE]

Pascal, right? Make Borland proud!

chalsall 2019-02-02 23:06

[QUOTE=James Heinrich;507501]GPU-Z screenshots are nice if available to confirm what speed the GPU was actually running at during the test, but are not required (as long as you provide actual clockspeeds in the comments).[/QUOTE]

Under Linux, the nvidia-smi command is always available, even for unprivileged accounts:[CODE][chalsall@backuppc mfaktc]$ nvidia-smi
Sat Feb 2 18:59:58 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.59 Driver Version: 390.59 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1050 Off | 00000000:01:00.0 Off | N/A |
| 48% 82C P0 N/A / 65W | 67MiB / 2000MiB | 99% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 1050 Off | 00000000:03:00.0 Off | N/A |
| 63% 70C P0 N/A / 75W | 67MiB / 2000MiB | 99% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 2635 C ./mfaktc.exe 57MiB |
| 1 2663 C ./mfaktc.exe 57MiB |
+-----------------------------------------------------------------------------+[/CODE]

Perhaps you're after more specific data?

chalsall 2019-02-02 23:24

[QUOTE=Mark Rose;507502]Pascal, right? Make Borland proud![/QUOTE]

LOL! I once had a business partner who insisted on coding in Modula-2, because "it was the future".

The rule for the company was people were allowed to code in whatever language they wanted, so long as they produced. Most of us were writing code in C and 680x0 assembly.

The dude couldn't code his way out of a wet paper bag....


