mersenneforum.org  

Old 2016-09-02, 16:53   #1
jpalo
 
Aug 2016

3 Posts
CUDALucas correct config

Hi,

I'm using CUDALucas2.05.1-CUDA8.0-Windows-x64.exe on Win10 x64.
With DLLs cudart64_80.dll and cufft64_80.dll

My GPU is a GTX 1070

And my ini file:

ErrorIterations=100
ReportIterations=10000
CheckpointIterations=100000
Polite=0
PoliteValue=50
BigCarry=1
ErrorReset=85
Interactive=1
Threads=256 128

Is this config correct?

For FFT 4320K I get 5.5 ms/It. Is this time normal?

Code:
|   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
|  Sep 02  13:54:32  |  M77261453    190000  0x4b7e12dac484f666  |  4320K  0.08704   5.5467   55.46s  |   4:23:09:15   0.24%  |
|  Sep 02  13:55:28  |  M77261453    200000  0x07986a6005462d36  |  4320K  0.08897   5.5464   55.46s  |   4:23:07:05   0.25%  |
|  Sep 02  13:56:23  |  M77261453    210000  0xad50578e31124946  |  4320K  0.08984   5.5821   55.82s  |   4:23:07:14   0.27%  |
Thx
Old 2016-09-08, 17:22   #2
jpalo
 
Aug 2016

11₂ Posts

I will assume that it is a correct and optimal configuration.
Old 2016-09-08, 18:14   #3
GP2
 
Sep 2003

2·5·7·37 Posts

Quote:
Originally Posted by jpalo
Code:
|   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
|  Sep 02  13:54:32  |  M77261453    190000  0x4b7e12dac484f666  |  4320K  0.08704   5.5467   55.46s  |   4:23:09:15   0.24%  |
You show the residue at iteration 190,000. Did you record the residue value at iteration=10,000?

I ran this exponent using mprime with InterimResidues=10000 in prime.txt and got a residue of 0x62D64892C0BFD302 at iteration 10,000. I wasn't willing to let it run beyond that.

If your residue value matches, then you probably got the basic configuration right.
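
A minimal sketch of that comparison, in case it helps. CUDALucas prints residues in lowercase and mprime in uppercase, so a plain string comparison can give a false mismatch. The function name `same_residue` is made up for this example.

```python
def same_residue(a: str, b: str) -> bool:
    """Compare two hex residues, ignoring letter case and surrounding spaces."""
    # int(..., 16) accepts an optional "0x" prefix and either letter case.
    return int(a.strip(), 16) == int(b.strip(), 16)


# Reference residue at iteration 10,000 from the mprime run described above.
MPRIME_RESIDUE_AT_10000 = "0x62D64892C0BFD302"

# The same value as it would appear in lowercase still compares equal.
print(same_residue(MPRIME_RESIDUE_AT_10000, "0x62d64892c0bfd302"))

# The iteration-190,000 residue from the log is a different iteration,
# so it is not expected to match.
print(same_residue(MPRIME_RESIDUE_AT_10000, "0x4b7e12dac484f666"))
```

Only residues taken at the same iteration of the same exponent are comparable.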
Old 2017-08-03, 19:30   #4
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2⁴·3·163 Posts

Quote:
Originally Posted by jpalo
Hi,

I'm using CUDALucas2.05.1-CUDA8.0-Windows-x64.exe on Win10 x64.
With DLLs cudart64_80.dll and cufft64_80.dll

My GPU is a GTX 1070

And my ini file:

ErrorIterations=100
ReportIterations=10000
CheckpointIterations=100000
Polite=0
PoliteValue=50
BigCarry=1
ErrorReset=85
Interactive=1
Threads=256 128

Is this config correct?

For FFT 4320K I get 5.5 ms/It. Is this time normal?

Code:
|   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
|  Sep 02  13:54:32  |  M77261453    190000  0x4b7e12dac484f666  |  4320K  0.08704   5.5467   55.46s  |   4:23:09:15   0.24%  |
|  Sep 02  13:55:28  |  M77261453    200000  0x07986a6005462d36  |  4320K  0.08897   5.5464   55.46s  |   4:23:07:05   0.25%  |
|  Sep 02  13:56:23  |  M77261453    210000  0xad50578e31124946  |  4320K  0.08984   5.5821   55.82s  |   4:23:07:14   0.27%  |
Thx
ReportIterations=10000 is a little on the busy side. It would be a little faster without all those printfs. On a card that fast, raise ReportIterations to 100000 and use an even longer interval for saving checkpoint files. Prime95 saves at half-hour intervals, so the average loss on an interruption is a quarter hour's work.
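
As a rough sketch of how such intervals translate into ini values, the snippet below converts a target wall-clock interval into an iteration count, using the 5.5467 ms/It from the quoted log. The function names and the 5-minute reporting target are assumptions made for illustration; the 30-minute checkpoint target mirrors the Prime95 interval mentioned above. Rounding down to one significant digit is only a convenience.

```python
def iterations_for_interval(target_seconds: float, ms_per_iter: float) -> int:
    """Number of whole iterations that fit in the target interval."""
    return int(target_seconds * 1000.0 / ms_per_iter)


def round_down_1sig(n: int) -> int:
    """Round a positive integer down to one significant digit (54086 -> 50000)."""
    if n <= 0:
        return 0
    magnitude = 10 ** (len(str(n)) - 1)
    return (n // magnitude) * magnitude


MS_PER_ITER = 5.5467          # from the quoted log (FFT 4320K on a GTX 1070)
REPORT_TARGET_S = 5 * 60      # assumption: one status line every ~5 minutes
CHECKPOINT_TARGET_S = 30 * 60  # Prime95-style half-hour checkpoint interval

report = round_down_1sig(iterations_for_interval(REPORT_TARGET_S, MS_PER_ITER))
checkpoint = round_down_1sig(iterations_for_interval(CHECKPOINT_TARGET_S, MS_PER_ITER))

print(f"ReportIterations={report}")
print(f"CheckpointIterations={checkpoint}")
```

Under these assumptions the result is ReportIterations=50000 and CheckpointIterations=300000. For comparison, ReportIterations=100000 at this speed gives a status line roughly every 9 minutes.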
Old 2017-08-03, 22:56   #5
GP2
 
Sep 2003

2·5·7·37 Posts

Quote:
Originally Posted by kriesel
ReportIterations=10000 is a little on the busy side. It would be a little faster without all those printf's.
From the timestamps, a printf happens every 55 or 56 seconds. Surely the overhead is entirely negligible, no?
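
The interval can be checked directly from the numbers in the log: ReportIterations times ms/It. The sketch below does that and then bounds the relative cost of one status line. The 1 ms cost per line is an assumed, deliberately pessimistic figure, not a measurement, and the function names are made up for this example.

```python
def report_interval_seconds(report_iterations: int, ms_per_iter: float) -> float:
    """Wall-clock time between two status lines."""
    return report_iterations * ms_per_iter / 1000.0


def overhead_fraction(cost_per_line_seconds: float, interval_seconds: float) -> float:
    """Share of run time spent printing, if computation fully stalls while printing."""
    return cost_per_line_seconds / interval_seconds


interval = report_interval_seconds(10000, 5.5467)  # values from the quoted log

ASSUMED_LINE_COST_S = 1e-3  # assumption: 1 ms per printed line (pessimistic)
fraction = overhead_fraction(ASSUMED_LINE_COST_S, interval)

print(f"{interval:.2f} s between status lines")
print(f"printing overhead at most about {fraction:.4%} of run time")
```

The computed interval agrees with the 55.46 s shown in the Time column of the log.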
Old 2017-08-04, 00:42   #6
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2×3×1,693 Posts

Quote:
Originally Posted by GP2
From the timestamps, a printf happens every 55 or 56 seconds. Surely the overhead is entirely negligible, no?
I use 10000 on the 460 card and 20000 on the 1060, which makes them report at about the same interval. You don't have to stick to powers of ten. For practical purposes, though, you will still end up with four or five zeros on the end.
Old 2017-08-04, 04:14   #7
storm5510
Random Account
 
Aug 2009
Not U. + S.A.

2²·5·11·13 Posts

Quote:
Originally Posted by kriesel
ReportIterations=10000 is a little on the busy side. It would be a little faster without all those printf's.
If I look at the screen while running CuLu and I don't see movement for more than a few seconds, then it seems like something is wrong. I keep the value low. Does it really add that much to the total run time?
Old 2017-08-04, 16:17   #8
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2⁴×3×163 Posts

Quote:
Originally Posted by storm5510
If I look at the screen while running CuLu and I don't see movement for more than a few seconds, then it seems like something is wrong. I keep the value low. Does it really add that much to the total run time?
The CPU hit for screen output is probably small. The fraction of the time you're looking at the screen is also probably small, so the rest of the time the frequent updates don't help. Over GPU-years a small overhead will slowly add up. To me, frequent updating for something that runs for days or weeks seems excessively verbose. Tastes differ. I also find the longer I run it the more comfortable I become with less interim screen output. Frequent updates are useful while figuring out if it is running correctly at the outset. Then I adjust it to less frequent.

The time to write frequent checkpoint files is probably a bigger drag on GPU performance, as each one is several megabytes for current wavefront exponents, grows with the exponent, and is somewhat larger for CUDAPm1 than for CUDALucas. (Mfaktc checkpoints are tiny; I just spotted one at 42 bytes.) Not having a development environment installed, I haven't put a profiler on it to see how much or how little either checkpointing or screen output amounts to. Whatever it is, it's more than zero, and I want all practical throughput going to useful results.
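
For a sense of scale, here is a back-of-the-envelope sketch. It assumes a checkpoint must hold at least the full p-bit residue, so p/8 bytes is a lower bound on its size; the actual CUDALucas file layout is not confirmed here, and the 100 MB/s disk throughput is an assumed figure. The function names are hypothetical.

```python
def min_checkpoint_bytes(exponent: int) -> int:
    """Lower bound on checkpoint size: the p-bit residue, rounded up to whole bytes."""
    return (exponent + 7) // 8


def write_fraction(size_bytes: int, disk_bytes_per_s: float,
                   checkpoint_iterations: int, ms_per_iter: float) -> float:
    """Share of run time spent writing, if computation stalls during the write."""
    write_seconds = size_bytes / disk_bytes_per_s
    interval_seconds = checkpoint_iterations * ms_per_iter / 1000.0
    return write_seconds / interval_seconds


size = min_checkpoint_bytes(77261453)  # exponent from the log in this thread

ASSUMED_DISK_BYTES_PER_S = 100e6       # assumption: 100 MB/s sustained writes
fraction = write_fraction(size, ASSUMED_DISK_BYTES_PER_S, 100000, 5.5467)

print(f"at least {size / 1e6:.2f} MB per checkpoint")
print(f"roughly {fraction:.4%} of run time at CheckpointIterations=100000")
```

This ignores the GPU-to-host copy and any extra data in the file, so it is only a rough lower-bound estimate; a profiler would be needed for a real answer.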

I also redirect all screen output to a file. Opening big files is slow when I want to look at the current state (to check whether the residues and iteration times are normal, or when an exponent will complete), so I aim for several minutes between status lines. Without redirection, the more frequent the screen updates, the easier it is to miss an error message because it has scrolled up out of sight and possibly out of the buffer. Even status lines that each represent several minutes cover only a very small percentage of a wavefront doublecheck, which takes days on anything I currently have set up.

I'd also like it better if Mfaktc didn't output as many lines as it does per factoring bit level, by about a tenfold reduction or more. It's generating sometimes over 30 or even 70 screen lines a minute, and accumulating about 22MB of screen output per week. But hey, CuLu is configurable, so each user can do what he likes in that regard.
Old 2017-08-06, 15:35   #9
chris2be8
 
Sep 2009

2⁵×7×11 Posts

From experience several years ago I found that screen output doesn't use a noticeable (>10%) amount of CPU time unless you are outputting thousands of lines per second. So one line only takes a few microseconds of CPU time to output.

A more reasonable concern is scrollback: if you want to see what it was doing a while ago, having less output to search makes life easier.

Chris