mersenneforum.org  

Old 2020-01-16, 15:07   #1
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1011111001101₂ Posts
Best Practices

Ernst Mayer recently suggested in multiple threads that best practices for a given application should be reachable within 2 clicks of the first post of the corresponding hardware- or software-specific how-to thread.

But what are the best practices?
This thread is for discussion of that question.

There are practices that are generally applicable, and others that will be application specific at least in the details.
Old 2020-01-16, 15:08   #2
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

17CD₁₆ Posts
A first draft of suggested general best practices

In general, what would constitute best practices for GIMPS effort? My draft proposal:
  1. Use the most efficient software for the task and hardware (example: gpuowl not cllucas for AMD gpu primality testing)
  2. Select the most efficient hardware for the task (examples: use an RTX20xx for TF; use a good cpu or Radeon VII for PRP or P-1; use gpus with relatively greater single precision performance compared to double precision for TF, and those with relatively greater double precision performance for PRP, LL or P-1 factoring. Don't use cpus for TF, since gpus are so much more effective at it.)
  3. Use a very recent version of the chosen software.
  4. Use the most effective settings for the given software (examples: PRP rather than LL for first tests; for prime95/mprime, benchmark throughput versus number of cores per worker and versus fft length, analyze the results, and reconfigure when appropriate)
  5. Use judiciously chosen inputs, for reasonable run time and feasibility of completing the task accurately. A run that takes years is not only likely to expire before completion, it is unlikely to complete accurately unless it is protected by the Gerbicz error check (GEC).
  6. Always log the runs. Some applications have logging built in. Others will need tee or redirection.
  7. Tune the application for the specific software version and hardware involved and exponents being run.
  8. Run at least one double-check, and a memory test, to test the reliability of the hardware & software combination, before beginning production running.
  9. Regularly review the logs for errors, either manually or with an analysis tool.
  10. Repeat the double-check or self-test, and the memory test, at least annually. Hardware reliability changes over time.
  11. Retune if substantially changing the exponents being run, or when a new version of the software is deployed.
  12. Reserve assignments first. Don't poach the assignments of others.
  13. Select work types and assignments appropriate to the capabilities of the hardware and software, so that assignments complete in a reasonable amount of time. (In most cases that will be under two months.)
  14. Contribute at least about 1/5 of your primality testing effort as double-checking (DC).
  15. Prioritize advancing the GIMPS wavefronts of TF, P-1, first primality testing, and double-checking. This is the most effective at advancing the state of knowledge about Mersenne primes.
  16. When P-1 factoring, use the full PrimeNet bounds given for the exponent at mersenne.ca when possible. (That is the most effective strategy while exponents are being both primality tested and verified, as they currently are.)
  17. What else?
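
The "analysis tool" in item 9 could be as small as a script that scans a log for suspicious lines. A minimal Python sketch follows. The marker strings are assumptions modeled on the Gpuowl log excerpt quoted later in this thread (an "EE" line and an "Exiting because" line), and the name find_suspect_lines is hypothetical. Other applications word their errors differently, so the pattern list would need adjusting for each one.

```python
import re

# Assumed error markers, modeled on a Gpuowl log excerpt; adjust per application.
SUSPECT_PATTERNS = [
    re.compile(r"\bEE\b"),                    # Gpuowl-style error-flagged line (assumption)
    re.compile(r"Exiting because"),           # abnormal exit message
    re.compile(r"\berror\b", re.IGNORECASE),  # generic catch-all
]

def find_suspect_lines(lines):
    """Return (line_number, text) pairs for lines matching any suspect pattern."""
    hits = []
    for number, text in enumerate(lines, start=1):
        if any(pattern.search(text) for pattern in SUSPECT_PATTERNS):
            hits.append((number, text.rstrip("\n")))
    return hits

sample_log = [
    "2020-01-16 19:17:32 OpenCL compilation in 33528 ms",
    "2020-01-16 19:20:47 87398387 EE loaded: 87000000, blockSize 1000, "
    "c89b639632165de5 (expected d2d69bc89926f0a4)",
    '2020-01-16 19:20:47 Exiting because "error on load"',
    "2020-01-16 19:20:47 Bye",
]

for number, text in find_suspect_lines(sample_log):
    print(f"line {number}: {text}")
```

Since the function accepts any iterable of lines, an open file object can be passed in place of sample_log to scan a real log.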
Old 2020-01-16, 15:15   #3
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3²×677 Posts
Best apps to use

My opinion:

TF: mfaktc on NVIDIA, mfakto on AMD or Intel IGP (Mfactor or Factor5 only for exponents beyond the reach of gpu apps)

P-1: mprime/prime95 on cpu, Gpuowl V6.11 if it will run on the gpu, CUDAPm1 v0.20 on NVIDIA gpus within narrow limits if they can't run Gpuowl

primality testing: mprime, prime95, mlucas, gpuowl (CUDALucas v2.06 on NVIDIA gpus that can't run gpuowl, or when specifically running LL DC on NVIDIA gpu)

Last fiddled with by kriesel on 2020-01-16 at 16:13
Old 2020-01-16, 18:36   #4
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

3³·379 Posts

Why not make a table? I have attached an idea. Across the top is the hardware type, down the side is the test type. The software available for each combination is named and linked to the MF thread. Bolded ones are the recommended use for that hardware. Greyed or strikethroughs are uses advised against. Plain are ok uses, but not the best.

All of that can be done in a Code box.
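
A Code box preserves whitespace, so a table like this only needs space-padded columns. Here is a minimal Python sketch of generating one. The cell contents come from the app suggestions earlier in this thread, while the function name render_table and the "-" placeholder for empty cells are assumptions for illustration. Bolding, greying and links would still have to be added by hand with forum markup.

```python
def render_table(columns, rows, empty="-"):
    """Render rows as a space-padded plain-text table.

    columns: list of column headings (hardware types).
    rows: dict mapping a row label (work type) to a dict of column -> cell text.
    """
    header = [""] + list(columns)
    body = [[label] + [cells.get(col, empty) for col in columns]
            for label, cells in rows.items()]
    table = [header] + body
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    )

columns = ["x86 CPU", "NVIDIA GPU", "AMD GPU"]
rows = {
    "Primality": {"x86 CPU": "prime95/mprime", "NVIDIA GPU": "gpuowl", "AMD GPU": "gpuowl"},
    "TF": {"NVIDIA GPU": "mfaktc", "AMD GPU": "mfakto"},
}
print(render_table(columns, rows))
```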
Attached thumbnail: chart.jpg (71.9 KB)
Old 2020-01-16, 20:55   #5
ewmayer
2ω=0
 
Sep 2002
República de California

13·29·31 Posts

Quote:
Originally Posted by Uncwilly
Why not make a table? I have attached an idea. Across the top is the hardware type, down the side is the test type. The software available for that combination are named and linked to the MF thread. Bolded ones are the recommended use for that hardware. Greyed or strikethroughs are uses advised against. Plain are ok uses, but not the best.

All of that can be done in a Code box.
Code box is problematic due to lack of clickable links - as Ken noted, I want users to be able to view Post #1 in some master thread (likely the one mentioned below), quickly see where to get code/instructions for their target platform and preferred worktype, and 1-click to get there. I had the idea of taking the table at top of Ken's current Mersenne prime hunting software PDF and inlining it, along with relevant links, in Post #1 of the thread housing it. Problem - Mike has disabled HTML-style table markups in the forum due to security concerns. So I suggest a "linearized table" consisting of worktype-based main categories, each of which is followed by a set of brief which-program-to-use-for-platform-X entries, each with an embedded link. Example:

Basic summary of currently relevant GIMPS clients: Once you have used the tables below to figure out which client to use for your desired worktype and platform, click the relevant client link:

1. Primality testing: Note that this includes the recently introduced PRP (probable-prime) test type, which unlike the traditional rigorous-primality LL test permits a strong form of residue integrity checking, the so-called Gerbicz error check. When available in the relevant client(s), PRP is preferable on typical consumer hardware lacking ECC memory and on fast-but-fault-prone hardware such as GPUs. Any user discovering a likely prime via PRP testing which is confirmed by the standard subsequent LL-test verification runs will get the same discovery credit as an LL-test user would.

o x86 (Intel and AMD) CPUs: Prime95/mprime (current version: 29.8b6): George Woltman's famous Mersenne-prime-hunting program: Prime95 is the Windows client, mprime the Linux. Does primality testing (both the traditional LL test and the more-recently added PRP-test with Gerbicz error check), Trial Factoring (but use of GPU clients now recommended for that worktype), p-1 and ECM factoring.

(Users who wish to run on x86 in non-networked mode due to security concerns can use Mlucas, which needs no network connection, but is not as efficient on x86 as Prime95/mprime.)

o ARM-based and other non-x86 CPUs: Mlucas (current version: 19.0; dedicated subforum here): Ernst Mayer's program: Can be built under Windows via built-in Linux shell, but is *nix oriented. Supports both LL-testing and PRP-test with Gerbicz-check. Has optimized assembly code for 128-bit ARMv8 SIMD instructions and also for x86 SIMD (128,256 and 512-bit versions) but as noted Prime95/mprime are more efficient on the latter. Also supports a generic-C build mode for platforms lacking vector arithmetic support or ones with SIMD but not of the ARMv8/x86 variety.

o nVidia GPUs: [description-of/links-to CuLu and OpenCL-built GpuOwl]

o AMD GPUs: [description-of/links-to GpuOwl]

o [Other clients folks may be using]

2. Trial Factoring
...


3. p-1 Factoring
...

We can probably omit a separate category for ECM factoring, since AFAIK Prime95 is the only GIMPS client supporting it. Or include it, with description-of/links-to Prime95 and GMP-ECM.
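
For readers unfamiliar with the Gerbicz error check mentioned under primality testing above, here is a toy Python illustration of the idea. It is a sketch under stated assumptions, not how Prime95, Mlucas or Gpuowl implement it: the function name prp_with_gerbicz, the tiny exponent and block size, and the simulated error are all chosen for illustration, and real clients check periodically and roll back to a saved state on a mismatch. The PRP sequence is x_0 = 3, x_(i+1) = x_i^2 mod N with N = 2^p - 1. Every `block` iterations the current x is multiplied into a running product d, and the identity d_new = 3 * d_old^(2^block) mod N is expected to fail if any squaring went wrong.

```python
def prp_with_gerbicz(p, block, blocks, corrupt_at=None):
    """Run block*blocks PRP squarings mod 2**p - 1 and verify them Gerbicz-style.

    corrupt_at: iteration at which to inject a simulated hardware error (or None).
    Returns True if the final check passes, False if it detects a mismatch.
    """
    n = (1 << p) - 1
    x = 3            # x_0, the PRP base
    d_old = d = 3    # running product of x at block boundaries, d_0 = x_0
    for i in range(1, block * blocks + 1):
        x = x * x % n
        if i == corrupt_at:
            x = (x + 1) % n          # simulated error in the residue
        if i % block == 0:
            d_old, d = d, d * x % n  # fold x_i into the running product
    # Check: d must equal 3 * d_old^(2^block) mod n.
    expected = d_old
    for _ in range(block):
        expected = expected * expected % n
    return d == 3 * expected % n

print(prp_with_gerbicz(127, block=8, blocks=4))                 # clean run
print(prp_with_gerbicz(127, block=8, blocks=4, corrupt_at=13))  # corrupted run
```

The clean run passes the check and the corrupted run fails it. The check itself costs only `block` extra squarings, which is why it is cheap compared with running the whole test twice.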
Old 2020-01-16, 21:33   #6
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

3³×379 Posts

Quote:
Originally Posted by ewmayer
Code box is problematic due to lack of clickable links
Horse hockey
Code:
		Hardware 1	Hardware 2	Hardware 3
Primality	google		Here		None
TF		GPUOwl		None		None

Last fiddled with by Uncwilly on 2020-01-16 at 22:34 Reason: Tested a stricken URL and such
Old 2020-01-16, 21:46   #7
ewmayer
2ω=0
 
Sep 2002
República de California

13·29·31 Posts

Quote:
Originally Posted by Uncwilly
Horse hockey
Code:
		Hardware 1	Hardware 2	Hardware 3
Primality	google		Here		None
TF		GPUOwl		None		None
I stand corrected. :) I must've confused this forum with some other one I've posted to in the past. So we start with a 2-D table which fits in a single browser frame, and we can add the longer descriptions below that.

Technical question: What is the difference between Horse hockey and Bull pucky? The preferred fodder, quality of the meat, what?

Last fiddled with by ewmayer on 2020-01-16 at 21:48
Old 2020-01-16, 21:56   #8
Dylan14
 
"Dylan"
Mar 2017

2²×3×7² Posts

Quote:
Originally Posted by kriesel
My opinion:

TF: mfaktc on NVIDIA, mfakto on AMD or Intel IGP (Mfactor or Factor5 only for exponents beyond the reach of gpu apps)

P-1: mprime/prime95 on cpu, Gpuowl V6.11 if it will run on the gpu, CUDAPm1 v0.20 on NVIDIA gpus within narrow limits if they can't run Gpuowl

primality testing: mprime, prime95, mlucas, gpuowl (CUDALucas v2.06 on NVIDIA gpus that can't run gpuowl, or when specifically running LL DC on NVIDIA gpu)
Shouldn't the bolded bit be CUDAPm1 v0.22, as that is the newest version of that software?
Old 2020-01-16, 22:23   #9
ewmayer
2ω=0
 
Sep 2002
República de California

13·29·31 Posts

Quote:
Originally Posted by Dylan14
Shouldn't the bolded bit be CUDAPm1 v0.22, as that is the newest version of that software?
Whatever non-PDF-form table gets created will have to be continually updated ... we're discussing general-layout issues at present.
Old 2020-01-17, 00:55   #10
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3²×677 Posts

Quote:
Originally Posted by Dylan14
Shouldn't the bolded bit be CUDAPm1 v0.22, as that is the newest version of that software?
In my opinion, no, because it introduced more severe issues than it resolved. As with drivers, and occasionally operating systems or automobiles, newest is not always best.

Last fiddled with by kriesel on 2020-01-17 at 00:56
Old 2020-01-17, 05:32   #11
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1011111001101₂ Posts

Quote:
Originally Posted by Uncwilly
Why not make a table? I have attached an idea.
Intel IGP / primality testing or P-1 probably = "NA". Current gpuowl PRP or P-1 has AMD and NVIDIA code paths, no Intel code path. There's certainly no CUDALucas to do LL or CUDAPm1 to do P-1 on an OpenCL nonCUDA device. And CUDALucas won't do PRP.
Result of attempting a Gpuowl 6.5 run on an Intel UHD630 IGP:
Code:
2020-01-16 19:16:58 Note: no config.txt file found
2020-01-16 19:16:58 config: -device 0 -fft +0 -carry long -use ORIG_X2 
2020-01-16 19:16:58 87398387 FFT 5120K: Width 256x4, Height 64x4, Middle 10; 16.67 bits/word
2020-01-16 19:16:58 using long carry kernels
2020-01-16 19:16:58 OpenCL args "-DEXP=87398387u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DFRAC=12357831637820925542ul -DWEIGHT_STEP=0xa.0e81d99e13ac8p-3 -DIWEIGHT_STEP=0xc.ba55dbe3e5aep-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DINVWEIGHT_LIMIT=0xc.cccccccccccdp-29 -DORIG_X2=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-01-16 19:17:32 OpenCL compilation in 33528 ms
2020-01-16 19:17:36 87398387.owl loaded: k 87000000, block 1000, res64 d2d69bc89926f0a4
2020-01-16 19:20:47 87398387 EE loaded: 87000000, blockSize 1000, c89b639632165de5 (expected d2d69bc89926f0a4)
2020-01-16 19:20:47 Exiting because "error on load"
2020-01-16 19:20:47 Bye
TF boundary between mfaktx and Mfactor/Factor5 is not 1B (10⁹), it's 2³². The mersenne.org/mersenne.ca boundary is 10⁹. DC versus first-test primality is irrelevant to selecting software. Amazon, Google GCE, Colab, and Kaggle all typically involve Linux apps. I'm unpersuaded that each environment needs its own column to repeat the same app names, any more than the various flavors of Windows or distros of Linux do.
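
The two boundaries are easy to mix up, so here they are as a small Python sketch: 2^32 for the choice of TF software, 10^9 for the mersenne.org/mersenne.ca split. The function names are hypothetical, and how an exponent exactly on a boundary is treated is an assumption (neither 10^9 nor 2^32 is prime, so it does not matter for Mersenne exponents).

```python
GPU_TF_LIMIT = 2**32      # per the post: beyond this, use Mfactor or Factor5
SERVER_BOUNDARY = 10**9   # per the post: mersenne.org / mersenne.ca boundary

def tf_software(exponent, gpu_vendor):
    """Suggest a TF program for an exponent and GPU vendor ('nvidia', 'amd', 'intel')."""
    if exponent >= GPU_TF_LIMIT:
        return "Mfactor or Factor5"
    # mfaktc on NVIDIA, mfakto on AMD or Intel IGP, as suggested earlier in the thread.
    return "mfaktc" if gpu_vendor.lower() == "nvidia" else "mfakto"

def beyond_server_boundary(exponent):
    """True if the exponent lies above the 10**9 boundary."""
    return exponent > SERVER_BOUNDARY

print(tf_software(87398387, "NVIDIA"))
print(tf_software(5_000_000_021, "amd"))
print(beyond_server_boundary(87398387))
```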
Attached thumbnail: no NVIDIA CUDA on intel igps.png (118.2 KB)

Last fiddled with by kriesel on 2020-01-17 at 05:46
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
