2018-05-28, 20:08   #1
kriesel
CUDALucas-specific reference material

This thread is intended to hold only reference material specifically for CUDALucas.
(Suggestions are welcome. Discussion posts in this thread are not encouraged. Please use the reference material discussion thread. Off-topic posts may be moved or removed, to keep the reference threads clean, tidy, and useful.)

If you're already set up and running in CUDALucas, scroll to the bottom of the post for the thread table of contents. But please reconsider. PRP with GEC and proof is FAR more efficient on the same hardware. This thread will remain, for historical purposes, and the one rare remaining use case.

How to set up and run CUDALucas

First, consider your intended use case and the better alternatives. Gpuowl with PRP/GEC/proof is more than triple the speed of CUDALucas LL & LLDC on the same hardware! Gpuowl and PRP are recommended for new first-time primality tests, on GPUs that can run it. Gpuowl has superior error detection and handling, a much lower cost of verification due to its proof generation capability, and is also faster per iteration than CUDALucas v2.06 on the same hardware.

To perform LL DC on GPUs that can run Gpuowl, Gpuowl is recommended for that too, unless the first test was done with zero shift, since recent Gpuowl versions include the Jacobi check but lack nonzero shift capability. It's typically faster and more reliable to run a PRP & proof with Gpuowl than an LL DC. Attempting PRP first is a reliable way to assess a GPU's reliability.

Some older NVIDIA GPUs can't run Gpuowl but can run CUDALucas, which lacks the Jacobi check. Running LL DC on them is tolerated, for now, but do not run first-time tests as LL. GPUs old enough to be incapable of running Gpuowl are very likely too slow and unreliable to run first primality tests. There isn't really a justifiable use case for CUDALucas any more, other than as part of the new prime discovery confirmation runs, after a PRP is found by other software. DO NOT RUN LL AS FIRST PRIMALITY TESTS!

Download from, for Linux
or for Windows

Create a user directory. Unzip the CUDALucas software in it.
Get the appropriate CUDA level cufft and cudart library files for your gpu and OS from and place them in the same directory.

(If all else fails, CUDA dlls or .so files can be obtained by downloading and installing the applicable OS and CUDA version’s CUDA SDK, then locating the files and adding the few needed files to your path or to your working directory on other systems. Download the current version from
Older versions can be obtained from the download archive, at
These are typically gigabytes in the newer versions; 2.8GB @ CUDA v11.3.
Get the installation guide and other documentation pdfs also while you're there.

Note, not all OS version and CUDA version combinations are supported! See CUDA Toolkit compatibility vs CUDA level and good luck!)
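
Before the first launch it can help to confirm that the cufft and cudart library files really are in the working directory. The following is a minimal sketch of such a check. The file-name patterns are assumptions based on common CUDA naming (the exact names vary with OS and CUDA version), and the function names are made up for illustration.

```python
import glob
import os

# Assumed file-name patterns; exact names depend on the OS and CUDA version.
PATTERNS = {
    "cudart": ["cudart64_*.dll", "cudart32_*.dll", "libcudart.so*"],
    "cufft": ["cufft64_*.dll", "cufft32_*.dll", "libcufft.so*"],
}


def find_cuda_libs(directory="."):
    """Return {library name: sorted matching file names} for one directory."""
    found = {}
    for name, patterns in PATTERNS.items():
        matches = []
        for pattern in patterns:
            matches.extend(glob.glob(os.path.join(directory, pattern)))
        found[name] = sorted(os.path.basename(m) for m in matches)
    return found


def missing_cuda_libs(directory="."):
    """Return the library names for which no file was found."""
    return [name for name, files in find_cuda_libs(directory).items() if not files]


if __name__ == "__main__":
    # Check the current directory, i.e. the CUDALucas working directory.
    for name, files in find_cuda_libs().items():
        print(name, "->", ", ".join(files) if files else "NOT FOUND")
```

This only checks the working directory. Libraries found through the system path would not be seen by it, and a file being present does not mean it is the right CUDA level for the executable.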

Review the cudalucas.ini file. Keep an original version for reference.
Only make changes you're sure of.

Get the cl-startup script below, for Windows, or tdulcet's scripts for Linux from
Edit carefully to adapt to your gpu and environment.
Read and run them. Be patient. Depending on gpu model and other variables, the Windows startup script can take hours or days to complete. On an RTX2080 (which is better suited to TF), a single-pass memory test alone takes about 75 minutes. If it crashes with an out-of-memory error, reduce the number of 25MB blocks to just below what it logged as attempting, and try again. (A test on RTX2080 worked with 314 blocks.)
The cl-startup script includes rerunning a small known Mersenne prime, of your choice by editing the file. Do not proceed to new work until it completes that correctly.
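
As a rough arithmetic check on the memory test block counts mentioned above, the following sketch converts a count of 25MB blocks to total memory and steps the count down after an out-of-memory crash. The function names are made up for illustration, and the 8192 MB figure is an assumption about an 8 GB card, not something taken from the startup script or its log.

```python
BLOCK_MB = 25  # size of one memory-test block, per the text above


def blocks_to_mb(blocks):
    """Total memory covered by a given number of 25MB blocks."""
    return blocks * BLOCK_MB


def max_blocks(vram_mb):
    """Upper bound on whole blocks in vram_mb; ignores driver and display overhead."""
    return vram_mb // BLOCK_MB


def next_attempt(logged_blocks, step=1):
    """Block count to try after an out-of-memory crash at logged_blocks."""
    if logged_blocks <= step:
        raise ValueError("no smaller block count left to try")
    return logged_blocks - step


if __name__ == "__main__":
    print(blocks_to_mb(314))   # memory covered by the run that worked on the RTX2080
    print(max_blocks(8192))    # assumed 8 GB card, expressed in MB
    print(next_attempt(327))   # one block fewer than a failed attempt
```

The 314 blocks that worked cover 7850 MB, somewhat below the nominal card size, which is consistent with part of the memory being unavailable to the test.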

If the gpu shows memory errors, you might be able to clear them up by improving cooling or lowering the clock speed. Until it passes a comprehensive memory test, don't use it for primality testing. Retesting gpu memory annually and regularly performing double checks are recommended.
CUDALucas is very vulnerable to memory errors since it has neither the Gerbicz error check nor the Jacobi check. System ram errors or gpu vram errors can cause wrong primality test results.
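
To illustrate what rerunning a small known Mersenne prime verifies, and what a Jacobi check adds, here is a minimal pure-integer sketch of the Lucas-Lehmer test. It is not how CUDALucas or Gpuowl are implemented (they use FFT-based multiplication on the gpu), and the function names and the `corrupt_at` parameter are made up for illustration. The check relies on the fact that, starting from 4, the Jacobi symbol of (s - 2) with respect to 2^p - 1 stays at -1 after the first iteration in an error-free run, so a different value indicates an error.

```python
def jacobi(a, n):
    """Jacobi symbol (a | n) for odd n > 0."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def lucas_lehmer(p, corrupt_at=None):
    """LL test of M_p = 2**p - 1 for an odd prime p.

    Returns (is_prime, res64, jacobi_ok). corrupt_at is a hypothetical knob:
    the iteration number after which one bit is flipped, to mimic a memory error.
    """
    m = (1 << p) - 1
    s = 4
    for i in range(1, p - 1):          # p - 2 squarings in total
        s = (s * s - 2) % m
        if i == corrupt_at:
            s = (s ^ (1 << 3)) % m     # simulated single-bit error
    jacobi_ok = jacobi((s - 2) % m, m) == -1
    res64 = format(s & 0xFFFFFFFFFFFFFFFF, "016X")  # low 64 bits of the final residue
    return s == 0, res64, jacobi_ok


if __name__ == "__main__":
    print(lucas_lehmer(13))            # M13 = 8191 is a known Mersenne prime
    flagged = sum(not lucas_lehmer(31, corrupt_at=i)[2] for i in range(1, 30))
    print("Jacobi check flagged", flagged, "of 29 injected single-bit errors")
```

The Jacobi check is expected to catch only about half of such errors, which is one reason PRP with the Gerbicz error check is preferred. In this sketch the check is done once at the end; software that implements it typically checks at intervals so it can roll back to an earlier good save file.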

See also the draft readme update at

It is likely that future discoveries of Mersenne primes will be made with Gpuowl via PRP/GEC, and confirmed with Gpuowl on Radeon VII running LL with Jacobi check. Parallel confirmations are likely in CUDALucas, prime95, and Mlucas on the fastest available reliable hardware.

To obtain LL DC assignments, go to and check at the upper right you're logged in.
Specify number of assignments and workers. (Start small.)
At "Preferred work type:" select "Double-check LL tests".
Click "Get Assignments". The page will update with assignments. (Eventually; be patient. Do not click page refresh unless you want multiple batches and have already copied the previous batch.)
Copy and paste the assignments from the page into a worktodo.txt file in your CUDALucas working directory. Then launch the CUDALucas program to test the assignments. These can take a long time. Start with the smallest you can get, until you develop a sense of the time required. Generally, the time required on a given gpu is proportional to p^2.1, where p is the exponent, and is measured in days, weeks, or months. Longer than about a month per assignment is likely to be unreliable on even good equipment. ECC system ram may help reliability. An RTX2080 Super takes about 40 hours for a 53M LL DC, and is much more productive for the project when performing TF with mfaktc.
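
To develop that sense of time before committing to an assignment, the scaling rule above can be turned into a rough estimate. The sketch below scales from the one data point given in the text (about 40 hours at 53M on an RTX2080 Super) using the exponent raised to the power 2.1. The worktodo line format it parses, `DoubleCheck=<assignment id>,<exponent>,<TF bits>,<P-1 done>`, is an assumption, the assignment IDs in the example are invented, and a different gpu needs its own reference timing.

```python
REF_EXPONENT = 53_000_000   # "53M" reference exponent from the text
REF_HOURS = 40.0            # RTX2080 Super timing from the text
SCALING_POWER = 2.1         # run time grows roughly as p**2.1


def estimate_hours(p, ref_p=REF_EXPONENT, ref_hours=REF_HOURS, power=SCALING_POWER):
    """Estimated run time in hours for exponent p, scaled from one reference run."""
    return ref_hours * (p / ref_p) ** power


def exponents_from_worktodo(lines):
    """Extract exponents, smallest first, from DoubleCheck lines (assumed format)."""
    exponents = []
    for line in lines:
        line = line.strip()
        if not line.startswith("DoubleCheck="):
            continue
        fields = line.split("=", 1)[1].split(",")
        if len(fields) >= 2 and fields[1].isdigit():
            exponents.append(int(fields[1]))
    return sorted(exponents)


if __name__ == "__main__":
    # Invented example lines; real ones come from the assignment page.
    sample = [
        "DoubleCheck=0123456789ABCDEF0123456789ABCDEF,58000000,74,1",
        "DoubleCheck=FEDCBA9876543210FEDCBA9876543210,53000000,74,1",
    ]
    for p in exponents_from_worktodo(sample):
        print(p, round(estimate_hours(p), 1), "hours (rough estimate)")
```

Doubling the exponent multiplies the estimate by 2^2.1, a little over four, so run times grow quickly with exponent size.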

To report results, go to and check at the upper right you're logged in.
Copy and paste recent (previously unreported) results into the page and click submit.
The page will refresh. Note any error messages, and whether your double-check(s) matched. I usually append a marker in the results.txt file they came from to indicate what's preceding has been reported, then save it.
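
The marker habit described above can be scripted. This is a minimal sketch, assuming a plain-text results.txt and a marker string of my own choosing; neither the marker text nor the function names come from CUDALucas.

```python
import os

MARKER = "# ---- reported above this line ----"  # hypothetical marker text


def unreported_lines(path):
    """Return the non-empty lines that follow the last marker in the file."""
    with open(path, encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]
    last_marker = -1
    for i, line in enumerate(lines):
        if line.strip() == MARKER:
            last_marker = i
    return [line for line in lines[last_marker + 1:] if line.strip()]


def mark_reported(path):
    """Append the marker, adding a newline first if the file lacks a final one."""
    with open(path, encoding="utf-8") as f:
        content = f.read()
    with open(path, "a", encoding="utf-8") as f:
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(MARKER + "\n")


if __name__ == "__main__":
    results = "results.txt"
    if os.path.exists(results):
        for line in unreported_lines(results):
            print(line)  # paste these into the results page
        # Call mark_reported(results) only after the submission succeeded.
```

Appending the marker only after the page confirms the submission keeps a failed upload from being recorded as reported.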

If the double-check does not match the first test's residue64, it means at least one of them is wrong. A triple check to resolve which can be requested at
On rare occasions quad or higher checks are needed and can be requested there too. You can also help out there with the workload of triple checks. See post one of that thread and the gpu link there.

While the workload of managing the lengthy tests manually is small, includes an attachment describing client management software options, which might be useful if you'd like to try to add some automation to the assignment and result reporting process for your gpu(s).

For more information on CUDALucas specifications and limits, the development and discussion thread link, etc., see also the attachment at Available Mersenne Prime hunting software

Table of contents
  1. This post
  2. Run time scaling versus exponent for the NVIDIA GTX480 of CUDALucas v2.06
  3. CUDALucas bug and wish list
  4. Links to prior posts concerning pitfalls, setup, configuration, etc.
  5. Startup scripts
  6. Draft readme update
  7. What limits how big an exponent can be run
  8. What's the best CUDA level to run CUDALucas with my GPU?
  9. CUDALucas V2.06 -h help output
  10. Save file size versus exponent
  11. Etc tbd

Top of reference tree:
Attached Files
File Type: 7z cl-startup.7z (2.0 KB, 225 views)

Last fiddled with by kriesel on 2021-07-21 at 23:18 Reason: minor edits