#2795
"Ghetto_Child"
Jul 2014
Montreal, QC, Canada
41 Posts
I changed the GPU clock/power settings during a test and corrupted the results. I hadn't backed up the work files beforehand, so when I stopped the program some of the corruption was saved. I did save the screen output with the last 3-5 good residue results. How can I restart the test from close to the good residue output? The only save file I have is from right when the corruption started, so resuming always produces suspicious identical residues and eventually an illegal residue error.
#2796
P90 years forever!
Aug 2002
Yeehaw, FL
1110001111110₂ Posts
This might be a good time to look into gpuowl. It is virtually immune to hardware errors. It does PRP tests instead of LL tests, so it is not good for double-checking.
#2797
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
11353₈ Posts
Code:
# SaveAllCheckpoints is the same as the -s option. When active, CUDALucas will
# save each checkpoint separately in the folder specified in the "SaveFolder"
# option. This is a binary option; set to 1 to activate, 0 to de-activate.
SaveAllCheckpoints=1

# This option is the name of the folder where the separate checkpoint files are
# saved. This option is only checked if SaveAllCheckpoints is activated.
SaveFolder=savefiles

If you don't have savefiles, all you have are the cx and tx files for exponent x, plus anything you can find in system backups, the recycle bin, or manually made copies from before the change. The c file is the most recent checkpoint; the t file is the preceding one.

It sounds like you've already gone through the drill of copying the c and t files and then attempting to resume. Sometimes the t file is good and the c file just needs to be removed or renamed out of the way. Sometimes both are bad before a problem is noticed; that situation is what SaveAllCheckpoints is for. The downside of saving all checkpoints is that it fills a lot of disk space over time. (See the example below.)

CUDALucas is good, but it lacks even the Jacobi check, which has a 50% probability of detecting an error. CUDALucas will run on even old NVIDIA gpus, with CUDA compute capability as low as 2.0. GpuOwl PRP3 has the superior Gerbicz check, which gives almost 100% error detection along with rollback / resume from the last known good state. While GpuOwl was originally developed for AMD gpus, V6.5 and later will run on some NVIDIA gpus, but not the older ones (CUDA compute 2.x and 3.0 fail to run gpuowl in my experience). GpuOwl V6.5 keeps a checkpoint file at every 20M iterations; that was dropped at V6.8, so it now keeps only x.owl and x-prev.owl, analogous to CUDALucas's cx and tx.

https://www.mersenneforum.org/showpo...83&postcount=7
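For illustration, each file that SaveAllCheckpoints leaves behind is named s<exponent>.<iteration>.<res64 in hex>, with a .cls suffix on Windows, per the write_checkpoint() source quoted later in this thread. So a savefiles folder accumulates entries like these (exponent, iteration counts, and res64 values made up for illustration):

Code:
savefiles/s57885161.10000.0123456789abcdef.cls
savefiles/s57885161.20000.fedcba9876543210.cls

Last fiddled with by kriesel on 2019-10-05 at 13:33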
#2798
Einyen
Dec 2003
Denmark
3007₁₀ Posts
I got a Tesla P100 on Google Colab and compiled CUDALucas again. I can run -cufftbench without problems.

But if I run -threadbench with any range, or -r 0, or -r 1, or just start CUDALucas on an exponent, I get: *** buffer overflow detected ***

-threadbench runs all the way through the test and fails at the very end, without creating the *threads*.txt file. The binary is compiled for compute capability 6.0, which is what the P100 uses.

Last fiddled with by ATH on 2019-11-14 at 13:33
#2799
"William Garnett III"
Oct 2002
Bensalem, PA
2×43 Posts
The CUDA 10.1 and CUDA 9.2 Windows x64 versions of CUDALucas 2.06 (with their respective libraries), from your official SourceForge link listed at the bottom below, run slower on my GPU than the CUDA 8.0 version with its respective libraries.

For the exponent 57593359 that I am manually testing for mersenne.org, the CUDA 8.0 version takes about 10.7 ms per iteration, while the CUDA 9.2 and CUDA 10.1 versions take about 12.3 ms per iteration (all use the 3136 FFT length and the same CUDALucas.ini file). The previous version I used from your website until yesterday, CUDALucas 2.05.1 CUDA 8.0 (with its respective libraries), has the same 10.7 ms per-iteration time as the 2.06 CUDA 8.0 version, so something is up with the CUDA 9.2 and CUDA 10.1 builds of CUDALucas 2.06 on my setup. My GPU setup is listed below. Can someone tell me why there is a significant slowdown with these newer versions?

EVGA GeForce GTX 1050 SC GAMING (2GB GDDR5)
Part number: 02G-P4-6152-KR
Dell Desktop Tower with Windows 10
Intel i3-4150 @ 3.5GHz
Memory: 8.00 GB

http://sourceforge.net/projects/cudalucas/files/
#2800
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
11353₈ Posts
#2801
Mar 2020
9₁₆ Posts
I am compiling CUDALucas 2.06 from SourceForge (https://sourceforge.net/projects/cudalucas/files/). After running the makefile, I used $ ./CUDALucas -r 1 to test whether it is reliable.

Unfortunately, I got all residues as [0000000000000]. My OS is Ubuntu 18.04, and I had already changed the CUDA path in the makefile and set --generate-code arch=compute_60,code=sm_60. The message shown at the top of the ./CUDALucas output reads: binary compiled for CUDA 10.20, CUDA runtime version 10.20, CUDA driver version 10.20. The GPU is a Tesla V100-PCIE, and the driver version is 440.33.01. I have read all the posts related to my question, but none of them solved my problem. What's more, I set worktodo.txt to "Test=79437629" and got "Illegal residue: 0x0000000000000000. See mersenneforum.org for help.".

Thanks in advance for any replies, and sorry if I posted in the wrong place.

P.S. I changed the CUDA version to 9.1 and used the Linux pre-compiled CUDALucas from https://download.mersenne.ca/CUDALucas/old. That output looks good: no 0 residue appears in the middle of the loop. But I am still trying to figure out my compilation problem. Open to any advice!
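A hedged guess at my own compilation problem, for anyone who finds this later: a Tesla V100 is compute capability 7.0, and a binary built only with arch=compute_60,code=sm_60 contains sm_60 machine code and no PTX for the driver to JIT, so its kernels may simply fail to launch on Volta. Something like the following makefile change would target the V100 directly (the NVCCFLAGS variable name is my assumption; match whatever the stock makefile actually uses):

Code:
# build native Volta (sm_70) cubins, and also embed PTX so newer GPUs can JIT
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70 \
             --generate-code arch=compute_70,code=compute_70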
#2802
Mar 2020
32 Posts
Finally, I compiled an executable on a Tesla M6 with CUDA 10.20. More details in this thread: https://www.mersenneforum.org/showth...418#post540418
#2803
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1001011101011₂ Posts
Given a CUDALucas save file, or several, for a very large exponent, each at a round number of many millions of iterations, that presumably does not have the res64 encoded into the file name, and no corresponding log file, is there a straightforward way of obtaining the res64s from the save files at those round iteration counts? (Asking for a friend who's trying to do me a favor, but probably does not want to run millions of iterations again to get to the next round numbers.) Opening some much smaller exponents' interim save files in a text editor, there's nothing human-readable there; no ASCII header record or footer.

Looking at old logs, I see CUDALucas does not output the stored res64 of a save file when resumed. (Gpuowl does that, which is a nice feature.) Looking at old source code, I see in CUDALucas.cu:

Code:
void write_checkpoint(unsigned *x_packed, int q, unsigned long long residue)
{
  FILE *fPtr;
  char chkpnt_cfn[32];
  char chkpnt_tfn[32];
  int end = (q + 31) / 32;

  sprintf (chkpnt_cfn, "c%d", q);
  sprintf (chkpnt_tfn, "t%d", q);
  (void) unlink (chkpnt_tfn);
  (void) rename (chkpnt_cfn, chkpnt_tfn);
  fPtr = fopen (chkpnt_cfn, "wb");
  if (!fPtr)
  {
    fprintf (stderr, "Couldn't write checkpoint.\n");
    return;
  }
  x_packed[end + 8] = magic_number(x_packed, q);
  x_packed[end + 9] = checkpoint_checksum((char*) x_packed, 4 * (end + 9));
  fwrite (x_packed, 1, sizeof (unsigned) * (end + 10), fPtr);
  fclose (fPtr);
  if (g_sf > 0) // save all checkpoint files
  {
    char chkpnt_sfn[64];
    char test[64];
#ifndef _MSC_VER
    sprintf (chkpnt_sfn, "%s/s" "%d.%d.%016llx", g_folder, q, x_packed[end + 2] - 1, residue);
    sprintf (test, "%s/%s", g_folder, ".empty.txt");
#else
    sprintf (chkpnt_sfn, "%s\\s" "%d.%d.%016llx.cls", g_folder, q, x_packed[end + 2] - 1, residue);
    sprintf (test, "%s\\%s", g_folder, ".empty.txt");
#endif
    fPtr = NULL;
    fPtr = fopen (test, "r");
    if (!fPtr)
    {
#ifndef _MSC_VER
      mode_t mode = S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH;
      if (mkdir (g_folder, mode) != 0)
        fprintf (stderr, "mkdir: cannot create directory `%s': File exists\n", g_folder);
#else
      if (_mkdir (g_folder) != 0)
        fprintf (stderr, "mkdir: cannot create directory `%s': File exists\n", g_folder);
#endif
      fPtr = fopen (test, "w");
      if (fPtr) fclose (fPtr);
    }
    else fclose (fPtr);
    fPtr = fopen (chkpnt_sfn, "wb");
    if (!fPtr) return;
    fwrite (x_packed, 1, sizeof (unsigned) * (((q + 31) / 32) + 10), fPtr);
    fclose (fPtr);
  }
}

edit: oops, no, I think that's the stream print that produces the s<exponent>.<iteration>.<res64 expressed in hex>.cls filename for storage in the savefiles subdirectory. Need to dig further for the regular checkpoint files, their contents, and the makeup of x_packed. See the sketch below.
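edit 2: meanwhile, the trailing header words can be read without decoding the packed residue. Here is a minimal sketch, assuming only the layout implied by write_checkpoint() above: a file holds end = (q+31)/32 words of packed residue followed by 10 trailing header words, and x_packed[end + 2] - 1 is the iteration count that write_checkpoint() puts into the savefile name. The res64 itself presumably can't be read out this simply, since the stored residue is shifted.

Code:
/* Hypothetical reader for a CUDALucas c<exponent>/t<exponent> checkpoint,
 * based solely on the layout implied by write_checkpoint() above. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <checkpoint-file> <exponent>\n", argv[0]);
        return 1;
    }
    int q = atoi(argv[2]);              /* the exponent under test */
    int end = (q + 31) / 32;            /* residue words, as in write_checkpoint */
    size_t words = (size_t)end + 10;    /* total unsigned words in the file */

    unsigned *x_packed = malloc(words * sizeof(unsigned));
    if (!x_packed) return 1;

    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror(argv[1]); return 1; }
    if (fread(x_packed, sizeof(unsigned), words, f) != words) {
        fprintf(stderr, "short read: not a complete checkpoint for %d?\n", q);
        return 1;
    }
    fclose(f);

    /* x_packed[end + 2] - 1 is what write_checkpoint puts in the savefile name */
    printf("exponent %d, iteration %u\n", q, x_packed[end + 2] - 1);
    free(x_packed);
    return 0;
}

Last fiddled with by kriesel on 2020-05-26 at 13:43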
#2804
Romulan Interpreter
Jun 2011
Thailand
2⁴×571 Posts

Yes. Rename it cXXX (blah blah), put it in the culu folder, and run culu on it for 20 seconds with the checkpoint interval set to 2k or so. It will produce a properly named checkpoint file in seconds. Why do you always take the most complicated path to crack it?

Digging out the res64 with a hex editor is shift-dependent: two files with different shifts have different content. But there was a tool to extract the res64 from a file; I used it in the past.
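Spelling the trick out (the ini key names here are assumed to match the stock CUDALucas.ini; verify CheckpointIterations in your own copy): rename the orphan save file to c<exponent>, put it next to the binary, and run briefly with something like:

Code:
# sketch of an ini for the rename-and-resume trick; key names assumed
# from the stock CUDALucas.ini -- verify in your own copy
SaveAllCheckpoints=1
SaveFolder=savefiles
CheckpointIterations=2000

After a couple of checkpoints, savefiles should hold an s<exponent>.<iteration>.<res64>.cls file whose name carries the res64, per the write_checkpoint() code quoted above.

Last fiddled with by LaurV on 2020-05-28 at 17:12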
#2805
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
29·167 Posts
Running them onward from their current state, a multiple of 10M iterations, would be nontrivial even if I had them, and I don't. What is the tool you used, and where can I, and the holder of the checkpoint files in question, find it?