That's not quite what I was suggesting. That ability would be more appropriate for a misfit type program.
My question comes down to this: according to GPUto72, would I still own the assignment after turning in a trial factoring result, or would continuing with a P-1 test risk stepping on someone's toes? I'll give you advance notice if I ever plan to be in Barbados. But rather than the Gap, maybe you could suggest a nice local place. My wife and I don't go much for nightlife and fancy dining.
[QUOTE=owftheevil;354191]My question comes down to this: according to GPUto72, would I still own the assignment after turning in a trial factoring result, or would continuing with a P-1 test risk stepping on someone's toes?[/QUOTE]
Sorry for the latency on this -- missed the reply. To answer your question, what I had envisioned was something along the lines of: if a candidate was assigned for P-1, "Spidy" would also watch for any additional TF work done on it. If so, it would check to see who had done the additional TFing, and credit that account while continuing to watch for the P-1 completion. So, to be explicit: yes, the person assigned the candidate for P-1 would "keep it" until the P-1 work was completed, even if additional TFing was done.

[QUOTE=owftheevil;354191]I'll give you advance notice if I ever plan to be in Barbados. But rather than the gap, maybe you could suggest a nice local place. My wife and I don't go much for nightlife and fancy dining.[/QUOTE]

Absolutely. There are some very nice boutique hotels and B&Bs around. And I agree with you -- "The Gap" is a bit "party central" and I don't recommend anyone stay in any of the hotels there because of the (late night) noise. But there are some very nice (not always the same as "fancy") restaurants there.
[quote]No GeForce GTX 670 threads.txt file found. Using default thread sizes.
For optimal thread selection, please run ./CUDALucas -cufftbench 17496 17496 r for some small r, 0 < r < 6 e.g. Using threads: norm1 256, mult 256, norm2 128.[/quote]Should the text there be changed from "CUDALucas" to "CUDAPm1"?
How about a global source code search-and-replace from "CUDALucas" to "CUDAPm1"?
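Such a global rename can be scripted in a few lines. A minimal Python sketch (the extension list and the in-place rewrite policy are assumptions for illustration, not how the actual fix was done):

```python
# Recursively rewrite occurrences of an old name in source files under a tree.
# The extension list is an assumption; adjust it to the actual source layout.
from pathlib import Path

def replace_in_tree(root, old, new, exts=(".c", ".cu", ".h", ".ini", ".txt")):
    changed = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in exts:
            text = path.read_text(errors="ignore")
            if old in text:
                path.write_text(text.replace(old, new))
                changed.append(path.name)
    return sorted(changed)  # names of files that were modified
```

A dry-run variant (collecting names without writing) is an easy extension if you want to review the hits first.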
It's in the ini file too.
Thanks. Done, except for the ini file. It needs a complete rewrite anyway.
I have a few questions:

1. Has stage 2 saving been implemented?
2. Does this program output Prime95-style timestamps?
3. Considering that GPUs are much faster than CPUs, would it be reasonable to use larger B1 and B2 values?

Sorry if any of them have already been covered.
[QUOTE=ixfd64;358173]I have a few questions:
1. Has stage 2 saving been implemented? 2. Does this program output Prime95-style timestamps? 3. Considering that GPUs are much faster than CPUs, would it be reasonable to use larger B1 and B2 values? Sorry if any of them have already been covered.[/QUOTE]

1. Yes, and resuming works very nicely (REALLY nicely!) for both stage 1 and stage 2, but [U]be careful[/U] and DO NOT DELETE the stage [B][U]1[/U][/B] save files when resuming. I don't know if it is a bug or if it was intended to work like that, but if you want to resume stage 2, even if you have the stage 2 checkpoints (so theoretically you don't need the stage 1 checkpoints anymore), and the stage 1 checkpoints are not found, the program (which is a little bit stupid :razz:) will do stage 1 from scratch. I found this by mistake, because normally I keep all the stage 1 files, with big hopes that sometime in the future we will be able to EXTEND the B1 limit, which in my opinion is more important than resuming stage 2. As it is now, you can't extend B1 without doing all the stage 1 work from scratch. (edit: which is the case for Prime95 too, and it is a pity, because a lot of contributors wasted time to redo the P-1 stage 1 when they wanted to extend the limits. The main problem is that keeping P-1 huge checkpoint files on the server will take too much space and it will generate too much traffic). But well... regardless of my dreams, resuming both stage 1 and stage 2 works very nicely in the current implementation.

2. No. But you don't need them anyhow; the manual report pages can't parse them, and you would have to edit your result files by hand, which is not recommended...

3. You can still specify your own B1 and B2 on the command line; just create a small batch file. I do low exponents (under 1M) to high limits this way.
The developers promised (see a few posts above) a fully rewritten ini file :razz:; we await the day when B1, B2, b (the base; sometimes specifying a base different from 3 might help, see my PARI implementation), e, d, etc. can all be specified in the ini file. Dreaming on...

BTW: same batch files with command lines for testing, same hardware, same limits, same everything, yet the new version shows "e=2" in the result files where the old version used to show "e=6". There seems to be no difference in behavior. Stupid question: why?

Also (@owftheevil): the residues which are attached to the file names (you know my old fixed idea... can't get it out of my head... :razz:) are wrong. Compare them with the displayed values: they are one "-c step" behind, showing not the residue reached after the calculation, but the residue from which the calculation started. I found this the hard way. I was running one test with "-c 1000" because I didn't want to wait between outputs (it was just for testing), and then I realized that disk space was being consumed too fast (due to the many checkpoints), so I switched to "-c 10000" for the second test (the comparison, witness, DC, whatever you want to call it). Of course, in the meantime I deleted all the checkpoints from the first run that were not multiples of 10000... Big mistake! At the end, the remaining files had the residues from iterations 9000, 19000, 29000, and so on, and the names did not match the second run, whose files had the residues from 10k, 20k, etc., but attached to the wrong files (i.e. the 20k-iteration checkpoint had the residue of iteration 10k, the 30k checkpoint had the residue of 20k, and so on; versus the first run, where among the files remaining after deletion, the 10k checkpoint had the residue of iteration 9k, the 20k checkpoint had the residue of 19k, and so on). Interestingly enough, the content of the files (compared ignoring the file names) was identical, and correct.
And to finish on a positive note: I really REALLY love the FFT and especially the thread tuning mechanism. BRILLIANT! You have to try it and see, to believe it. On a GTX 580, you can get about 10% or more performance from thread tuning alone, without touching the clocks or FFT lengths.
Thanks for the detailed reply. I hope these shortcomings will soon be eliminated.
Timestamps aren't [I]that[/I] essential, but they can be quite useful for investigating interruptions. For example, if I let the program run unsupervised for a few days and come back to see that it had crashed, the timestamp could quickly help me determine when the crash occurred. In any case, implementing timestamps probably wouldn't require more than a few lines of code. :smile:
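For what it's worth, a timestamp prefix really is only a few lines. A Python sketch of a bracketed stamp roughly in Prime95's style (the exact format string is my guess, not taken from Prime95's source):

```python
# Prepend a bracketed timestamp to an output line, roughly in the style of
# Prime95's "[Jun 19 21:34]" prefixes. The format string is a guess.
from datetime import datetime

def stamp(msg, now=None):
    now = now or datetime.now()  # injectable for testing
    return "[{}] {}".format(now.strftime("%b %d %H:%M"), msg)
```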
I use the date/time of the checkpoint files for that. Also (see a few posts above), having a batch file like

[CODE]:label
cudapm1
goto label[/CODE]

is very helpful, as the program may often crash due to a bug in the memory allocation of the CUDA 5.5 drivers [edit: on my system this seems to happen only for the card which drives the monitors, and not for the other cards, even though they do PhysX too; SLI is disabled and they are not connected to monitors]. Having the right ini file and starting from a cycling batch file like the one above helps a lot in this situation; the program will restart and resume properly (already tried).
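The same restart-forever idea can be written portably. A Python sketch (the command is whatever your CUDAPm1 binary is called on your system; the `max_restarts` parameter exists only so the loop can terminate, e.g. in tests):

```python
# Relaunch a command every time it exits, like the cycling batch file above.
# Pass max_restarts=None to loop forever.
import subprocess
import time

def run_with_restarts(cmd, max_restarts=None, pause=5):
    runs = 0
    while max_restarts is None or runs < max_restarts:
        subprocess.call(cmd)   # blocks until the program exits or crashes
        runs += 1
        time.sleep(pause)      # brief pause so a hard failure can't spin the CPU
    return runs
```

Usage would be something like `run_with_restarts(["./CUDAPm1"])`, relying on the ini file so the program resumes from its checkpoints on each restart.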
@owftheevil, re. extending B1: if it helps, I have a PARI implementation of the algorithm, with all the calculation of the power differences for the small primes, etc.
[QUOTE=LaurV;358176](edit: which is the case for Prime95 too, and it is a pity, because a lot of contributors wasted time to redo the P-1 stage 1 when they wanted to extend the limits. The main problem is that keeping P-1 huge checkpoint files on the server will take too much space and it will generate too much traffic). [/QUOTE]
This is not true -- see [URL="http://www.mersenneforum.org/showthread.php?p=40816#post40816"]here[/URL]. Also, from the P95 undoc.txt file:

[CODE]By default P-1 work does not delete the save files when the work unit completes. This lets you run P-1 to a higher bound at a later date. You can force the program to delete save files by adding this line to prime.txt: KeepPminus1SaveFiles=0[/CODE]

I have experimented with many settings. To extend B1 while stage 1 is in progress, just increase B1. If you know you'll want to increase B1 later, run with B1=B2, save the file(s) once complete, and note the current B1 value. When you want to increase B1, just set B1=B2 > current B1 and P95 will continue. If B1 is very high, it may appear as if it's not working, but just be patient. (To see the progress, stop the worker and restart it; you'll see it gets back to where it left off quite quickly.) This allows you to reach a certain B1, run a B2 pass, then raise B1 and do another B2 run. You can also continue to raise B1 on another system (or from another folder) and then run B2 again.
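For concreteness, that workflow might look like the following (the exponent and bounds are made up for illustration, and the `Pminus1=k,b,n,c,B1,B2` worktodo syntax should be checked against your Prime95 version's whatsnew/undoc notes before use):

```
prime.txt -- keep P-1 save files so B1 can be raised later:
KeepPminus1SaveFiles=1

worktodo.txt, first pass with B1 = B2 (effectively stage 1 only):
Pminus1=1,2,1000003,-1,1000000,1000000

worktodo.txt, later: raise B1 past the old value, then run stage 2:
Pminus1=1,2,1000003,-1,3000000,60000000
```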