mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

LaurV 2016-02-04 02:18

[QUOTE=Prime95;425093]Please add code to exit with an error message if, when you write a save file (or at any other convenient time), you find that the LL residue is zero or two.[/QUOTE]

[QUOTE=msft;425123]Thank you very much.
I need a few days.[/QUOTE]

[QUOTE=firejuggler;425135]A 0x0000000000000000 residue is valid if it is the[U][B] last[/B][/U] iteration[/QUOTE]

The perfect place to check for errors is still the point where the checkpoints are saved, as George said. For the record, the last iteration's residue is never checkpointed (unless you pass a factor of p-1 as the parameter for -c on the command line). Checkpoints are written every 100k or a million iterations, etc. So, no worry.
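A minimal sketch of why a residue of zero or two before the last iteration means the run is broken: in the LL recurrence s -> s*s - 2, the value 2 maps to itself and 0 maps to -2 and then to 2, so the test can never leave that state. The names `ll_step` and `residue_is_illegal` are hypothetical, and the sketch uses a small modulus so the full value fits in 64 bits; it is not the CUDALucas code, which only sees the low 64 bits of a much larger number.

```c
#include <stdbool.h>
#include <stdint.h>

/* One Lucas-Lehmer step s -> s*s - 2 (mod m).
   Assumes m < 2^32 and s < m, so s*s fits in 64 bits. */
static uint64_t ll_step(uint64_t s, uint64_t m)
{
    return ((s * s) % m + m - 2) % m;
}

/* Hypothetical checkpoint-time check: a residue of 0 or 2 is only
   acceptable once the final iteration has been reached. */
static bool residue_is_illegal(uint64_t residue, unsigned iter, unsigned last_iter)
{
    if (iter >= last_iter)
        return false;               /* zero here may be a genuine prime */
    return residue == 0 || residue == 2;
}
```

With m = 2^31 - 1, `ll_step(0, m)` gives m - 2, `ll_step(m - 2, m)` gives 2, and `ll_step(2, m)` stays at 2 forever.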

However, there are cases where saving checkpoints is disabled (in the ini file, for example). No idea how to handle it. I assume nobody uses "no checkpoints" if he is sane, so it can be ignored.

Also, the residue (the low 16 hex digits) is already calculated at that point, because it is used in the name of the checkpoint file. And [B][U]please do not change that![/U][/B] I mean [B][U]DO NOT CHANGE THAT![/U][/B] We need the file names to keep the same structure, because we run tests in parallel on different cards and compare the output files [U]by name[/U]. Their content is different because of the different shifts, which is also how we check that the shifts differ, since the program never outputs the shift until the test is finished and the shift is written to the result file. We don't want to run an LL+DC test for weeks, only to find at the end that both runs used the same shift. So, "if I didn't say it, I will say it again"[sup](TM)[/sup]: please, please, do not change this procedure.
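To illustrate why two runs with different shifts hold different data yet report the same residue, here is a toy sketch for the small exponent 31, where everything fits in 64 bits. The helper names `rotl_mod` and `ll_shifted` are made up for this example, and it uses plain integer arithmetic rather than the FFT code in CUDALucas.

```c
#include <stdint.h>

#define P 31u                                /* toy exponent */
static const uint64_t M = (1ull << P) - 1;   /* 2^31 - 1 */

/* Multiply x (0 <= x < M) by 2^k modulo 2^P - 1, which is a left
   rotation of the P-bit value by k positions. */
static uint64_t rotl_mod(uint64_t x, unsigned k)
{
    k %= P;
    return ((x << k) | (x >> (P - k))) & M;
}

/* Run `iters` LL iterations on a value stored with an initial shift,
   then undo the shift and return the true residue. */
static uint64_t ll_shifted(unsigned iters, unsigned shift)
{
    uint64_t x = rotl_mod(4, shift);         /* s0 = 4, stored shifted */
    for (unsigned i = 0; i < iters; i++) {
        x = (x * x) % M;                     /* squaring doubles the shift */
        shift = (2 * shift) % P;
        x = (x + M - rotl_mod(2, shift)) % M;  /* subtract the shifted 2 */
    }
    return rotl_mod(x, P - shift);           /* unshift */
}
```

For any iteration count, `ll_shifted(k, 0)` and `ll_shifted(k, 7)` return the same value even though the stored `x` differs along the way, and after 29 iterations both return 0 because 2^31 - 1 is prime.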

[If this comment seems like a rant, it is not. I made the request because I know that this was not implemented by you (msft). You initially did it differently, and I had to argue with the other guys in the project to get it like this, and I am afraid that you may (unintentionally or intentionally) revert to the original code.]

msft 2016-02-04 17:43

[QUOTE=LaurV;425154]
[If this comment seems like a rant, it is not. I made the request because I know that this was not implemented by you (msft). You initially did it differently, and I had to argue with the other guys in the project to get it like this, and I am afraid that you may (unintentionally or intentionally) revert to the original code.][/QUOTE]
If Dubslow (or flashjh, owftheevil) would fix this issue, I would be happy. :smile:

Madpoo 2016-02-04 21:53

[QUOTE=LaurV;425151]Poacher! :razz:
(it was legally assigned 2 days ago!)[/QUOTE]

Caught me. :smile:

Although technically mine was a triple-check. The faulty "is prime" result doesn't get added to the database when checked in via the manual result page. It triggers an email for verification before adding any records.

Automatic check-ins from a client have a similar safety valve to prevent weird results from showing up without a chance to eyeball them.

LaurV 2016-02-05 03:37

[QUOTE=msft;425228]If Dubslow (or flashjh, owftheevil) would fix this issue, I would be happy. :smile:[/QUOTE]
Ok with me. Also, because you are reading this topic now, and I know you don't read all the threads: clLucas does not seem to output the shift at all. Not on the screen, not in the result files. Is shifting implemented there? (Otherwise it is only useful as a double-checker for exponents done by P95. Edit: by the way, Madpoo, can you run a query and see whether any exponent has both its LL and DC done by clLucas? If so, does it need a TC with another program/shift?)

Madpoo 2016-02-05 04:09

[QUOTE=LaurV;425273]Ok with me. Also, because you are reading this topic now, and I know you don't read all the threads: clLucas does not seem to output the shift at all. Not on the screen, not in the result files. Is shifting implemented there? (Otherwise it is only useful as a double-checker for exponents done by P95. Edit: by the way, Madpoo, can you run a query and see whether any exponent has both its LL and DC done by clLucas? If so, does it need a TC with another program/shift?)[/QUOTE]

I could look, but I know that Prime95 only considers an exponent 'verified' when it has been checked twice with matching residues *and* different shift counts.

The shift count may be zero for one of the tests, as with the GPU apps that don't implement shifting, but then the other test would need to come from Prime95.
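As a rough sketch of that rule (the struct and function names are hypothetical, and this is not the actual server logic), two results verify each other only when the residues match and the shift counts differ, so two zero-shift results never count:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of one completed LL test. */
typedef struct {
    uint64_t residue;   /* low 64 bits of the final residue */
    unsigned shift;     /* initial shift count; 0 if the program has no shifting */
} ll_result;

/* Matching residues AND different shift counts are both required. */
static bool is_verified(ll_result a, ll_result b)
{
    return a.residue == b.residue && a.shift != b.shift;
}
```

Under this rule, a zero-shift GPU result paired with a nonzero-shift Prime95 result passes, while two zero-shift results with the same residue do not.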

I know I looked a while back at old stuff to find any exponents where both checks had a zero shift count (legacy stuff), and it seems like someone had already taken care of those. Perhaps George made them available for new DCs when the shift counts were implemented way back when.

LaurV 2016-02-05 05:16

Ok, we are good. I just reported a series of clLucas results (and have another 5 under DC) and can't "cheat". At least, not by mistake, or unintentionally. Double CL reports from the same user will be treated as "error xx, already submitted". My concern was what happens when two clLucas for the same exponent come from different users. It seems we don't have any.

msft 2016-02-05 06:54

[QUOTE=LaurV;425273]Ok with me. Also, because you are reading this topic now, and I know you don't read all the threads: clLucas does not seem to output the shift at all. Not on the screen, not in the result files. Is shifting implemented there? (Otherwise it is only useful as a double-checker for exponents done by P95. Edit: by the way, Madpoo, can you run a query and see whether any exponent has both its LL and DC done by clLucas? If so, does it need a TC with another program/shift?)[/QUOTE]
Understood, I will fix the clLucas issue.

msft 2016-02-05 07:33

The clLucas code comes from cudalucas-code-37-trunk ([URL="http://mersenneforum.org/showpost.php?p=417783&postcount=348"]http://mersenneforum.org/showpost.php?p=417783&postcount=348[/URL]).
r37 does not support shifts.

TObject 2016-02-08 23:16

Finally, I got my grabby paws on a Titan. I have it coming in a few days; will be running on a Windows 10 64-bit machine.

Searching the forums for the recommended CUDALucas settings I gathered the following:

* CUDA version: 6.5
* 32-bit executable
* Threads: 256
* Splices: 256
* Memory clock: 2500 MHz

Some of these recommendations are fairly old; would you recommend something else?
Thanks.

airsquirrels 2016-02-09 02:04

[QUOTE=TObject;425688]Finally, I got my grabby paws on a Titan. I have it coming in a few days; will be running on a Windows 10 64-bit machine.

Searching the forums for the recommended CUDALucas settings I gathered the following:

* CUDA version: 6.5
* 32-bit executable
* Threads: 256
* Splices: 256
* Memory clock: 2500 MHz

Some of these recommendations are fairly old; would you recommend something else?
Thanks.[/QUOTE]

I'm not sure if my results are representative, given that many of my cards are liquid cooled. However, even with my air-cooled Titans the following has yielded the best performance and so far flawless DCs (I did about 15 on each card before switching to strategic DCs expected not to match):

Threads + Splices: 128, 128
CUDA 7.5
The latest driver can handle the stock clocks just fine at various air- and liquid-cooled temperatures.

You can use -threadbench to find the best threads/splices values for the FFT size you will be working with.

msft 2016-02-09 16:47

1 Attachment(s)
Hi,
I added residue check code.
Please review.

Example:
[code]
Using threads: square 256, splice 128.
Starting M216091 fft length = 14K
| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
| Feb 10 01:39:42 | M216091 10000 0x0000000000000002 | 14K 0.00011 0.3752 3.75s | 1:17 4.62% |
Illegal residue: 0x0000000000000002. See mersenneforum.org for help.
[/code]

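[code]
/* Recompute the low 64 bits of the residue from the integer digit array.
   x_int:  balanced digits of the current LL value
   q:      exponent, n: FFT length, offset: current shift in bits
   size(digit) is defined elsewhere in the program; as used here it is 1
   when the digit holds q / n + 1 bits and 0 when it holds q / n bits. */
unsigned long long check_illegal_residue(int *x_int, int q, int n, int offset)
{
    int j, k = 0;
    int digit, bit;
    unsigned long long residue = 0;
    int qn = q / n, carry = 0;
    int lo = 1 << qn;
    int hi = lo << 1;
    int tx, temp;

    /* Locate the digit and the bit inside it where the shifted value starts. */
    digit = floor(offset * (n / (double) q));
    bit = offset - ceil(digit * (q / (double) n));

    /* Walk down from the digit just below the start to find the first
       nonzero digit; its sign decides the incoming carry. */
    j = (n + digit - 1) % n;
    while (x_int[j] == 0 && j != digit)
    {
        j--;
        if (j < 0) j += n;
    }
    if (j == digit && x_int[digit] == 0) return (0);   /* all digits are zero */
    else if (x_int[j] < 0) carry = -1;

    /* Collect normalized digits until 64 bits have been gathered. */
    for (j = 0; j < n; j++)
    {
        tx = x_int[digit] + carry;
        if (size(digit)) temp = hi;
        else temp = lo;
        if (tx < 0)
        {
            tx += temp;
            carry = -1;
        }
        else carry = 0;
        residue += (unsigned long long) tx << k;
        k += q / n + size(digit);
        if (j == 0)
        {
            /* Drop the bits of the first digit that lie below the offset. */
            k -= bit;
            residue >>= bit;
        }
        if (k >= 64) break;
        digit++;
        if (digit == n) digit = 0;
    }
    return residue;
}
[/code]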
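For readers following the first two statements of the function, here is a small integer-only sketch of the same digit/bit lookup. It assumes, as the function above does, that digit d starts at bit ceil(d * q / n); the name `locate_bit` and the struct are hypothetical, and it uses integer division instead of the floor/ceil calls on doubles.

```c
#include <stdint.h>

/* Position of a bit inside the variable-width digit array. */
typedef struct {
    uint64_t digit;   /* index of the digit that contains the bit */
    uint64_t bit;     /* bit position inside that digit */
} bit_pos;

/* Find which digit holds bit number `offset` when q bits are spread
   over n digits and digit d starts at bit ceil(d * q / n). */
static bit_pos locate_bit(uint64_t offset, uint64_t q, uint64_t n)
{
    bit_pos pos;
    pos.digit = (offset * n) / q;                  /* floor(offset * n / q) */
    uint64_t start = (pos.digit * q + n - 1) / n;  /* ceil(digit * q / n) */
    pos.bit = offset - start;
    return pos;
}
```

For M216091 with a 14K (14336) FFT, digits hold 15 or 16 bits, and `locate_bit(100, 216091, 14336)` gives digit 6, bit 9, since digit 6 starts at bit 91.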
[code]
if (residue == 0)
{
    printf("Illegal residue: 0x0000000000000000. See mersenneforum.org for help.\n\n");
    exit(2);
}
else if (residue == 2 && check_illegal_residue(x_int, q, n, offset) == 2)
{
    printf("Illegal residue: 0x0000000000000002. See mersenneforum.org for help.\n\n");
    exit(2);
}
[/code]

