Some data from tf_barrett96.cu: mod_simple_96():
[CODE]
qi  = 0
q   = 00000007 3C3F1F[COLOR="Red"]20[/COLOR] C454D397
nn  = 00000000 00000000 00000000
res = 00000007 3C3F1F[COLOR="Red"]1F[/COLOR] C454D397
[/CODE]
res = q - nn;
So for now it looks like CUDA 5.0.7 fails when somebody uses sub with carry and the subtrahend is 0, which points to a bug in CUDA 5.0.7 itself.
Oliver |
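For reference, here is a minimal sketch of what a three-word subtraction with borrow propagation should produce for the values above. It is plain Python, not the inline PTX used in tf_barrett96.cu, and the helper name `sub_96` is made up for illustration. The reported res differs from q by exactly 1 in the middle word, which is what a spurious borrow out of the low word would look like.

```python
MASK32 = 0xFFFFFFFF

def sub_96(q, nn):
    """Subtract two 96-bit values given as (d2, d1, d0) tuples of 32-bit words.

    Hypothetical helper: emulates a borrow chain across three 32-bit words.
    """
    res = []
    borrow = 0
    # Walk from the least significant word to the most significant one.
    for qw, nw in zip(reversed(q), reversed(nn)):
        diff = qw - nw - borrow
        borrow = 1 if diff < 0 else 0   # borrow only when the word underflows
        res.append(diff & MASK32)
    return tuple(reversed(res))

q = (0x00000007, 0x3C3F1F20, 0xC454D397)
nn = (0x00000000, 0x00000000, 0x00000000)
print(" ".join("%08X" % w for w in sub_96(q, nn)))
```

Subtracting zero should leave every word of q unchanged, so a correct borrow chain never touches the middle word here.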
[QUOTE=TheJudger;304737]Some data from tf_barrett96.cu: mod_simple_96():
[CODE]
qi  = 0
q   = 00000007 3C3F1F[COLOR="Red"]20[/COLOR] C454D397
nn  = 00000000 00000000 00000000
res = 00000007 3C3F1F[COLOR="Red"]1F[/COLOR] C454D397
[/CODE]
res = q - nn;
So for now it looks like CUDA 5.0.7 fails when somebody uses sub with carry when the subtrahend is 0. So for now it looks like a bug in CUDA 5.0.7.
Oliver[/QUOTE]
Have you tried adding the volatile keyword to your asm statements? |
Nvidia confirmed the bug so I would say: not my fault/problem! :smile:
Oliver |
1 Attachment(s)
We submitted all of our completed work today in one file and were awarded some "extra credit". (See attached image.)
[CODE]P-1 found a factor in stage #2, B1=565000, B2=12147500, E=6.
M56350163 has a factor: 24948611431313562132407
P-1 found a factor in stage #2, B1=540000, B2=11475000, E=6.
M54203297 has a factor: 43709161575143787520913[/CODE] |
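A reported factor can be checked independently: f divides M(p) = 2^p - 1 exactly when 2^p mod f equals 1. The sketch below uses Python's built-in three-argument `pow`; the function name `divides_mersenne` is made up for illustration and is not part of Prime95 or PrimeNet.

```python
def divides_mersenne(p, f):
    """Return True if f (> 1) divides the Mersenne number 2**p - 1."""
    return pow(2, p, f) == 1

# Exponent/factor pairs copied from the results above.
reported = [
    (56350163, 24948611431313562132407),
    (54203297, 43709161575143787520913),
]
for p, f in reported:
    print(f"M{p}: {f} ({f.bit_length()} bits) divides: {divides_mersenne(p, f)}")
```

The bit length printed for each factor is the size that would matter if the same factor were credited as a TF result.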
[QUOTE=Xyzzy;305027]We submitted all of our completed work today in one file and were awarded some "extra credit"[/QUOTE]I'm not even sure how to come up with those numbers... that's approx 150%-200% of the credit you should get for those factors, even if they were credited as TF. :unsure:
|
[QUOTE=James Heinrich;305028]I'm not even sure how to come up with those numbers... that's approx 150%-200% what credit you should get for those factors even as credited for TF. :unsure:[/QUOTE]
James, did you change the manual web forms along the lines we were discussing? If so, did the B1/B2 bounds get recorded correctly? Maybe, the underlying PHP guessed the wrong FFT size or we passed in a bogus FFT size? |
[QUOTE=Prime95;305032]James, did you change the manual web forms along the lines we were discussing? If so, did the B1/B2 bounds get recorded correctly? Maybe, the underlying PHP guessed the wrong FFT size or we passed in a bogus FFT size?[/QUOTE]No, I hadn't got to that yet (I was going to... the day PrimeNet was down for a few hours), the manual form is as yet unchanged.
|
[QUOTE=James Heinrich;305033]No, I hadn't got to that yet...[/QUOTE]
Weird. All the more reason to make those changes! |
20+% improvement
1 Attachment(s)
Oliver,
I propose creating a barrett77_mul32. This is the same as barrett79_mul32 but with mod_simple_96 moved out of the loop. As long as f does not exceed 77 bits, a will not exceed 80 bits (above 80 bits, square_96_160 will fail). I tested this and it passes the self-tests up through 77 bits. Raw speed went from 205M/sec to 250M/sec. Crude source is attached. |
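The idea of moving the final reduction out of the loop can be sketched in a few lines. This is not the attached source and not mfaktc's kernel: it is a rough Python model using exact big integers and a textbook Barrett quotient estimate, and the name `pow2_mod_deferred` and the parameter `k` are assumptions made for illustration. The loop keeps a value that is congruent to the true residue but may be larger than f, checks that it stays within an 80-bit budget, and does one full reduction at the end.

```python
def pow2_mod_deferred(p, f, k=80):
    """Return 2**p mod f, doing the full reduction only once after the loop.

    Hypothetical model: k is the bit budget of the squaring input (80 here),
    and f is limited to k - 3 bits (77 here).
    """
    assert f.bit_length() <= k - 3
    mu = (1 << (2 * k)) // f         # Barrett constant, floor(2^(2k) / f)
    a = 1
    for bit in bin(p)[2:]:           # exponent bits, most significant first
        x = a * a
        q = (x * mu) >> (2 * k)      # quotient estimate, never larger than x // f
        a = x - q * f                # congruent to x mod f, but may be >= f
        if bit == "1":
            a <<= 1                  # multiply by the base 2
        assert a.bit_length() <= k   # partially reduced value fits the budget
    return a % f                     # single full reduction, outside the loop

print(pow2_mod_deferred(11, 23))
```

Since 2^11 - 1 = 2047 = 23 * 89, the call above returns 1. In this model the in-loop value stays below 80 bits for any f up to 77 bits, because the quotient estimate only undershoots by a small amount; the error bounds of the real GPU code are different and are not modeled here.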
[QUOTE=Prime95;306572]Oliver,
....... I tested this out and it passes the self tests up through 77 bits. Raw speed went from 205M/sec to 250M/sec..... [/QUOTE] :w00t: Wow. |
More "extra credit":
[CODE]Processing result: M56505451 has a factor: 86553876518403762963169
CPU credit is 323.9309 GHz-days.
Processing result: M56488651 has a factor: 35566445275259107720993
CPU credit is 129.5622 GHz-days.
Processing result: M56491177 has a factor: 23502006329787341695151
CPU credit is 89.0731 GHz-days.[/CODE] |