mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   genefer/CUDA (https://www.mersenneforum.org/showthread.php?t=14297)

AG5BPilot 2011-12-28 23:51

Shoichiro,

While looking for the benchmark problem, I think I found a small error in the code.

Unless I don't understand what you're trying to do, the IDX macro has a mistake:

[quote]#define IDX(i) ((((i)>>(SHIFT*2+2))<<(SHIFT*2+2))+(((i)&((2<<SHIFT)*(2<<SHIFT)-1))>>(SHIFT+1))+((i&((2<<SHIFT)-1))<<(SHIFT+1)))[/quote]

I think this should be:

[quote]#define IDX(i) ((((i)>>(SHIFT*2+2))<<(SHIFT*2+2))+(((i)&((2<<SHIFT)*(2<<SHIFT)-1))>>(SHIFT+1))+(([color=red][b]([/b][/color]i[color=red][b])[/b][/color]&((2<<SHIFT)-1))<<(SHIFT+1)))[/quote]

There are a few places where you use an expression as an argument to the IDX macro, such as these:

[quote] zx = g_z[IDX(i*2)];
zy = g_z[IDX(i*2+1)];
ex = g_e1[IDX(i*2)];
ey = g_e1[IDX(i*2+1)];[/quote]

The code will still work because of the low precedence for the "&" operator, but the macro would be safer with the extra ( ) in there.
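
To illustrate the precedence point, here is a minimal C sketch. It assumes SHIFT is 1 purely to keep the numbers small (the thread discusses 8), and the names IDX_ORIG, IDX_FIXED, and the three helper functions are hypothetical. It checks that both macros agree for arguments such as i*2+1, but differ once the argument uses an operator that binds more loosely than "&", such as "|".

```c
#include <assert.h>

/* Assumption for this demo only: a small SHIFT keeps the values easy to follow. */
#define SHIFT 1

/* Macro as originally posted: the last use of i is not parenthesized. */
#define IDX_ORIG(i) ((((i)>>(SHIFT*2+2))<<(SHIFT*2+2))+(((i)&((2<<SHIFT)*(2<<SHIFT)-1))>>(SHIFT+1))+((i&((2<<SHIFT)-1))<<(SHIFT+1)))

/* Macro with the extra parentheses suggested above. */
#define IDX_FIXED(i) ((((i)>>(SHIFT*2+2))<<(SHIFT*2+2))+(((i)&((2<<SHIFT)*(2<<SHIFT)-1))>>(SHIFT+1))+(((i)&((2<<SHIFT)-1))<<(SHIFT+1)))

/* Returns 1 if both macros agree on i*2 and i*2+1 for every i in [0, n).
   '*' and '+' bind tighter than '&', so the missing parentheses
   make no difference for these arguments. */
int agree_on_arithmetic_args(int n)
{
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (IDX_ORIG(i*2) != IDX_FIXED(i*2)) return 0;
        if (IDX_ORIG(i*2+1) != IDX_FIXED(i*2+1)) return 0;
    }
    return 1;
}

/* '|' binds looser than '&', so the original macro expands its last term
   to (a | (b & mask)) instead of ((a | b) & mask). */
int orig_with_or(int a, int b)  { return IDX_ORIG(a | b); }
int fixed_with_or(int a, int b) { return IDX_FIXED(a | b); }
```

With SHIFT set to 1, fixed_with_or(4, 2) gives 9 while orig_with_or(4, 2) gives 25, so the two macros only diverge for arguments of this kind. A compiler may warn about the missing parentheses in the original expansion, which is the point of the demonstration.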

msft 2011-12-29 03:16

Hi, AG5BPilot
Thank you for your review.
I'll fix it in the next version.

rroonnaalldd 2011-12-29 05:10

1 Attachment(s)
Okay, I fixed this, but I can't find a solution for the following: [QUOTE]GeneferCUDA.cu: In function ‘void check(Int32, UInt32, char*)’:
GeneferCUDA.cu:651: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘long int’
GeneferCUDA.cu:651: warning: format ‘%02d’ expects type ‘int’, but argument 6 has type ‘long int’
GeneferCUDA.cu:651: warning: format ‘%02d’ expects type ‘int’, but argument 7 has type ‘long int’
[/QUOTE]

Code::Blocks points me to line 493: [QUOTE]static void check(const Int32 b, const UInt32 m, char *expectedResidue)[/QUOTE]

axn 2011-12-29 05:33

Change
[CODE] sprintf(str2, " (%d digits) (err = %.4f) (time = (long)%d:(long)%02d:(long)%02d) %.8s\n", dgts, maxErr, hours, minutes, seconds, asctime(today)+11);
[/CODE]
to
[CODE] sprintf(str2, " (%d digits) (err = %.4f) (time = %ld:%02ld:%02ld) %.8s\n", dgts, maxErr, hours, minutes, seconds, asctime(today)+11);
[/CODE]
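
The reason this works: the warnings say that arguments 5 to 7 (hours, minutes, seconds) are long int, and %ld is the printf conversion for long, whereas %d expects int. A minimal sketch of the same idea, where format_elapsed is a hypothetical helper (not part of GeneferCUDA) and snprintf is used instead of sprintf to bound the write:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats a duration given in seconds as h:mm:ss.
   Returns the number of characters snprintf would write. */
int format_elapsed(char *buf, size_t size, long total_seconds)
{
    long hours, minutes, seconds;

    assert(buf != NULL && size > 0 && total_seconds >= 0);

    hours   = total_seconds / 3600;
    minutes = (total_seconds % 3600) / 60;
    seconds = total_seconds % 60;

    /* %ld matches long; %d here would mismatch the argument type. */
    return snprintf(buf, size, "%ld:%02ld:%02ld", hours, minutes, seconds);
}
```

For example, format_elapsed(buf, sizeof buf, 3723L) writes "1:02:03" into buf. Casting each value to int and keeping %d would also silence the warning, at the cost of truncating very large values.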

AG5BPilot 2011-12-30 16:34

[QUOTE=msft;283766]
Is GeneferCUDA Ver 1.03 correct ?[/QUOTE]

Shoichiro,

I just realized I didn't understand your question correctly. (That was my fault. Your English was fine!) I went back and tested 1.03. It has the same problem. Here are the results.

[quote]C:\GeneferCUDA test\genefercuda.1.03>GeneferCUDA-X32.exe -b
GeneferCUDA 2.2.1 (CUDA) based on Genefer v2.2.1
Copyright (C) 2001-2003, Yves Gallot (v1.3)
Copyright (C) 2009, 2011 Mark Rodenkirch, David Underbakke (v2.2.1)
Copyright (C) 2010, 2011, Shoichiro Yamada (CUDA)
A program for finding large probable generalized Fermat primes.

Generalized Fermat Number Bench
2009574^8192+1 Time: 398 us/mul. Err: 3.82e-001 51636 digits
1632282^16384+1 Time: 420 us/mul. Err: 2.53e-001 101791 digits
1325824^32768+1 Time: 451 us/mul. Err: 1.88e-001 200622 digits
1076904^65536+1 Time: 590 us/mul. Err: 1.72e-001 395325 digits
874718^131072+1 Time: 716 us/mul. Err: 3.47e-001 778813 digits
710492^262144+1 Time: 944 us/mul. Err: 4.21e-001 1533952 digits
577098^524288+1 Time: 1.51 ms/mul. Err: 2.01e-001 3020555 digits
468750^1048576+1 Time: 2.31 ms/mul. Err: 1.56e-001 5946413 digits
380742^2097152+1 Time: 230 us/mul. Err: 3.63e-001 11703432 digits
309258^4194304+1 Time: 324 us/mul. Err: 1.48e-001 23028076 digits
251196^8388608+1 Time: 273 us/mul. Err: 1.41e-001 45298590 digits[/quote]

Same problem, which isn't surprising. Forcing SHIFT to stay at 8 fixes the problem.

I'm currently running 4000^2097152+1 through GeneferCUDA 1.04 to see if the residue matches GeneferCUDA 0.99. This will take a few days. I should know the answer in 2012.

In other news, I've got the Windows BOINC development environment running, including the sample BOINC-CUDA application. I'm going to try turning GeneferCUDA into a native BOINC application. This may be an adventure. I'll let you know how this goes. ;-)

msft 2011-12-31 03:37

1 Attachment(s)
Hi, AG5BPilot
I changed the code around "SHIFT".

AG5BPilot 2011-12-31 14:51

1.042:

[quote]Generalized Fermat Number Bench
2009574^8192+1 Time: 396 us/mul. Err: 3.82e-001 51636 digits
1632282^16384+1 Time: 419 us/mul. Err: 2.53e-001 101791 digits
1325824^32768+1 Time: 451 us/mul. Err: 1.88e-001 200622 digits
1076904^65536+1 Time: 589 us/mul. Err: 1.72e-001 395325 digits
874718^131072+1 Time: 715 us/mul. Err: 3.47e-001 778813 digits
710492^262144+1 Time: 943 us/mul. Err: 4.21e-001 1533952 digits
577098^524288+1 Time: 1.5 ms/mul. Err: 2.01e-001 3020555 digits
468750^1048576+1 Time: 2.31 ms/mul. Err: 1.56e-001 5946413 digits
[color=red]380742^2097152+1 Time: 232 us/mul. Err: 3.63e-001 11703432 digits
309258^4194304+1 Time: 320 us/mul. Err: 1.48e-001 23028076 digits
251196^8388608+1 Time: 281 us/mul. Err: 1.41e-001 45298590 digits[/color][/quote]

Doesn't fix the problem in the benchmark. I'm still running a test to see if the calculations are ok, but that's going to take a few more days.

AG5BPilot 2011-12-31 16:40

Shoichiro,

Both 1.04 and 1.042 do the same thing, but the code in 1.04 is cleaner. If I can't figure out why it behaves differently, I'll make the Windows build force SHIFT to be 8. That's easier to do in 1.04.

I like 1.04 better.

Mike

msft 2011-12-31 17:16

1 Attachment(s)
Hi, AG5BPilot
I think I fixed the issue.
The problem was the use of the CPU time clock.
This version changes to the wall clock.
It also reduces CPU usage (from 100% to 20%).
Happy New Year!

AG5BPilot 2011-12-31 22:23

[QUOTE=msft;284251]Hi, AG5BPilot
I think I fixed the issue.
The problem was the use of the CPU time clock.
This version changes to the wall clock.
It also reduces CPU usage (from 100% to 20%).
Happy New Year![/QUOTE]

Happy New Year Shoichiro!

Unfortunately, I just tried 1.045 and something is very wrong with it. The GPU is running at less than 60% utilization at N=8192, and of course it's taking a lot longer to run the benchmarks. The utilization does go up as N increases.

The PRP test on 2030234^8192+1 takes about 1:10 on previous versions of GeneferCUDA. It took 2:01 with v1.045.

Mike

AG5BPilot 2011-12-31 23:05

[QUOTE=msft;284251]Hi, AG5BPilot
I think I fixed the issue.
The problem was the use of the CPU time clock.
This version changes to the wall clock.
It also reduces CPU usage (from 100% to 20%).
Happy New Year![/QUOTE]

Shoichiro,

I just did a bunch of research. This is somewhat convoluted.

You were using clock(), which is *supposed* to return the CPU time. That made perfect sense in the CPU versions of Genefer, but not in the CUDA version. Therefore, switching to time() is the correct choice.

BUT....

The Microsoft implementation of clock() actually returns the WALL TIME. Their documentation is contradictory: on one page of the same document it says "CPU TIME" and on another page it says "WALL TIME". I tested it just now, and it does indeed report WALL TIME. (This is with the Visual Studio 2005 compiler, although I've seen identical benchmark timings from versions of GeneferCUDA compiled with lots of compilers, so I suspect they are all reporting WALL TIME.)

So, switching to time() is the correct thing to do, but in practice it doesn't change anything on Windows because wall clock time was being used anyway.
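
For reference, here is a minimal C sketch of the two timers discussed above, using only standard library calls. Differences between clock() readings are divided by CLOCKS_PER_SEC, and differences between time() readings go through difftime(). The helper names cpu_seconds, wall_seconds, and time_both are hypothetical and are not taken from GeneferCUDA. Note that time() typically has one-second resolution, so it suits long runs better than short ones.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Seconds between two clock() readings. The C standard describes this as
   processor time; per the post above, Microsoft's runtime reports elapsed
   wall time instead. */
double cpu_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Seconds between two time() readings (calendar time, i.e. wall clock). */
double wall_seconds(time_t start, time_t end)
{
    return difftime(end, start);
}

/* Hypothetical helper: runs work() once and reports both measurements. */
void time_both(void (*work)(void), double *cpu, double *wall)
{
    clock_t c0;
    time_t  t0;

    assert(work != NULL && cpu != NULL && wall != NULL);

    c0 = clock();
    t0 = time(NULL);
    work();
    *cpu  = cpu_seconds(c0, clock());
    *wall = wall_seconds(t0, time(NULL));
}
```

Passing a function that sleeps or waits for a few seconds to time_both is one way to see which kind of time clock() reports on a given platform: if the two results are close, clock() is tracking wall time there.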

By the way, even though the changes in 1.045 greatly slowed down GeneferCUDA, the benchmarks are now working:

[quote]Generalized Fermat Number Bench
2009574^8192+1 Time: 702 us/mul. Err: 3.82e-001 51636 digits
1632282^16384+1 Time: 717 us/mul. Err: 2.53e-001 101791 digits
1325824^32768+1 Time: 763 us/mul. Err: 1.88e-001 200622 digits
1076904^65536+1 Time: 977 us/mul. Err: 1.72e-001 395325 digits
874718^131072+1 Time: 1.1 ms/mul. Err: 3.47e-001 778813 digits
710492^262144+1 Time: 1.46 ms/mul. Err: 4.21e-001 1533952 digits
577098^524288+1 Time: 2.44 ms/mul. Err: 2.01e-001 3020555 digits
468750^1048576+1 Time: 3.91 ms/mul. Err: 1.56e-001 5946413 digits
380742^2097152+1 Time: 7.81 ms/mul. Err: 3.63e-001 11703432 digits
309258^4194304+1 Time: 19.5 ms/mul. Err: 1.48e-001 23028076 digits
251196^8388608+1 Time: 39.1 ms/mul. Err: 1.41e-001 45298590 digits[/quote]

