mersenneforum.org  

2011-11-11, 08:48   #155
Bdot
Nov 2010
Germany
3×199 Posts

Quote:
Originally Posted by Ethan (EO)
I'm still failing about half of the selftests with that kernel file.
...

edit: No luck -- unzipped your src file directly, put the updated cl file in src, built Release/x64, and ran. No runtime cl compilation errors, but -- aha -- I just noticed that it is passing the first test in each test case and then failing the rest:

Code:
########## testcase 6/1558 ##########
tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_8"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.16M |  0.468s |  30.25M/s |       25000 |   n.a. |   10798us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_8]
selftest for M53134687 passed (mfakto_cl_71_8)!
tf(): total time spent:  0.487s

tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_4"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.16M |  0.215s |  65.84M/s |       25000 |   n.a. |       0us
no factor for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_4]
ERROR: selftest failed for M53134687 (mfakto_cl_71_4)
  no factor found
tf(): total time spent:  0.234s

tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett79"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.16M |  0.214s |  66.15M/s |       25000 |   n.a. |       0us
no factor for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
ERROR: selftest failed for M53134687 (mfakto_cl_barrett79)
  no factor found
tf(): total time spent:  0.233s

tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett92"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.16M |  0.214s |  66.15M/s |       25000 |   n.a. |       0us
no factor for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett92]
ERROR: selftest failed for M53134687 (mfakto_cl_barrett92)
  no factor found
tf(): total time spent:  0.232s
And that's consistent across the testcases.
There should be no need to rebuild anything - just replace the kernel file next to the original 0.09 binary. On the other hand, rebuilding should not hurt.

I'll update my home PC over the weekend, maybe I can reproduce the error there. The symptom looks a bit like something not initialized in the correct place ... and now it depends on memory layout or other rather random things.
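
The factor reported in the passing selftest above can be checked independently of any GPU kernel. The following is a minimal sketch using only the numbers printed in the log; the helper name `divides_mersenne` is made up for illustration and is not part of mfakto. It assumes the class column is simply k modulo the number of classes, which is consistent with the "3120/4620" shown in the log.

```python
# Values copied from the selftest log in this thread.
p = 53134687                      # exponent of M53134687
f = 337073926433410950601         # factor reported by mfakto_cl_71_8
k_min, k_max = 2999999998380, 3300000000000
NUM_CLASSES = 4620                # from the "3120/4620" class column

def divides_mersenne(p: int, f: int) -> bool:
    """Return True if f divides 2**p - 1 (modular pow keeps this cheap)."""
    return pow(2, p, f) == 1

# Any factor of M_p (p an odd prime) has the form f = 2*k*p + 1.
k, remainder = divmod(f - 1, 2 * p)

print("f divides M_p:      ", divides_mersenne(p, f))
print("f = 2kp + 1 exactly:", remainder == 0)
print("k within k range:   ", k_min <= k <= k_max)
print("class of k:         ", k % NUM_CLASSES)
print("2^68 < f < 2^69:    ", 2**68 < f < 2**69)
```

A kernel that reports "no factor" for this range is therefore missing a factor that really is there, so the failure is in the kernel run and not in the test data.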

2011-11-11, 20:28   #156
Ethan (EO)
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996
2·7² Posts

Quote:
Originally Posted by Bdot
There should be no need to rebuild anything - just replace the kernel file next to the original 0.09 binary.
Yeah -- I just rebuilt from your unaltered project to make sure I hadn't messed something up on the executable I had built previously :)


Ethan

2011-11-12, 00:53   #157
Ethan (EO)
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996
2×7² Posts

I reordered the test cases to see if the failure pattern was the same, and it turns out that the order of the kernels within a testcase is irrelevant -- mfakto_cl_71_4, mfakto_cl_barrett79, and mfakto_cl_barrett92 are failing, but mfakto_cl_71_8 is working.

2011-11-12, 01:31   #158
TheJudger
"Oliver"
Mar 2005
Germany
5·223 Posts

Hello,

Just a shot in the dark: the average wait is 0 whenever the known factor is not found. Does the GPU kernel run at all?

Oliver

Last fiddled with by TheJudger on 2011-11-12 at 01:31

2011-11-12, 01:47   #159
Ethan (EO)
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996
2×7² Posts

Quote:
Originally Posted by TheJudger
Hello,

Just a shot in the dark: the average wait is 0 whenever the known factor is not found. Does the GPU kernel run at all?

Oliver
Yep -- they are running. Just turned on the kernel tracing stuff in the OpenCL kernels, and I've found a difference:

32-bit build cl_71_4:
Code:
########## testcase 1/1558 ##########
tf(50804297, 67, 68, ...);
 k_min = 1599999998520 -  k_max = 1900000000000
Using GPU kernel "mfakto_cl_71_4"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
mfakto_cl_71: tid=0: p=3073649, *2 =6:e6c92, k=0, 0, 0, 0:17487, 17487, 17487, 17487:6e8773, 6ef3bb, 6f05c7, 6f3beb, f=8d029, 8d029, 8d029, 8d02a:fccff8, ff5fc2, ffcd0e, 114f3:77c397, 53e4a7, a33f7f, 915007, shift=19, b=0, 0, 0, 0:1, 1, 1, 1:0, 0, 0, 0:0, 0, 0, 0:0, 0, 0, 0:0, 0, 0, 0
mod_144_72#1: qf=3.51844E+013, nf=6.15105E-021, *=2.16421E-007, qi=0
mod_144_72#1: q=0:1:0:0:0:0, n=8d029:fccff8:77c397, qi=0
mod_144_72#1.1: nn=0:0:0:0:0:0
mod_144_72#1.2: nn=0:0:0:0:0:0
mod_144_72#1.3: nn=0:0:0:0:0:0Error: The arguments don't match the printf format string. printf(mod_144_72#1.3: nn=%x:%x:%x:%x:%x:%x
64-bit build cl_71_4:
Code:
########## testcase 1/1558 ##########
tf(50804297, 67, 68, ...);
 k_min = 1599999998520 -  k_max = 1900000000000
Using GPU kernel "mfakto_cl_71_4"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
mfakto_cl_71: tid=0: p=3073649, *2 =6:e6c92, k=0, 0, 0, 0:17487, 17487, 17487, 17487:6e8773, 6ef3bb, 6f05c7, 6f3beb, f=8d029, 8d029, 8d029, 8d02a:fccff8, ff5fc2, ffcd0e, 114f3:77c397, 53e4a7, a33f7f, 915007, shift=19, b=0, 0, 0, 0:0, 0, 0, 0:0, 0, 0, 0:0, 0, 0, 0:0, 0, 0, 0:0, 0, 0, 0
mod_144_72#1: qf=0.000000, nf=6.15105E-021, *=0.000000, qi=0
mod_144_72#1: q=0:0:0:0:0:0, n=8d029:fccff8:77c397, qi=0
mod_144_72#1.1: nn=0:0:0:0:0:0
mod_144_72#1.2: nn=0:0:0:0:0:0
mod_144_72#1.3: nn=0:0:0:0:0:0Error: The arguments don't match the printf format string. printf(mod_144_72#1.3: nn=%x:%x:%x:%x:%x:%x
64-bit build cl_71_8:
Code:
########## testcase 1/1558 ##########
tf(50804297, 67, 68, ...);
 k_min = 1599999998520 -  k_max = 1900000000000
Using GPU kernel "mfakto_cl_71_8"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
mfakto_cl_71: tid=0: p=3073649, *2 =6:e6c92, k=0, 0, 0, 0, 0, 0, 0, 0:17487, 17487, 17487, 17487, 17487, 17487, 17487, 17487:6e8773, 6ef3bb, 6f05c7, 6f3beb, 6fde57, 70147b, 706eb7, 71c59b, f=8d029, 8d029, 8d029, 8d02a, 8d02a, 8d02a, 8d02a, 8d02a:fccff8, ff5fc2, ffcd0e, 114f3, 4eca2, 63487, 85704, 1073ae:77c397, 53e4a7, a33f7f, 915007, 5b819f, 499227, d6585f, ba1667, shift=19, b=0, 0, 0, 0, 0, 0, 0, 0:1, 1, 1, 1, 1, 1, 1, 1:0, 0, 0, 0, 0, 0, 0, 0:0, 0, 0, 0, 0, 0, 0, 0:0, 0, 0, 0, 0, 0, 0, 0:0, 0, 0, 0, 0, 0, 0, 0
mod_144_72#1: qf=3.51844E+013, nf=6.15105E-021, *=2.16421E-007, qi=0
mod_144_72#1: q=0:1:0:0:0:0, n=8d029:fccff8:77c397, qi=0
mod_144_72#1.1: nn=0:0:0:0:0:0
mod_144_72#1.2: nn=0:0:0:0:0:0
mod_144_72#1.3: nn=0:0:0:0:0:0Error: The arguments don't match the printf format string. printf(mod_144_72#1.3: nn=%x:%x:%x:%x:%x:%x
I haven't worked back yet to see if b is correct in the caller in the 64bit/v4 case... and haven't finished changing the kernels' printf format strings to vector types as you can see :)
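
For anyone reading the traces: the numbers are hexadecimal, and the multi-word values appear to be printed as colon-separated 24-bit limbs, most significant first, with all vector components of one limb listed before the next limb. That layout is an inference from the output, not something stated by the kernel source here. The sketch below decodes the first vector component of the tid=0 line under that assumption; `from_limbs` is a made-up helper name.

```python
LIMB_BITS = 24  # assumption: each colon-separated field is one 24-bit limb

def from_limbs(hex_limbs: str) -> int:
    """Combine colon-separated hex limbs (most significant first) into one integer."""
    value = 0
    for limb in hex_limbs.split(":"):
        value = (value << LIMB_BITS) | int(limb, 16)
    return value

# First vector component of each field in the tid=0 trace line.
p = int("3073649", 16)                 # printed as p=3073649
two_p = from_limbs("6:e6c92")          # printed as *2 =6:e6c92
k = from_limbs("0:17487:6e8773")       # first k candidate
f = from_limbs("8d029:fccff8:77c397")  # first factor candidate, also shown as n=

print("p        =", p)
print("2p       =", two_p)
print("k        =", k)
print("f        =", f)
print("2kp + 1  =", 2 * k * p + 1)
print("1/f      =", 1.0 / f)           # compare with nf in the trace
```

Decoded this way, p is the exponent from the `tf(50804297, 67, 68, ...)` line, f equals 2kp + 1, and 1/f agrees with the printed `nf=6.15105E-021`. The candidate setup in the failing 64-bit cl_71_4 trace is therefore the same as in the working traces; the visible difference is that `b`, `q` and `qf` are all zero there.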

Last fiddled with by Ethan (EO) on 2011-11-12 at 02:08

2011-11-12, 16:04   #160
TheJudger
"Oliver"
Mar 2005
Germany
5×223 Posts

qf = 0.00000 doesn't look good.
  • precomputation failed
  • floating-point conversion failed (unlikely?)
  • data transfer doesn't work / isn't finished
  • something else

2011-11-12, 19:40   #161
Bdot
Nov 2010
Germany
3·199 Posts

Quote:
Originally Posted by TheJudger
qf = 0.00000 doesn't look good.
  • precomputation failed
  • floating-point conversion failed (unlikely?)
  • data transfer doesn't work / isn't finished
  • something else
I have not yet changed all the trace statements to work for vectors. The kernel trace is only accurate when tracing non-vectored kernels. That's also the reason for the "arguments don't match" message.

Looks like some work to do ...

Edit: And the average wait can be zero for mfakto because the wait time for the last block of a class is not included in the calculation (one of the differences from the earlier mfaktc versions, made so that it works better on small classes).

Last fiddled with by Bdot on 2011-11-12 at 19:43

2011-11-12, 23:14   #162
Bdot
Nov 2010
Germany
3×199 Posts

The kernels do not receive the input parameter that holds the pre-processing information, but get a zero there.

With the kernel tracing fixed and set to at least level 3, the mfakto_cl_71_4 kernel will receive the correct parameters and find the factors. So far I have not managed to get the barrett kernels to receive all input parameters.

My guess is that the optimizer removed them as it did not deem them important. But trying to build the kernel non-optimized crashes the kernel compiler.

In light of this, it hardly helps that the barrett kernels are ~4% faster with 11.10, probably because crucial parts have been optimized away.

I guess we just need to skip the Catalyst 11.10 version :-(

2011-11-13, 13:32   #163
bcp19
Oct 2011
1247₈ Posts

I recently got myself an HD 6770 and picked up mfakto 0.09. When I try to run the 64-bit Windows exe, I get multiple errors about too many instances of mad24, followed by a message saying there were 27 errors, and the program shuts down (paraphrasing, as I am not sitting at the machine at the moment). If I run the 32-bit exe, everything appears to run normally. I have the 11.9 drivers installed, as I read on here about the problems with 11.10.

2011-11-13, 21:24   #164
Bdot
Nov 2010
Germany
3×199 Posts

Quote:
Originally Posted by bcp19
I recently got myself an HD 6770 and picked up mfakto 0.09. When I try to run the 64-bit Windows exe, I get multiple errors about too many instances of mad24, followed by a message saying there were 27 errors, and the program shuts down (paraphrasing, as I am not sitting at the machine at the moment). If I run the 32-bit exe, everything appears to run normally. I have the 11.9 drivers installed, as I read on here about the problems with 11.10.
Code:
Select device - Get device info - Compiling kernels.
        BUILD OUTPUT
C:\Users\root\AppData\Local\Temp\OCLCEF5.tmp.cl(2192): error: more than one
          instance of overloaded function "mad24" matches the argument list:
            function "mad24(int, int, int) C++"
            function "mad24(uint, uint, uint) C++"
            argument types are: (uint, int, uint)
    *res_hi  = mad24(mul_hi(a,b), 256, (*res_lo >> 24));
               ^
...

C:\Users\root\AppData\Local\Temp\OCLCEF5.tmp.cl(2726): error: more than one
          instance of overloaded function "mad24" matches the argument list:
            function "mad24(int, int, int) C++"
            function "mad24(uint, uint, uint) C++"
            argument types are: (uint, int, uint)
    nn.d2 = mad24(mul_hi(n.d1, qi), 256, tmp >> 24);
            ^

27 errors detected in the compilation of "C:\Users\root\AppData\Local\Temp\OCLCEF5.tmp.cl".

Internal error: clc compiler invocation failed.

        END OF BUILD OUTPUT
Error -11: clBuildProgram
init_CL(5, 0) failed
This is exactly the error that started appearing on my machine with the installation of Catalyst 11.10. These compilation errors are easy to solve, but mfakto will still fail the selftest as there are other bugs in the compiled kernel.

I tried uninstalling 11.10 and went back as far as 11.6 - the errors remain. It's not the first time that the ATI drivers have failed to uninstall themselves correctly. Maybe they do, but some hardware switch remained in a bad position. Anyway, the sad result is: once in that state, I could not get out. (I cannot try reinstalling the machine.)

I'll see if I can build an "11.10-workaround-version" for trapped folks like me. There will certainly be a performance penalty. Where it still works, it's probably faster to run the 32-bit version for now - on my machine the 32-bit version fails as well.

Strange, strange, strange. Maybe there's still a bug in the main program that just has these side effects.
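
On the `mad24` errors in the build output above: the compiler sees argument types (uint, int, uint) and cannot choose between the all-int and all-uint overloads. One plausible way to remove the ambiguity is to make the constant unsigned (`256u`) so that all three arguments are uint; whether that is the change eventually made in mfakto is not stated in this thread. What the flagged line computes can be modelled in a few lines. This is only a sketch: it assumes `a` and `b` are 24-bit limbs and that `*res_lo` holds `mul24(a, b)` at that point, neither of which is shown in the error output, and `mul_24_48` is a made-up name.

```python
MASK24 = (1 << 24) - 1
MASK32 = (1 << 32) - 1

def mul24(a: int, b: int) -> int:
    """Model of unsigned mul24: low 32 bits of the product of the low 24 bits."""
    return ((a & MASK24) * (b & MASK24)) & MASK32

def mul_hi(a: int, b: int) -> int:
    """Model of unsigned 32-bit mul_hi: high 32 bits of the 64-bit product."""
    return ((a & MASK32) * (b & MASK32)) >> 32

def mad24(a: int, b: int, c: int) -> int:
    """Model of unsigned mad24: mul24(a, b) + c, truncated to 32 bits."""
    return (mul24(a, b) + c) & MASK32

def mul_24_48(a: int, b: int) -> tuple[int, int]:
    """Hypothetical helper: 24x24-bit product as (hi, lo) 24-bit limbs."""
    res_lo = mul24(a, b)                              # assumed earlier step
    res_hi = mad24(mul_hi(a, b), 256, res_lo >> 24)   # the line from the error
    return res_hi, res_lo & MASK24

hi, lo = mul_24_48(0xFCCFF8, 0x77C397)
print(hex(hi), hex(lo), hex(0xFCCFF8 * 0x77C397))
```

Under these assumptions the result is just bits 24..47 of the product: `mul_hi` supplies bits 32..47, the multiplication by 256 shifts them up by 8, and `res_lo >> 24` fills in bits 24..31.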

2011-11-13, 21:27   #165
bcp19
Oct 2011
7×97 Posts

Hmmm, does that mean the exponents I've been running on the 32-bit client are suspect?
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.