mersenneforum.org  

Old 2011-10-05, 21:10   #144
Bdot
 
Nov 2010
Germany

3·199 Posts
mfakto 0.09 - Linux version

Linux 64-bit
Attached Files
File Type: zip mfakto-0.09 - Linux.zip (105.4 KB, 181 views)
Old 2011-10-05, 21:12   #145
Bdot
 
Nov 2010
Germany

3·199 Posts
mfakto 0.09 - sources

... and the source code
Attached Files
File Type: zip mfakto-0.09 - src.zip (137.4 KB, 206 views)
Old 2011-10-27, 00:05   #146
KyleAskine
 
Oct 2011
Maryland

2×5×29 Posts

Hi!

I am new here, and I might have missed a point of discussion earlier in the thread. If that is the case, I am sorry.

Anyway, I have two GPUs in my current PC (6950s flashed as 6970s), but it looks like mfakto only uses one of them. Is this a known issue, or could there be a problem with my setup?

Also, I have 11.9, but to get one GPU to 90%, it took me two cores.

Thanks for your help!
Old 2011-10-27, 20:07   #147
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by KyleAskine
Anyway, I have two GPUs in my current PC (6950s flashed as 6970s), but it looks like mfakto only uses one of them.
Did you already play around with the -d <dev-num> switch? This is supposed to let you decide which device an mfakto instance will use.

If that does not work, please send me the clinfo output (e.g. in C:\Program Files (x86)\AMD APP\bin\x86_64\clinfo.exe).

Quote:
Originally Posted by KyleAskine
Also, I have 11.9, but to get one GPU to 90%, it took me two cores.
That is expected with higher-end cards. One mfakto process will always use only one device, so you may need 4 instances in total to get both GPUs to 90%.
Old 2011-10-28, 12:01   #148
KyleAskine
 
Oct 2011
Maryland

2×5×29 Posts

Quote:
Originally Posted by Bdot
Did you already play around with the -d <dev-num> switch? This is supposed to let you decide which device an mfakto instance will use.

If that does not work, please send me the clinfo output (e.g. in C:\Program Files (x86)\AMD APP\bin\x86_64\clinfo.exe).



That is expected with higher-end cards. One mfakto process will always use only one device, so you may need 4 instances in total to get both GPUs to 90%.
Thanks for your answers! I will play around with it today!
Old 2011-11-07, 19:19   #149
Ethan (EO)
 
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996

2·7² Posts

Upgrading to the 11.10 driver on x64 Windows broke the mfakto 0.09 executable for me, because the kernel compiler in this driver version is hung up on calls to mad24 with mixed argument types.

Casting all of the integer constants in the mad24 calls to (uint) fixed this for me!

Edit: No it didn't -- this lets the executable run but it's failing the selftest. I've used up the time I can spend on this today unfortunately but there it is.

ReEdit: Only the 64bit build is failing the selftest.

i.e.
Code:
nn.d1 = mad24(mul_hi(n.d0, qi), (uint)256, tmp >> 24);
Also of note, I had to change the Platform Toolset setting from v100 to Windows7.1SDK to get this to build in Visual Studio Express.

Last fiddled with by Ethan (EO) on 2011-11-07 at 19:33
Old 2011-11-07, 19:47   #150
Bdot
 
Nov 2010
Germany

3·199 Posts

Quote:
Originally Posted by Ethan (EO)
Upgrading to the 11.10 driver on x64 Windows broke the mfakto 0.09 executable for me, because the kernel compiler in this driver version is hung up on calls to mad24 with mixed argument types.
Uh-oh ... every new version adds new surprises ... With that I'm afraid to upgrade my drivers ;-)

I'll see if I can do something about it ...
Old 2011-11-07, 21:58   #151
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

7221₁₀ Posts

On another note, Bdot: the FAQ PDF available in the FAQ threads says not to report no-factor results from mfakto. I don't know why it says that, but someone somewhere said it was because of the problem with factors < 2^48, which has been fixed. If that was the reason, please tell Brain to fix the PDF. I just hope we haven't lost too much work.
Old 2011-11-08, 06:12   #152
Brain
 
Dec 2009
Peine, Germany

331 Posts

The PDF is going to be updated... Do submit all results.
Old 2011-11-10, 15:37   #153
Bdot
 
Nov 2010
Germany

3·199 Posts
Fix for 11.10?

Quote:
Originally Posted by Ethan (EO)
Upgrading to the 11.10 driver on x64 Windows broke the mfakto 0.09 executable for me.

Code:
nn.d1 = mad24(mul_hi(n.d0, qi), (uint)256, tmp >> 24);
I've replaced all of those constants by their uint equivalent (256 => 256u), and on my slow test box (W7-64) this seems to work: the small selftest succeeded, and the full selftest, which is still running, has found all factors so far.

Code:
nn.d1 = mad24(mul_hi(n.d0, qi), 256u, tmp >> 24);
I've attached the kernel file. Could you please check if this one still fails the selftest on your machine?
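For readers who want to check the arithmetic of that kernel line outside of OpenCL, here is a minimal host-side sketch in plain C. The function names `mul_hi_u32`, `mad24_u32` and `product_shr24` are hypothetical stand-ins, not part of mfakto, and the definition of `tmp` is an assumption (the thread does not show it). In OpenCL C, `mad24` takes three arguments of the same type (all int or all uint), which is presumably why the plain `256` literal trips up the stricter 11.10 compiler; plain C converts the arguments silently, so this sketch only illustrates the unsigned variant.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for OpenCL mul_hi(uint, uint): upper 32 bits of the 64-bit product. */
static uint32_t mul_hi_u32(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Stand-in for OpenCL mad24(uint, uint, uint): multiply the low 24 bits of
 * a and b, add c, keep the low 32 bits. Masking the inputs is an assumption;
 * OpenCL leaves inputs outside the 24-bit range implementation-defined. */
static uint32_t mad24_u32(uint32_t a, uint32_t b, uint32_t c) {
    uint64_t prod = (uint64_t)(a & 0xFFFFFFu) * (b & 0xFFFFFFu);
    return (uint32_t)(prod + c);
}

/* Hypothetical helper mirroring the quoted kernel line. With a 24-bit n_d0,
 * hi * 256 + (lo >> 24) equals the full product n_d0 * qi shifted right by 24. */
static uint32_t product_shr24(uint32_t n_d0, uint32_t qi) {
    assert(n_d0 < (1u << 24));   /* limb must fit in 24 bits */
    uint32_t tmp = n_d0 * qi;    /* ASSUMED: low 32 bits of the product */
    return mad24_u32(mul_hi_u32(n_d0, qi), 256u, tmp >> 24);
}
```

Comparing `product_shr24(a, b)` with `(uint32_t)(((uint64_t)a * b) >> 24)` for a few 24-bit values of `a` is a quick way to confirm that the `256u` form computes what the original line intended.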
Attached Files
File Type: zip mfakto_Kernels.zip (16.6 KB, 213 views)
Old 2011-11-10, 22:41   #154
Ethan (EO)
 
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996

2·7² Posts

I'm still failing about half of the selftests with that kernel file. I'm going to revert to the exact contents of your 0.09 src zip to make sure I haven't mucked anything up in the project settings.


Ethan

edit: No luck -- unzipped your src file directly, put the updated cl file in src, built Release/x64, and ran. No runtime cl compilation errors, but -- aha -- just noticed that it is passing the first test in each test case, and then failing the rest:

Code:
########## testcase 6/1558 ##########
tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_8"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.16M |  0.468s |  30.25M/s |       25000 |   n.a. |   10798us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_8]
selftest for M53134687 passed (mfakto_cl_71_8)!
tf(): total time spent:  0.487s

tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_4"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.16M |  0.215s |  65.84M/s |       25000 |   n.a. |       0us
no factor for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_4]
ERROR: selftest failed for M53134687 (mfakto_cl_71_4)
  no factor found
tf(): total time spent:  0.234s

tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett79"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.16M |  0.214s |  66.15M/s |       25000 |   n.a. |       0us
no factor for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
ERROR: selftest failed for M53134687 (mfakto_cl_barrett79)
  no factor found
tf(): total time spent:  0.233s

tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett92"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.16M |  0.214s |  66.15M/s |       25000 |   n.a. |       0us
no factor for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett92]
ERROR: selftest failed for M53134687 (mfakto_cl_barrett92)
  no factor found
tf(): total time spent:  0.232s
And that's consistent across the testcases.
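As a cross-check on the numbers in that log, the expected factor can be decomposed on the host. This is a sketch under two assumptions: factors of M_p have the form f = 2·k·p + 1 (a standard fact), and the "class" column is k mod 4620 (inferred from the "3120/4620" display, not confirmed in this thread). The names `parse_u128` and `k_of_factor` are hypothetical, and `unsigned __int128` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;  /* GCC/Clang extension, assumed available */

/* Parse a decimal string into a 128-bit unsigned integer (no overflow check). */
static u128 parse_u128(const char *s) {
    u128 v = 0;
    for (; *s >= '0' && *s <= '9'; ++s)
        v = v * 10 + (u128)(*s - '0');
    return v;
}

/* For f = 2*k*p + 1, recover k. Returns 0 if f does not have that form. */
static uint64_t k_of_factor(u128 f, uint64_t p) {
    assert(p != 0);
    u128 two_p = (u128)2 * p;
    if (f == 0 || (f - 1) % two_p != 0)
        return 0;
    return (uint64_t)((f - 1) / two_p);
}
```

For f = 337073926433410950601 and p = 53134687 this gives k = 3171882111900, which lies between the logged k_min and k_max and leaves remainder 3120 when divided by 4620. So the failing kernels are being asked to search a range that does contain the factor.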

reedit: The same executable runs without error on the CPU:

Code:
########## testcase 6/1558 ##########
tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_8"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.68M |  5.737s |   2.56M/s |       25000 |   n.a. |  362964us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_8]
selftest for M53134687 passed (mfakto_cl_71_8)!
tf(): total time spent:  5.753s

tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_4"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.68M |  6.460s |   2.27M/s |       25000 |   n.a. |  410991us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_4]
selftest for M53134687 passed (mfakto_cl_71_4)!
tf(): total time spent:  6.479s

tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett79"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.68M |  4.439s |   3.31M/s |       25000 |   n.a. |  276762us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
selftest for M53134687 passed (mfakto_cl_barrett79)!
tf(): total time spent:  4.459s

tf(53134687, 68, 69, ...);
 k_min = 2999999998380 -  k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett92"
    class | candidates |    time | avg. rate | SievePrimes |    ETA | avg. wait
3120/4620 |     14.68M |  5.800s |   2.53M/s |       25000 |   n.a. |  366906us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett92]
selftest for M53134687 passed (mfakto_cl_barrett92)!
tf(): total time spent:  5.822s
...and the 32bit build runs fine on both CPU and GPU.
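The test every kernel performs can also be reproduced independently of OpenCL: f divides M_p = 2^p - 1 exactly when 2^p mod f equals 1. The following is a slow but simple reference sketch in C, not mfakto's method (the kernel names suggest Barrett reduction on 24-bit limbs, whereas this uses plain shift-and-add). All function names are hypothetical, and `unsigned __int128` is again a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;  /* GCC/Clang extension, assumed available */

/* Parse a decimal string into a 128-bit unsigned integer (no overflow check). */
static u128 parse_u128(const char *s) {
    u128 v = 0;
    for (; *s >= '0' && *s <= '9'; ++s)
        v = v * 10 + (u128)(*s - '0');
    return v;
}

/* (a + b) mod m for a, b < m; safe as long as m < 2^127. */
static u128 addmod(u128 a, u128 b, u128 m) {
    u128 s = a + b;
    return s >= m ? s - m : s;
}

/* (a * b) mod m by binary double-and-add, so no 256-bit product is needed. */
static u128 mulmod(u128 a, u128 b, u128 m) {
    u128 r = 0;
    a %= m;
    while (b) {
        if (b & 1)
            r = addmod(r, a, m);
        a = addmod(a, a, m);
        b >>= 1;
    }
    return r;
}

/* 2^p mod m by square-and-multiply. */
static u128 pow2mod(uint64_t p, u128 m) {
    assert(m > 1);
    u128 result = 1, base = 2 % m;
    while (p) {
        if (p & 1)
            result = mulmod(result, base, m);
        base = mulmod(base, base, m);
        p >>= 1;
    }
    return result;
}

/* Nonzero if f divides the Mersenne number 2^p - 1. */
static int divides_mersenne(u128 f, uint64_t p) {
    return pow2mod(p, f) == 1;
}
```

Calling `divides_mersenne(parse_u128("337073926433410950601"), 53134687)` should agree with the CPU run and the 32-bit build, which makes it a handy baseline when comparing against the 64-bit GPU results.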

Last fiddled with by Ethan (EO) on 2011-11-10 at 22:56 Reason: updating information
All times are UTC.


Fri Jul 7 14:46:58 UTC 2023 up 323 days, 12:15, 0 users, load averages: 2.02, 1.48, 1.21

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2023, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
