mersenneforum.org  

Old 2011-09-26, 20:04   #133
apsen
 
Jun 2011

131 Posts
Default

Quote:
Originally Posted by Bdot View Post
I'm sorry to report: yesterday I found a bug, mfakto up to 0.08 does not find the factor for k=3 for M6599953.
Does this affect 0.07 too?
Old 2011-09-26, 23:10   #134
Bdot
 
Bdot's Avatar
 
Nov 2010
Germany

1125₈ Posts
Default

Quote:
Originally Posted by apsen View Post
Does this affect 0.07 too?
Yes. The code there is the same as 0.08 and I just tested it - the mentioned factor is not found by 0.07 either.
Old 2011-09-27, 03:26   #135
KingKurly
 
KingKurly's Avatar
 
Sep 2010
Annapolis, MD, USA

3³·7 Posts
Default

Quote:
Originally Posted by Bdot View Post
I'm sorry to report: yesterday I found a bug, mfakto up to 0.08 does not find the factor for k=3 for M6599953.

The reason is an invalid "optimization" that I made over the mfaktc-code. Mfaktc does not have this problem. I have fixed the bug and added a test case for it to the selftests.

The mfakto kernel "mfakto_cl_71" (all vector sizes) sometimes calculated a bad modulus when the factor candidate was <2^48. Smaller FCs (~2^24) had a higher chance for the error to occur, FCs >2^48 were always calculated correctly. The problem does not depend on the exponent size.

I'm sorry for possibly having wasted effort and resources, but I hope it's not too many tests that need to be repeated as it's only about small FCs. I will provide a fixed version within the next few days.
So tests like "no factor for M43207903 from 2^68 to 2^69 [mfakto 0.08 mfakto_cl_barrett79]" would be unaffected. A test would have to be for much lower bit levels to be affected?
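
For anyone who wants to check a single small candidate by hand, here is a minimal sketch in Python. It relies only on the standard facts that any factor of 2^p - 1 (p an odd prime) has the form 2kp + 1, and that f divides 2^p - 1 exactly when 2^p mod f equals 1. The function names are made up for this illustration and are not taken from mfakto or mfaktc.

```python
def factor_candidate(p, k):
    """Return the factor candidate f = 2*k*p + 1 for the Mersenne number 2**p - 1."""
    return 2 * k * p + 1


def divides_mersenne(p, f):
    """Return True if f divides 2**p - 1, i.e. if 2**p is congruent to 1 mod f."""
    return pow(2, p, f) == 1


# The case from the bug report: exponent 6599953, k = 3.
p = 6599953
f = factor_candidate(p, 3)

# Print the candidate, its size in bits, and whether it divides M6599953.
print(f, f.bit_length(), divides_mersenne(p, f))
```

The bit length printed for this candidate shows how far below 2^48 it lies, which is the range the bug report is about.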
Old 2011-09-27, 07:20   #136
Bdot
 
Bdot's Avatar
 
Nov 2010
Germany

3×199 Posts
Default

Quote:
Originally Posted by KingKurly View Post
So tests like "no factor for M43207903 from 2^68 to 2^69 [mfakto 0.08 mfakto_cl_barrett79]" would be unaffected. A test would have to be for much lower bit levels to be affected?
Yes. If you ran a test like "no factor for M43207903 from 2^1 to 2^69 [mfakto 0.08 ...]", then the test might have missed factors that are below 2^48, and a safety test would need to be done from 2^1 to 2^48 (with mfakto 0.09, or mfaktc).

Furthermore, tests done by the barrett kernel "[mfakto 0.08 mfakto_cl_barrett79]" are not affected, but this kernel cannot and will not be used for low factor sizes anyway.

So affected are:
- version 0.07 and 0.08 of mfakto
- all tests that have been run by the "mfakto_cl_71" kernel AND where the starting bit level is <48.
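
The two conditions above can be written as a small check, for example to filter a list of old results. This is only a sketch: the function name and the way the version, kernel name and starting bit level are passed in are assumptions for illustration, not part of mfakto.

```python
AFFECTED_VERSIONS = {"0.07", "0.08"}
AFFECTED_KERNEL = "mfakto_cl_71"


def possibly_affected(version, kernel, start_bit):
    """Return True if a result matches the criteria listed above.

    version   -- mfakto version string, e.g. "0.08"
    kernel    -- kernel name from the result line, e.g. "mfakto_cl_71"
    start_bit -- starting bit level of the test (the x in "from 2^x to 2^y")
    """
    return (
        version in AFFECTED_VERSIONS
        and kernel == AFFECTED_KERNEL
        and start_bit < 48
    )


# The barrett result quoted earlier in the thread is not affected.
print(possibly_affected("0.08", "mfakto_cl_barrett79", 68))
```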

Last fiddled with by Bdot on 2011-09-27 at 07:44 Reason: Summary
Old 2011-09-27, 11:38   #137
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Liverpool (GMT/BST)

3²×5×7×19 Posts
Default

I am pretty certain all of the GIMPS candidates have been tested past 2^48 using Prime95 so this shouldn't be an issue. mfakto just didn't double check that range.
Old 2011-09-27, 14:18   #138
KingKurly
 
KingKurly's Avatar
 
Sep 2010
Annapolis, MD, USA

3³×7 Posts
Default

Quote:
Originally Posted by henryzz View Post
I am pretty certain all of the GIMPS candidates have been tested past 2^48 using Prime95 so this shouldn't be an issue. mfakto just didn't double check that range.
Correct, and since I haven't checked any exponents for factors below 2^48 using mfakto, I can confidently say that my results are unaffected. Still, good catch, and I look forward to 0.09 soon.
Old 2011-09-29, 22:10   #139
DigiK-oz
 
Jul 2008

2³·3 Posts
Default

Quote:
Originally Posted by Bdot View Post
They seem to have implemented some kind of busy-wait (futex-based) whenever something needs to be synchronized with the GPU. As this is usually the CPU just waiting for the GPU to complete something, that is a total waste of CPU resources.

However, mfakto is not hit that badly as mfakto passes the prepared factor candidates to the GPU but does not wait for the results immediately. Instead, the next block of factor candidates is prepared on the CPU. Only when the CPU is faster preparing the stuff than the GPU can process it, then mfakto will synchronize with the GPU. And of course at the end of a class.

So yes, mfakto will also consume a full CPU core, but it will do something useful most of that time.
Well, mfakto used to eat 2 full cores alongside the GPU (1 thread for mfakto itself, 1 thread somewhere in an ATI dll), but since the 11.9 drivers the only thread eating CPU is mfakto itself! So the guys at ATI seem to have fixed their drivers in that respect :)
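
The overlap described in the quote (prepare the next block on the CPU while the GPU works on the current one, and only wait when there is nothing left to prepare) can be sketched roughly like this. It is a toy model, not mfakto code: a single worker thread stands in for the GPU, and `prepare_block` and `gpu_process` are hypothetical placeholders for the sieving and the kernel launch.

```python
from concurrent.futures import ThreadPoolExecutor


def prepare_block(i):
    """Hypothetical stand-in for preparing a block of factor candidates on the CPU."""
    return list(range(i * 4, i * 4 + 4))


def gpu_process(block):
    """Hypothetical stand-in for the GPU kernel; here it just sums the block."""
    return sum(block)


def run(num_blocks):
    results = []
    # One worker thread plays the role of the GPU queue.
    with ThreadPoolExecutor(max_workers=1) as gpu:
        pending = None
        for i in range(num_blocks):
            # The CPU prepares the next block while the previous one is in flight.
            block = prepare_block(i)
            if pending is not None:
                # Synchronize only now, when the CPU has caught up with the GPU.
                results.append(pending.result())
            pending = gpu.submit(gpu_process, block)
        if pending is not None:
            # Final synchronization, comparable to the end of a class.
            results.append(pending.result())
    return results


print(run(3))
```

With real work on both sides, the only time the CPU sits idle is inside `pending.result()`, which is the point where a busy-waiting driver would burn a core.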
Old 2011-09-30, 05:39   #140
Samoflan
 
Jan 2010

5 Posts
Default

ATI drivers 11.9 seem to have increased the performance of mfakto 0.08 slightly, by almost 2% on my Radeon HD4870. CPU utilization is still the same on my Phenom II x4 955, at about 47% across all 4 cores. The video card seems to stay at a consistent 95% load now instead of fluctuating between 91% and 95%.
Old 2011-09-30, 05:55   #141
DigiK-oz
 
Jul 2008

18₁₆ Posts
Default

Strange, the 11.9 drivers brought down CPU usage on my i7 920 with hyperthreading from almost 25% (=2 cpus) to about 12% (=1 cpu)... With the 11.8 drivers, a thread in some ATI dll used 12%, as well as mfakto itself. With 11.9, the only thread using 12% is mfakto. The thread in the ATI dll is still there, but sits at 0.07% cpu. Performance has stayed about the same.

Anyone else seeing this behaviour?

Last fiddled with by DigiK-oz on 2011-09-30 at 05:58
Old 2011-09-30, 14:20   #142
jeebee
 
Sep 2011

2·3 Posts
Default

Quote:
Originally Posted by DigiK-oz View Post
Strange, the 11.9 drivers brought down CPU usage on my i7 920 with hyperthreading from almost 25% (=2 cpus) to about 12% (=1 cpu)... With the 11.8 drivers, a thread in some ATI dll used 12%, as well as mfakto itself. With 11.9, the only thread using 12% is mfakto. The thread in the ATI dll is still there, but sits at 0.07% cpu. Performance has stayed about the same.

Anyone else seeing this behaviour?

I have a similar experience. 11.9 seems like a major improvement upon 11.8. On a 2500k & HD6780, the old drivers needed 2 cores to output roughly 140mb/s. The newer driver delivers about 130mb/s with only one core running. I've thus decided to devote the third core to p95.
Old 2011-10-05, 21:07   #143
Bdot
 
Bdot's Avatar
 
Nov 2010
Germany

3×199 Posts
Default mfakto 0.09 - Windows versions

Quote:
Originally Posted by jeebee View Post
I have a similar experience. 11.9 seems like a major improvement upon 11.8. On a 2500k & HD6780, the old drivers needed 2 cores to output roughly 140mb/s. The newer driver delivers about 130mb/s with only one core running. I've thus decided to devote the third core to p95.
Yes, the upgrade to 11.9 is certainly recommended: CPU usage is much lower.

I think it is time to release the fix to the previously reported bug. I played around with a few ideas to fix the bug without affecting performance, but I did not have time to do it right. Therefore, the fixed 72-bit kernel of version 0.09 will be 3-5% slower than 0.08. The barrett kernels are not affected. I'm working on getting the same speed as before, but that will take some more time.

So here is version 0.09, first Windows ...
Attached Files
File Type: zip mfakto-0.09 - Win.zip (190.4 KB, 249 views)

Last fiddled with by Bdot on 2011-10-05 at 21:10 Reason: typo
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
