mersenneforum.org  

Old 2018-12-22, 23:39   #12
ewmayer
2ω=0
 
 
Sep 2002
República de California

5·2,351 Posts

Quote:
Originally Posted by chalsall View Post
Sweet! You are one of the most qualified to do this analysis. Sincerely.

Will you be taking into consideration the available compute, and its SP/DP ratios?

My main objective is to simply settle whether SP-based FFT-mul can be used at all for moduli of interest to GIMPS:

o If not, I will need to revisit my random-walk ROE heuristic to see where it falls short;

o If yes, and my heuristics re. FFT-length and modulus size are even close to what I observe in actual practice, an SP-based GPU LL test will be of immediate interest.

But for now, I suggest being pessimistic and assuming that Preda - the only person I know who has actually tried SP for such work - correctly concluded nonfeasibility for such an approach. At least that will be my attitude - expect the worst, but hope for a pleasant surprise.
Old 2018-12-23, 00:31   #13
penlu
 
Jul 2018

31 Posts

I have done only a very preliminary check: using the SP library FFT rather than the DP library FFT in CUDALucas. This failed to produce round-off errors below 0.5 for FFT lengths up to 30000K on exponents in the low 80M range. I am inexperienced in GPU programming and not familiar with numerical methods for Fourier transforms, so I gave up there; I don't know whether what I did failed for reasons other than insufficient precision.
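
To illustrate why the 0.5 round-off threshold matters, here is a minimal Python sketch, not CUDALucas code. It squares a vector of random balanced digits with a plain zero-padded radix-2 FFT (not the weighted transform real GIMPS clients use) and compares the result with an exact integer convolution. Single precision is only emulated, by rounding every intermediate through a 4-byte float with `struct`. The function names, the 256-word vector, and the word sizes of 16 and 4 bits are all assumptions chosen for illustration.

```python
import cmath
import random
import struct


def round_sp(z):
    """Emulate single precision: round both parts through a 4-byte float."""
    f32 = lambda x: struct.unpack("f", struct.pack("f", x))[0]
    return complex(f32(z.real), f32(z.imag))


def round_dp(z):
    """Python floats are already double precision, so leave the value alone."""
    return z


def fft(a, sign, rnd):
    """Recursive radix-2 FFT; every butterfly result is passed through rnd."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], sign, rnd)
    odd = fft(a[1::2], sign, rnd)
    out = [0j] * n
    for k in range(n // 2):
        w = rnd(cmath.exp(sign * 2j * cmath.pi * k / n))
        t = rnd(w * odd[k])
        out[k] = rnd(even[k] + t)
        out[k + n // 2] = rnd(even[k] - t)
    return out


def max_roundoff(n_words, bits, rnd, seed=1):
    """Square random balanced digits via FFT; return the largest absolute error."""
    rng = random.Random(seed)
    half = 1 << (bits - 1)
    x = [rng.randrange(-half, half) for _ in range(n_words)]
    # Exact acyclic self-convolution with Python integers, used as the reference.
    exact = [0] * (2 * n_words)
    for i, xi in enumerate(x):
        for j, xj in enumerate(x):
            exact[i + j] += xi * xj
    padded = [complex(v) for v in x] + [0j] * n_words
    spec = [rnd(v * v) for v in fft(padded, -1, rnd)]
    back = fft(spec, +1, rnd)
    approx = [v.real / (2 * n_words) for v in back]
    return max(abs(a - e) for a, e in zip(approx, exact))


err_dp_16 = max_roundoff(256, 16, round_dp)  # double precision, 16-bit words
err_sp_16 = max_roundoff(256, 16, round_sp)  # emulated single, 16-bit words
err_sp_4 = max_roundoff(256, 4, round_sp)    # emulated single, 4-bit words
print(err_dp_16, err_sp_16, err_sp_4)
```

Once the error exceeds 0.5, rounding to the nearest integer no longer recovers the exact product. In this toy setup the emulated single-precision run stays below that limit only with far fewer bits per word, which is why an SP transform needs a much longer FFT for the same exponent.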
Old 2018-12-23, 00:49   #14
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

10101100111100₂ Posts

Quote:
Originally Posted by penlu View Post
I don't know if what I did failed for reasons other than not having enough precision.

That's cool. All (or, at least, almost all) attempts are valuable.

Quite possibly this simple question from a new participant might result in an optimization. It's happened before....
Old 2018-12-23, 03:35   #15
Neutron3529
 
 
Dec 2018
China

2B₁₆ Posts

Quote:
Originally Posted by chalsall View Post
Short answer: Yes, it is possible.

Longer answer: It doesn't make sense. Many more OPs and/or memory needed.



We have some /very/ good GPU programmers here, writing code optimally for this very rarefied problem space.

If it was possible to improve the throughput using SP, they would have done it.

Thank you for your reply.
Although single precision will need more OPs and/or memory, the GPU I use has 6 GB of memory and a 32:1 SP:DP speed ratio, which seems enough for a single-precision LL test.
That's why I posted this thread.
Old 2018-12-23, 03:43   #16
Neutron3529
 
 
Dec 2018
China

43₁₀ Posts

Quote:
Originally Posted by kriesel View Post
Welcome.
Your English seems nearly perfect.
Possibly you'll find the attachment in the second post of https://www.mersenneforum.org/showthread.php?t=23371 useful.
Your GTX1060 can run LL (CUDALucas), P-1 (CUDAPm1), or trial factoring (mfaktc). Its contribution would probably be maximized by running trial factoring.
Unfortunately there is no CUDA PRP code for mersenne hunting currently.

Thank you for your reply.
Actually, I have already been running mfaktc in the mornings (the fan is too noisy at night) and got a good result:
Manual testing | 190817147 | F | 2018-12-22 10:02 | 0.3 | Factor: 3798244479194871411047 / TF: 71-72
The problem is that I cannot rely on the results it produces.
mfaktc reports only whether or not a number has a factor. If it reports "yes", that is easy to check, but if it reports "no", it just reports "no" and gives me no further information.
It is very hard to check whether the "NF" result is reliable, and I just want to generate a reliable (at least checkable) result.
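
The "easy to check" case can be sketched in a few lines of Python: q divides 2^p - 1 exactly when 2^p mod q equals 1, and any factor of a Mersenne number with odd prime exponent p must also satisfy q ≡ 1 (mod 2p) and q ≡ ±1 (mod 8). The function names below are hypothetical, and reading the exponent as 190817147 from the results line above is an assumption about how that line splits into fields.

```python
def is_mersenne_factor(p, q):
    """Return True if q divides 2**p - 1, using modular exponentiation."""
    return pow(2, p, q) == 1


def passes_form_checks(p, q):
    """Necessary (not sufficient) conditions for q to divide 2**p - 1, p an odd prime."""
    return q % (2 * p) == 1 and q % 8 in (1, 7)


# Small known cases: 2**11 - 1 = 23 * 89, and 233 divides 2**29 - 1.
print(is_mersenne_factor(11, 23), is_mersenne_factor(11, 89))
print(is_mersenne_factor(29, 233), passes_form_checks(29, 233))

# The factor reported in the post (exponent value is an assumption, see above).
p, q = 190817147, 3798244479194871411047
print(is_mersenne_factor(p, q), passes_form_checks(p, q))
```

A "no factor" result has no comparable certificate, which is the asymmetry the post describes: a negative can only be confirmed by repeating the work.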
Old 2018-12-23, 04:20   #17
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

2²×2,767 Posts

Quote:
Originally Posted by Neutron3529 View Post
Although single precision will need more OPs and/or memory, the GPU I use has 6 GB of memory and a 32:1 SP:DP speed ratio, which seems enough for a single-precision LL test.

Sigh...

Fortunately some really smart people have taken on your challenge, and we might find out soon if this is even possible, let alone worthwhile.

But, FYI, "seems enough" is rarely enough around these here parts....
Old 2018-12-23, 04:47   #18
axn
 
 
Jun 2003

2²·3²·151 Posts

Quote:
Originally Posted by ewmayer View Post
Flipping things around and asking what SP FFT length is needed to handle current-wavefront exponents gives 24576K = 24M, slightly above 5x the DP FFT length. So being pessimistic, on hardware where there is, say, a 10x or more per-cycle difference between SP and DP throughput, SP could well be a win.

A 5x FFT size means 2.5x memory usage, since each SP element takes half the bytes of a DP element. There are serious indications on non-DP-crippled GPUs that LL tests are severely memory-bottlenecked. Increasing the memory usage by 2.5x would just exacerbate the situation, even if theoretically the GPU could otherwise finish the computation sequence faster. On the flip side, this means that smaller FFTs (say < 1M) might benefit more from this, which might be useful for things like LLR, where there are a lot of projects (the Top 5000 entry point is around 1.4 Mbits).

Last fiddled with by axn on 2018-12-23 at 04:48
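
The memory arithmetic above can be written out as a short Python sketch. It assumes 4-byte SP and 8-byte DP elements and counts only the FFT data array itself, ignoring twiddle tables and other buffers; the function names are hypothetical.

```python
SP_BYTES = 4  # bytes per single-precision element
DP_BYTES = 8  # bytes per double-precision element


def memory_ratio(length_ratio):
    """SP-to-DP memory ratio when the SP FFT is length_ratio times longer."""
    return length_ratio * SP_BYTES / DP_BYTES


def fft_mebibytes(length_k, bytes_per_element):
    """Size in MiB of an FFT data array whose length is given in K (1024) elements."""
    return length_k * 1024 * bytes_per_element / 2**20


print(memory_ratio(5.0))               # the 5x length quoted above
print(fft_mebibytes(24576, SP_BYTES))  # the 24576K SP length quoted above
```

Every element has to move through memory on each pass, so on a memory-bound GPU this ratio, not the 32:1 arithmetic ratio, may end up setting the speed.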
Old 2018-12-23, 05:40   #19
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013

6017₈ Posts

Quote:
Originally Posted by Neutron3529 View Post
Thank you for your reply.
Actually, I have already been running mfaktc in the mornings (the fan is too noisy at night) and got a good result:
Manual testing | 190817147 | F | 2018-12-22 10:02 | 0.3 | Factor: 3798244479194871411047 / TF: 71-72
The problem is that I cannot rely on the results it produces.
mfaktc reports only whether or not a number has a factor. If it reports "yes", that is easy to check, but if it reports "no", it just reports "no" and gives me no further information.
It is very hard to check whether the "NF" result is reliable, and I just want to generate a reliable (at least checkable) result.

Run the self tests if you are concerned.

You're right in that it's possible bad/overclocked hardware may have errors. The only way to tell is to look at how many factors you find over time, which should be a little more than 1 in 100 attempts.
Old 2018-12-23, 08:33   #20
Neutron3529
 
 
Dec 2018
China

43₁₀ Posts

Quote:
Originally Posted by Mark Rose View Post
Run the self tests if you are concerned.

You're right in that it's possible bad/overclocked hardware may have errors. The only way to tell is to look at how many factors you find over time, which should be a little more than 1 in 100 attempts.

That may not be true.
I found 5 factors in 318 attempts. The secret to finding a factor is to try the lowest possible factor range: for exponents with "no factor from 2^64 to 2^75", a factor in the range 2^75 to 2^76 seems less likely.
This implies that I might run mfaktc for a whole day and find no factor, which would be so disappointing that I really do not want to see it.
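
Both the "a little more than 1 in 100" figure and the observation that higher bit levels find fewer factors fit the commonly cited GIMPS heuristic that the chance of a factor between 2^b and 2^(b+1) is roughly 1/b. A minimal Python sketch of that heuristic follows. The function names are hypothetical, placing all 318 attempts at bit level 71 is an assumption (the thread does not give the levels), and the 50 attempts per day is an invented figure for illustration.

```python
def factor_chance(bit_level):
    """Heuristic chance of a factor between 2**bit_level and 2**(bit_level + 1)."""
    return 1.0 / bit_level


def expected_factors(attempts, bit_level):
    """Expected number of factors found in the given number of attempts."""
    return attempts * factor_chance(bit_level)


def chance_of_no_factor(attempts, bit_level):
    """Chance that every attempt comes back 'no factor', assuming independence."""
    return (1.0 - factor_chance(bit_level)) ** attempts


print(expected_factors(318, 71))             # compare with the 5 factors reported
print(factor_chance(71), factor_chance(75))  # higher bit levels are less productive
print(chance_of_no_factor(50, 75))           # 50 attempts in a day is hypothetical
```

Under these assumptions, a day with no factors would be fairly ordinary, so one empty day by itself says little about whether the hardware is reliable.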
Old 2018-12-23, 08:43   #21
Neutron3529
 
 
Dec 2018
China

43 Posts

Quote:
Originally Posted by chalsall View Post
Sigh...

Fortunately some really smart people have taken on your challenge, and we might find out soon if this is even possible, let alone worthwhile.

But, FYI, "seems enough" is rarely enough around these here parts....

God bless the 32:1 SP:DP ratio works...


BTW, my poor English does not tell me how a non-Christian (like me) should express the previous sentence correctly.

At first I chose "may", borrowing from "may the force be with you" to create a sentence like "May the 32:1 SP:DP ratio works."

But I suspect it could be confused with sentences such as:
"May the 32:1 SP:DP ratio works?"
or
"Will the 32:1 SP:DP ratio works?"
Such sentences are more likely to be read as questions, unlike the "may" in "may the force be with you".

I would like to know: will native English speakers be confused by the sentence "May the 32:1 SP:DP ratio works"?
And, what's more, how can a non-Christian (like me) express the previous sentence correctly?
Old 2018-12-23, 09:40   #22
Nick
 
 
Dec 2012
The Netherlands

709₁₆ Posts

Quote:
Originally Posted by Neutron3529 View Post
I would like to know: will native English speakers be confused by the sentence "May the 32:1 SP:DP ratio works"?
And, what's more, how can a non-Christian (like me) express the previous sentence correctly?

I would say "May the 32:1 SP:DP ratio work!"
or "I hope the 32:1 SP:DP ratio works!".

All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
