mersenneforum.org  

2019-01-17, 12:05   #23
preda
"Mihai Preda"
Apr 2015
2²×3×11² Posts

Quote:
Originally Posted by SELROC
I keep putting off buying a new, more powerful GPU. Do you have any plans to optimize gpuowl for large numbers?
I don't have any clear optimization ideas at this stage (aside from going down the hand-assembly path, which is not realistic for me because it's a lot of work).

What large numbers do you have in mind? Do you have any specific optimizations in mind?
2019-01-17, 12:13   #24
SELROC
10001001100₂ Posts

Quote:
Originally Posted by preda
I don't have any clear optimization ideas at this stage (aside from going down the hand-assembly path, which is not realistic for me because it's a lot of work).

What large numbers do you have in mind? Do you have any specific optimizations in mind?

The 300M to 500M exponents.


A 332M exponent took 2 months of GPU work on the RX 580.
https://www.mersenne.org/report_expo...2412937&full=1


As a side note, it seems it is now assigned to someone else.
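
For a sense of scale, the two-month figure can be turned into a per-iteration time. This is only a rough sketch under assumptions that are not in the post: a primality test of 2^p − 1 needs about p modular squarings, the exponent is rounded to 332M, and "2 months" is taken as 60 days of uninterrupted running.

```python
# Rough per-iteration time implied by "a 332M exponent took 2 months".
# Assumptions (not from the post): about p squarings per test, 60 days of
# continuous running, exponent rounded to 332e6.

SECONDS_PER_DAY = 86_400


def ms_per_iteration(exponent: float, days: float) -> float:
    """Average milliseconds per squaring for a test of `exponent` iterations."""
    return days * SECONDS_PER_DAY / exponent * 1_000


def days_for_exponent(exponent: float, ms_per_iter: float) -> float:
    """Inverse: wall-clock days at a given per-iteration time."""
    return exponent * ms_per_iter / 1_000 / SECONDS_PER_DAY


if __name__ == "__main__":
    t = ms_per_iteration(332e6, 60)
    print(f"{t:.1f} ms/iteration")
    # Optimistic lower bound for a 500M exponent at the same per-iteration time.
    print(f"{days_for_exponent(500e6, t):.0f} days")
```

Under those assumptions this comes to roughly 15.6 ms per iteration. The 500M estimate is a lower bound, since larger exponents need a larger FFT and each iteration gets slower.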
2019-01-17, 19:57   #25
nomead
"Sam Laur"
Dec 2018
Turku, Finland
100111101₂ Posts

https://www.pcgamer.com/amd-scoffs-a...short-supply//

It's truly silly season now...
2019-01-18, 16:08   #26
SELROC
C45₁₆ Posts

Quote:
Originally Posted by M344587487
Quote:
Originally Posted by preda
2x would be amazing. In practice I would be very happy if I see a 50% speedup.

About memory, it is my impression that the latency did not improve much, but the bandwidth doubled. To take advantage of this, better occupancy would be required (double the number of memory operations in flight), and this is not easily achievable because of other limiting resources: LDS memory and the number of registers (VGPRs), which I guess remain unchanged.
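
The occupancy point is an instance of Little's law: data in flight equals bandwidth times latency. A minimal sketch with assumed numbers follows; the 500 ns latency, the 64-byte request size, and the 500 and 1000 GB/s bandwidths are placeholders standing in for "B" and "2B", not measured figures for any card.

```python
def requests_in_flight(bandwidth_gb_s: float, latency_ns: float,
                       request_bytes: int = 64) -> float:
    """Little's law: concurrent requests needed to keep the memory bus busy.

    GB/s times ns gives bytes directly (1e9 * 1e-9 = 1), so no extra
    unit conversion is needed before dividing by the request size.
    """
    return bandwidth_gb_s * latency_ns / request_bytes


# Hypothetical values: same latency, doubled bandwidth.
old = requests_in_flight(500, 500)
new = requests_in_flight(1000, 500)
print(old, new, new / old)
```

With latency unchanged, doubling the bandwidth doubles the number of requests that must be outstanding, which is why the fixed LDS and VGPR budgets become the constraint.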

About compute, the parts that aren't DP (e.g. pointer arithmetic and other integer work such as carry and logic) remain unchanged, and this will reduce the observed speedup.
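
This is the familiar Amdahl's-law effect. A small sketch, assuming the runtime splits cleanly into a DP part that gets faster and a remainder that does not; the DP fractions below are hypothetical, and the model ignores memory-bound time entirely.

```python
def overall_speedup(dp_fraction: float, dp_speedup: float) -> float:
    """Amdahl's law: only `dp_fraction` of the original runtime gets faster."""
    if not 0.0 <= dp_fraction <= 1.0:
        raise ValueError("dp_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - dp_fraction) + dp_fraction / dp_speedup)


# Hypothetical DP shares of the runtime, with DP throughput doubled.
for fraction in (1.0, 0.8, 2 / 3, 0.6):
    print(f"{fraction:.2f} -> {overall_speedup(fraction, 2.0):.2f}x")
```

In this model a 2x gain in DP throughput gives a 50% overall speedup when DP work is about two thirds of the original runtime.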

IMO another limiting factor for GCN performance is still the compiler, after so many years: the compiler does a rather poor job at generating highly efficient code (not an easy task I agree).

OTOH the better cooling will help, and allow the card to be clocked higher without thermal throttling (which is a problem on the Vega 64 blower cooler).

https://www.phoronix.com/scan.php?pa...se-AMD-GPU-TDP
2019-01-23, 03:03   #27
Mark Rose
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
2⁴·199 Posts

.

Last fiddled with by Mark Rose on 2019-01-23 at 03:04
2019-02-07, 14:28   #28
mackerel
Feb 2016
UK
2⁶×7 Posts

Haven't had time to dig in yet, but I saw at AnandTech that AMD changed their minds yet again and FP64 is now 1/4 rate.

https://www.anandtech.com/show/13923...eon-vii-review
2019-02-07, 14:32   #29
Mark Rose
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
2⁴×199 Posts

I just came here to post that

Quote:
The Radeon VII graphics card was created for gamers and creators, enthusiasts and early adopters. Given the broader market Radeon VII is targeting, we were considering different levels of FP64 performance. We previously communicated that Radeon VII provides 0.88 TFLOPS (DP=1/16 SP). However based on customer interest and feedback we wanted to let you know that we have decided to increase double precision compute performance to 3.52 TFLOPS (DP=1/4SP).
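
The quoted figures are mutually consistent, which a quick calculation shows. This sketch derives the implied single-precision peak from the 0.88 TFLOPS at 1/16 rate given in the statement and rescales it; the function name is purely illustrative.

```python
def dp_tflops(sp_tflops: float, dp_ratio: float) -> float:
    """Peak double-precision rate for a given DP:SP ratio."""
    return sp_tflops * dp_ratio


# SP peak implied by the quoted 0.88 TFLOPS at DP = 1/16 SP.
SP_PEAK = 0.88 * 16

for label, ratio in (("1/16", 1 / 16), ("1/8", 1 / 8), ("1/4", 1 / 4)):
    print(f"DP = {label} SP -> {dp_tflops(SP_PEAK, ratio):.2f} TFLOPS")
```

The implied SP peak is 14.08 TFLOPS, giving 0.88, 1.76 and 3.52 TFLOPS at the three ratios.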
2019-02-07, 15:13   #30
M344587487
"Composite as Heck"
Oct 2017
3B6₁₆ Posts

Quote:
Originally Posted by Mark Rose
I just came here to post that
Just to make sure all the bases are covered: this reviewer says that FP64 is ~1.7 TFLOPS (i.e. DP = 1/8 SP) in his gaming review, although he's probably parroting old information: https://www.youtube.com/watch?v=6jP3tetYnVI


I was expecting £700 but it's £650 in the UK. Tried to buy one but my bank decided I was trying to steal from myself and now they're out of stock so that's nice.
2019-02-07, 15:31   #31
mackerel
Feb 2016
UK
448₁₀ Posts

I saw it in stock at Scan at £650, but while I was looking at other stores it sold out. Some places are indicating stock arriving tomorrow, so I assume shipments are ongoing. OCUK have some in stock at £800. I don't want it enough to pay a £150 premium for what looks identical to the £650 ones.

This could be the GPU that makes the largest known prime not a Mersenne. Over at PrimeGrid they just started the "do you feel lucky" project, which runs GFN22 at a high enough level to exceed the largest known prime. The fastest GPUs so far are doing about one unit a day and the code is FP64. If this card could do several units a day, that would help a lot.

Edit: forget that last part. It has just been pointed out to me that that specific project can't use FP64. Regular GFN21/22 could still see a significant benefit.

Last fiddled with by mackerel on 2019-02-07 at 16:14 Reason: updated info
2019-02-07, 16:14   #32
nomead
"Sam Laur"
Dec 2018
Turku, Finland
317 Posts

Interesting. How late in the product cycle can they make these decisions on how much FP64 to include? Is it configuration fuses on the die, a microcode update, a driver limitation, or what? And of course... could it be hacked afterwards?
2019-02-07, 16:26   #33
tServo
"Marv"
May 2009
near the Tannhäuser Gate
3×269 Posts

I'll buy one (eventually) to test, but three thoughts:

(1) The board is going to be difficult to "live with": it has high power requirements and blasts lots of heat IN THE CASE. Reviewers have noted its fans are obnoxiously loud.

(2) It's impossible to get, of course. It will be interesting to see how soon AMD can alleviate this situation. Is this a result of poor 7nm yields?

(3) With such impressive specs, I would have thought that it would absolutely CRUSH other boards in toe-to-toe comparison tests. It wins a lot, but not as often and not by as much as I expected.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
