#23
Undefined
"The unspeakable one"
Jun 2006
My evil lair
6,793 Posts
Quote:

#24
Jan 2008
France
3·199 Posts
Well yes, but at least I did post some info about ARM servers. Conversation closed.

#25
Jan 2008
France
3·199 Posts
TheNextPlatform: Arm’s Chances In Servers May Hinge On Success In HPC

#26
Jan 2008
France
3·199 Posts
Another article from TheNextPlatform: Growing Up In An HPC World.

#27
Jan 2008
France
255₁₆ Posts
Slides about upcoming Fujitsu chip with SVE: http://www.ssken.gr.jp/MAINSITE/even...PCF_shinjo.pdf
It's in Japanese but with enough English to get a few things. Impressive beast.

#28
∂²ω=0
Sep 2002
República de California
26754₈ Posts
Quote:
At some point I'll surely need to do both 256-bit and 512-bit ARMv8 coding, but said SIMD widths would need to appear in volume in some consumer-market form (e.g. smartphones) to make the effort worthwhile.

#29
Jan 2008
France
597₁₀ Posts
Quote:
Quote:
Quote:
Quote:
I think this document explains things nicely. But that's perhaps me being naive (and underestimating your knowledge of SVE, sorry), and you still might care a lot due to FFT structure. At the very least, the instructions would be the same (contrary to AVX2 vs AVX-512). I know you don't have a lot of free time, but if you ever want to start playing with SVE, you can pick a recent ARM cross-compiler and QEMU emulator (the SVE support in QEMU is validated).
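To make the "same instructions at every width" point concrete, here is a minimal sketch of a vector-length-agnostic loop using the SVE ACLE intrinsics from `<arm_sve.h>`. The function name `mul_arrays` and the scalar fallback are my own illustration, not anything from Mlucas; the fallback is only there so the file also builds on machines without SVE. On an SVE target it would be built with something like `-march=armv8-a+sve` on a compiler that ships the intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Elementwise product out[i] = a[i] * b[i].
 * Hypothetical helper: nothing here hard-codes a vector width. */
void mul_arrays(double *out, const double *a, const double *b, int64_t n)
{
    assert(n >= 0);
#if defined(__ARM_FEATURE_SVE)
    /* svcntd() returns the number of 64-bit lanes per vector on the
     * hardware the code is running on (2 for 128-bit, 8 for 512-bit). */
    for (int64_t i = 0; i < n; i += (int64_t)svcntd()) {
        /* Predicate: lane k is active iff i + k < n, so the loop tail
         * needs no separate scalar cleanup. */
        svbool_t pg = svwhilelt_b64(i, n);
        svfloat64_t va = svld1_f64(pg, a + i);
        svfloat64_t vb = svld1_f64(pg, b + i);
        svst1_f64(pg, out + i, svmul_f64_x(pg, va, vb));
    }
#else
    /* Portable fallback so the sketch compiles without SVE. */
    for (int64_t i = 0; i < n; i++) {
        out[i] = a[i] * b[i];
    }
#endif
}
```

The same binary should then run unchanged on 128-, 256- or 512-bit SVE hardware, which is the contrast with AVX2 vs AVX-512. It says nothing about data layouts that depend on the width, which is a separate question.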

#30
∂²ω=0
Sep 2002
República de California
2DEC₁₆ Posts
Quote:
Quote:
1. Literal byte address offsets in asm instructions - this could surely be parameterized, say via a literal byte argument to the asm macros whose value is set at build time;

2. The FFT data are arranged in memory in a SIMD-width-dependent fashion, e.g. for 256-bit SIMD we use quartets of doubles, whereby 4 complex [re,im] double-pairs are stored as [0.re,1.re,2.re,3.re],[0.im,1.im,2.im,3.im]. In a typical FFT step we butterfly several such data segments from disjoint (wide-stride-separated) portions of the big data array, call them segments A,B,C,D,... . There are 2 points in each FFT-convolution step - one bracketing the dyadic-mul step between the forward and inverse FFT, and another bracketing the round-and-carry step - where we need to transpose such data, e.g. in our 256-bit-double-quartets example, we need to take

[A0.re,A1.re,A2.re,A3.re]
[B0.re,B1.re,B2.re,B3.re]
[C0.re,C1.re,C2.re,C3.re]
[D0.re,D1.re,D2.re,D3.re]

and transpose those (and similarly for the im-parts) to

[A0.re,B0.re,C0.re,D0.re]
[A1.re,B1.re,C1.re,D1.re]
[A2.re,B2.re,C2.re,D2.re]
[A3.re,B3.re,C3.re,D3.re]

Such transposes unavoidably involve data-width-dependent shuffle/permute instructions. In an SVE-style paradigm, the way to handle this would seem to be to break the transpose work out of the macros where it currently is combined with other non-transpose operations, thus minimizing the amount of asm code with data-width-dependent instructions.

Quote:
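For readers who want to see the layout and the transpose spelled out, here is a plain-C reference sketch. It is scalar on purpose: it only pins down which element goes where, and is not the shuffle/permute code a real SIMD build would use. `SIMD_DOUBLES` and `transpose_segments` are hypothetical names chosen for this illustration; the width is a build-time macro, in the spirit of point 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time knob: doubles per SIMD register
 * (4 for the 256-bit example above). */
#ifndef SIMD_DOUBLES
#define SIMD_DOUBLES 4
#endif

/* Each segment (A, B, C, D, ...) is stored as
 *   [0.re, 1.re, ..., (W-1).re], [0.im, 1.im, ..., (W-1).im]
 * with W = SIMD_DOUBLES. The segments live in disjoint parts of the
 * big array, so they are passed as separate pointers.
 *
 * On return, re_t[k] = [A_k.re, B_k.re, C_k.re, ...] and likewise
 * for im_t, i.e. the transposed arrangement. */
void transpose_segments(const double *seg[SIMD_DOUBLES],
                        double re_t[SIMD_DOUBLES][SIMD_DOUBLES],
                        double im_t[SIMD_DOUBLES][SIMD_DOUBLES])
{
    for (size_t s = 0; s < SIMD_DOUBLES; s++) {
        assert(seg[s] != NULL);
        const double *re = seg[s];                /* re-block of segment s */
        const double *im = seg[s] + SIMD_DOUBLES; /* im-block follows it   */
        for (size_t k = 0; k < SIMD_DOUBLES; k++) {
            re_t[k][s] = re[k];
            im_t[k][s] = im[k];
        }
    }
}
```

Keeping the transpose in its own routine like this mirrors the suggestion to split it out of the combined macros: only this one piece would need a width-specific implementation.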

#31
Jan 2008
France
3·199 Posts
Thanks for the explanation!
I took a look at SVE support in gcc. Right now the intrinsics are not there; they're coming in version 10. I'm not sure if you need them.

#32
Jan 2008
France
3·199 Posts
A Fujitsu A64FX system (with SVE) tops the Green500 and ranks 159th on the Top500. It achieves 85% of the theoretical peak TFLOPS on LINPACK.
Last fiddled with by ldesnogu on 2019-11-19 at 07:54

#33
∂²ω=0
Sep 2002
República de California
26754₈ Posts
I don't use intrinsics myself; I prefer to work 'close to the metal'. So the thing I need is inline-asm support from the compiler/assembler.
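As a rough idea of what the build-time byte-offset parameterization mentioned earlier in the thread could look like with GCC-style inline asm, here is a small sketch. Everything named here (`SIMD_DOUBLES`, `IM_OFFSET_BYTES`, `load_first_im`) is hypothetical and not taken from any existing code; the asm path is only compiled on AArch64, and other targets get a plain-C fallback that does the same load.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time knob: doubles per SIMD register. */
#ifndef SIMD_DOUBLES
#define SIMD_DOUBLES 4
#endif

/* Byte offset from a segment's re-block to its im-block. It is a
 * compile-time constant, so it can be fed to the asm as an immediate. */
#define IM_OFFSET_BYTES ((int)(SIMD_DOUBLES * sizeof(double)))

/* Load the first im-part of a segment laid out as [re x W][im x W]. */
double load_first_im(const double *seg)
{
    double v;
    assert(seg != NULL);
#if defined(__aarch64__) && defined(__GNUC__)
    /* %d0 prints operand 0 as a d-register; the "i" constraint makes
     * operand 2 an immediate, so the literal offset in the instruction
     * comes from the macro rather than being typed in by hand. */
    __asm__ volatile("ldr %d0, [%1, %2]"
                     : "=w"(v)
                     : "r"(seg), "i"(IM_OFFSET_BYTES)
                     : "memory");
#else
    /* Portable fallback: same load expressed in C. */
    v = *(const double *)((const char *)seg + IM_OFFSET_BYTES);
#endif
    return v;
}
```

Rebuilding with a different `-DSIMD_DOUBLES=...` changes the immediate without touching the asm text, which is roughly the idea, though real macros with many offsets would need more bookkeeping than this.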