mersenneforum.org  

2003-12-07, 17:48   #1
Dresdenboy

Apr 2003
Berlin, Germany

19² Posts
Use of large memory pages possible with newer Linux kernels

http://kerneltrap.org/node/view/418

That sounds very interesting! And it offers new possibilities for efficient FFT implementations.
2003-12-08, 12:53   #2
Dresdenboy

Apr 2003
Berlin, Germany

361₁₀ Posts

OK, some more details about the use of these large pages:

Oracle managed to get an 8% speedup by using large pages. Although I have little experience in this area, I think the speedup for FFTs will be much larger, because:
  • Even if the data is already in the L1 cache, access time can increase when the memory addresses of that data are spread over many memory pages.
  • The limited number of TLB entries forces fine-tuning of FFT algorithms to avoid TLB thrashing as much as possible, but this avoidance can itself lead to less efficient algorithms.
  • Why is it so hard for large FFTs to come even close to the MFLOPS of FFTs running completely inside the L1 (or L2) cache, in times of memory prefetching?
  • I need at least 2 memory read/write passes to do a large FFT, but today's maximum transfer rate for P4/Opteron/AFX systems (6.4 GB/s, i.e. reading the 1024k FFT data set up to 750 times per second) is hardly reachable because throughput drops significantly for large strides.
I roughly estimate that a speedup of at least 10-30% could be possible.
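
To put rough numbers on the points above, here is a small back-of-envelope sketch. It is only an illustration: the 8 bytes per value, the 4 KiB and 2 MiB page sizes, the 64-entry TLB and the 8 KiB stride are assumptions chosen for the example, not figures from the post or from any particular CPU. Only the 1024k FFT length and the 6.4 GB/s rate come from the text.

```python
# Back-of-envelope figures for the 1024k FFT mentioned above.
FFT_LENGTH = 1024 * 1024        # "1024k" points, from the post
BYTES_PER_VALUE = 8             # assumption: one double per point
BANDWIDTH = 6.4e9               # 6.4 GB/s peak, from the post
SMALL_PAGE = 4 * 1024           # assumption: 4 KiB base pages
LARGE_PAGE = 2 * 1024 * 1024    # assumption: 2 MiB large pages
TLB_ENTRIES = 64                # hypothetical data-TLB size


def data_set_bytes(length=FFT_LENGTH, width=BYTES_PER_VALUE):
    """Size of the FFT data set in bytes."""
    return length * width


def sweeps_per_second(bandwidth=BANDWIDTH):
    """How often the whole data set could be read at peak bandwidth."""
    return bandwidth / data_set_bytes()


def pages_needed(page_size):
    """Pages (and thus TLB entries) needed to map the whole data set."""
    return -(-data_set_bytes() // page_size)  # ceiling division


def pages_touched(stride_bytes, accesses, page_size):
    """Distinct pages hit by `accesses` loads spaced `stride_bytes` apart."""
    return len({(i * stride_bytes) // page_size for i in range(accesses)})


if __name__ == "__main__":
    print(f"data set: {data_set_bytes() / 2**20:.0f} MiB")
    print(f"full sweeps per second at peak: {sweeps_per_second():.0f}")
    for name, size in (("4 KiB", SMALL_PAGE), ("2 MiB", LARGE_PAGE)):
        print(f"{name} pages: {pages_needed(size)} to map the data, "
              f"{pages_touched(8192, 1024, size)} touched by 1024 loads "
              f"at an 8 KiB stride, TLB reach {TLB_ENTRIES * size // 1024} KiB")
```

Under these assumptions a single strided pass touches far more small pages than the hypothetical TLB has entries, while the same pass stays within a handful of large pages. That is the effect the list above is pointing at; how much it matters on real hardware would have to be measured.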

Older discussions regarding this topic can be found here:
http://www.mail-archive.com/mersenne.../msg07117.html
2003-12-08, 14:26   #3
Dresdenboy

Apr 2003
Berlin, Germany

101101001₂ Posts

Here is a paper which gives some numbers for speedups achieved by using "superpages": http://portal.acm.org/citation.cfm?i...l=GUIDE&dl=ACM

(I think the PDF can only be seen by people who have a subscription.)

Here are some results from memory-intensive benchmarks, picking some impressive examples (shown together with the average speedups, because in most cases the speed increase is <10%):

Code:
SPEC benchmark    Speedup by using superpages

vpr                      38.3%
mcf                      67.6%
vortex                   11.2%
bzip2                    14.0%
average for SPECint      11.2%

galgel                   28.9%
art                      12.2%
lucas                    28.0% :cool:
apsi                     82.7%
average for SPECfp       11.0%


and some non-SPEC benchmarks:
FFTW                     54.9% :grin:
Matrix                  654.6% :shock:
They implemented it in FreeBSD on the Alpha CPU.

I think these numbers are enough to justify GIMPS-related research in this area, especially as large pages are available in Linux kernel versions 2.5.36+ and 2.6. For first tests it would only need a change to the memory allocation and release system calls.
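
As a rough illustration of what "changing the allocation call" could look like, here is a minimal sketch that asks Linux for a large-page-backed anonymous mapping and falls back to ordinary pages when that fails. Several things here are assumptions: it uses the `MAP_HUGETLB` mmap flag, which is a later interface (Linux 2.6.32) than the one in the kernels mentioned above; the 2 MiB page size and the raw flag value `0x40000` (used only if Python's `mmap` module does not export the constant) are typical for x86 Linux but not universal; and `alloc_fft_buffer` is a hypothetical helper name, not part of any GIMPS program.

```python
import mmap
import sys

HUGE_PAGE_SIZE = 2 * 1024 * 1024  # assumption: 2 MiB large pages
# Assumption: 0x40000 is MAP_HUGETLB on x86 Linux; used only when the
# mmap module does not provide the constant itself.
MAP_HUGETLB = getattr(mmap, "MAP_HUGETLB", 0x40000)


def round_up(nbytes, page):
    """Round nbytes up to a whole number of pages."""
    return -(-nbytes // page) * page


def alloc_fft_buffer(nbytes):
    """Return (buffer, used_large_pages).

    Tries a large-page mapping first; if the kernel refuses (for example
    because no large pages are reserved), falls back to normal pages.
    """
    length = round_up(nbytes, HUGE_PAGE_SIZE)
    if sys.platform.startswith("linux"):
        try:
            flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS | MAP_HUGETLB
            return mmap.mmap(-1, length, flags=flags), True
        except (OSError, ValueError):
            pass  # no large pages available: use the ordinary path
    return mmap.mmap(-1, length), False


if __name__ == "__main__":
    buf, large = alloc_fft_buffer(8 * 1024 * 1024)  # 8 MiB of FFT data
    print("large pages" if large else "normal pages", len(buf), "bytes")
    buf.close()  # releasing the buffer unmaps the memory
```

The fallback matters because large pages normally have to be reserved by the administrator beforehand, so a program cannot rely on the request succeeding. The rest of the FFT code would not need to know which kind of memory it received.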


Last fiddled with by Dresdenboy on 2003-12-08 at 14:33
2003-12-08, 14:47   #4
Dresdenboy

Apr 2003
Berlin, Germany

169₁₆ Posts

Here the paper (and other material) is accessible to everyone:

http://www.cs.rice.edu/~jnavarro/superpages/
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
