#12 |
∂²ω=0
Sep 2002
República de California
10110111101100₂ Posts |
You might simply first fill each thread's mem-address buffer with a large number of random addresses [properly constrained to lie within the proper mem-chunk and 32-bit aligned, obviously], to obviate the what-is-the-optimal-buffer-size-before-doing-batch-of-reads optimization issue. Definitely curious to see your resulting numbers, in any event.
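A minimal sketch of that suggestion, assuming each thread owns one contiguous chunk and stores byte offsets in its buffer. The names `fill_addr_buffer` and `xorshift64`, and the choice of a xorshift generator, are hypothetical and not taken from the code under discussion:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* xorshift64 PRNG (an illustrative choice); the state must be nonzero. */
static uint64_t xorshift64(uint64_t *state) {
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *state = x;
}

/* Fill addr[0..n) with random byte offsets that lie inside
 * [chunk_base, chunk_base + chunk_bytes) and are multiples of 4,
 * i.e. 32-bit aligned. Assumes chunk_base itself is 4-byte aligned. */
static void fill_addr_buffer(uint64_t *addr, size_t n,
                             uint64_t chunk_base, uint64_t chunk_bytes,
                             uint64_t seed) {
    assert(chunk_base % 4 == 0 && chunk_bytes >= 4);
    uint64_t words = chunk_bytes / 4;   /* number of 32-bit words in the chunk */
    uint64_t state = seed ? seed : 1;   /* avoid the all-zero state */
    for (size_t i = 0; i < n; i++) {
        /* pick a word index, then scale back to a byte offset */
        addr[i] = chunk_base + (xorshift64(&state) % words) * 4;
    }
}
```

Giving each thread its own seed and chunk base keeps the buffers independent. The modulo introduces a slight bias when the word count is not a power of two, which should not matter for a bandwidth test.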
#13 | |
"David"
Jul 2015
Ohio
517₁₀ Posts |
Quote:
Running four threads on the two-core + HT system, I was able to achieve a peak equivalent random read and sum rate of 97937 MB/s (up from 11,000 MB/s threaded random!). This test used a bin granularity of 4096KB with 128 slots in each bin.

My next step is to iterate over the bin/slot variations. So far (unique index bins are per thread):

- 1024KB pages with 128 slots (1GB index): 51590 MB/s
- 512KB pages with 64 slots (1GB index): 74669 MB/s
- 256KB (L2 cache) pages with 32 slots (1GB index): 88318 MB/s // Max slots I could go without swapping on 16GB system
- 227KB (L2 cache with space) pages with 32 slots (1GB index): 81230 MB/s
- 4096KB pages with 128 slots (200MB index): 97937 MB/s - L3 Cache?
- 2730KB pages with 128 slots (393MB index): 97319 MB/s

This isn't an instant win: the current code to build the index is quite slow and the index is quite large, but it is a promising approach.

Code:
FillBuffer took 1.54 seconds (2047.20 MB @ 1333.0938 MB/s)
Buffer Full
Sleeping 60 to sync threads
Formed 393216 KB buffer of 786432 bins, each addressing 2730 KB
Filling random queue with 97517568 reads
Formed 393216 KB buffer of 786432 bins, each addressing 2730 KB
Filling random queue with 97517568 reads
Formed 393216 KB buffer of 786432 bins, each addressing 2730 KB
Filling random queue with 97517568 reads
Formed 393216 KB buffer of 786432 bins, each addressing 2730 KB
Filling random queue with 97517568 reads
FillList took 0.49 seconds (371.85 MB @ 754.1983 MB/s)
FillList took 0.49 seconds (371.85 MB @ 753.8254 MB/s)
FillList took 0.49 seconds (371.85 MB @ 758.1578 MB/s)
FillList took 0.50 seconds (371.85 MB @ 744.0179 MB/s)
BuildBins took 6.82 seconds (371.85 MB @ 54.5591 MB/s)
Full bins: 2554092, (97.38 % located)
BuildBins took 6.84 seconds (371.85 MB @ 54.3830 MB/s)
Full bins: 2554092, (97.38 % located)
BuildBins took 7.00 seconds (371.85 MB @ 53.1307 MB/s)
Full bins: 2554092, (97.38 % located)
BuildBins took 7.08 seconds (371.85 MB @ 52.5223 MB/s)
Full bins: 2554092, (97.38 % located)
ReadRandom took 6.48 seconds (11587.69 MB @ 1788.1705 MB/s)
16a3f82c13aeadf9
Pausing...
ReadRandom took 6.47 seconds (11587.69 MB @ 1791.2486 MB/s)
16a3f82c13aeadf9
Pausing...
ReadRandom took 6.39 seconds (11587.69 MB @ 1814.8326 MB/s)
16a3f82c13aeadf9
Pausing...
ReadRandom took 6.40 seconds (11587.69 MB @ 1810.7221 MB/s)
16a3f82c13aeadf9
Pausing...
Running read test
ReadBins took 1.28 seconds (11587.69 MB @ 9041.5889 MB/s)
16d7bd1879d8eb7d
ReadBins took 1.29 seconds (11587.69 MB @ 9017.3944 MB/s)
16d7bd1879d8eb7d
ReadBins took 1.30 seconds (11587.69 MB @ 8900.2444 MB/s)
ReadBins took 1.30 seconds (11587.69 MB @ 8899.9301 MB/s)
16d7bd1879d8eb7d
16d7bd1879d8eb7d
ReadTest took 1.30 seconds (126926.42 MB @ 97319.0807 MB/s)

Last fiddled with by airsquirrels on 2017-06-22 at 14:48 |
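The program that produced this output is not shown in the thread, so the following is only a rough sketch of the general idea behind the bins: group pending random reads by the page they touch, so that consecutive reads stay within one cache-sized region. It uses a plain counting sort over page numbers rather than the fixed-slot bins described above, and the names `bin_indices` and `sum_reads` are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Reorder idx[0..n) into out[0..n) so that indices falling in the same
 * page (page = index / page_words) are adjacent, pages in ascending order.
 * Every idx[i] / page_words must be < n_pages.
 * Returns 0 on success, -1 if the scratch allocation fails. */
static int bin_indices(const uint32_t *idx, size_t n, uint32_t page_words,
                       size_t n_pages, uint32_t *out) {
    size_t *start = calloc(n_pages + 1, sizeof *start);
    if (!start) return -1;
    /* count the reads that land in each page */
    for (size_t i = 0; i < n; i++) start[idx[i] / page_words + 1]++;
    /* prefix sum: start[p] becomes the first output slot for page p */
    for (size_t p = 0; p < n_pages; p++) start[p + 1] += start[p];
    /* scatter each index into its page's region of the output */
    for (size_t i = 0; i < n; i++) out[start[idx[i] / page_words]++] = idx[i];
    free(start);
    return 0;
}

/* Read data[] at each index and accumulate a sum; the same loop serves
 * for both the original and the binned ordering. */
static uint64_t sum_reads(const uint32_t *data, const uint32_t *idx, size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) sum += data[idx[i]];
    return sum;
}
```

Because this sketch places every read, the sum over the binned order equals the sum over the original order. A fixed-slot scheme can leave some reads unplaced once a bin is full.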