#134
P90 years forever!
Aug 2002
Yeehaw, FL
17·487 Posts
BTW, when using the Affinity= option, the logical core # uses the numbering scheme returned by hwloc (which you can see in results.txt by doing a benchmark) -- it does not use the numbering scheme the OS uses.
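For example, a minimal local.txt sketch (hypothetical values; the exact [Worker #n] section layout is an assumption here, so check the readme for your version) -- the numbers given to Affinity= are hwloc PU numbers from the benchmark output, not Windows logical-processor numbers:
Code:
[Worker #1]
Affinity=0
[Worker #2]
Affinity=2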
Last fiddled with by Prime95 on 2017-03-14 at 03:01
#135
"Kieren"
Jul 2011
In My Own Galaxy!
10011110101110₂ Posts
Thanks for all the information, and especially for the caution about the numbering scheme. I will let you know results when I get the new version running.
EDIT: Sorry that this is so far OT. Stable affinity is something I've not achieved on my own. I would welcome a move to a more general Prime95 thread. However, here is the hwloc output: Code:
Machine topology as determined by hwloc library:
Machine#0 (total=13181240KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=13181240KB, total=13181240KB)
    Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=94, CPUModel="Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz", CPUStepping=3)
      L3 (size=8192KB, linesize=64, ways=16, Inclusive=1)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x00000003)
              PU#0 (cpuset: 0x00000001)
              PU#1 (cpuset: 0x00000002)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x0000000c)
              PU#2 (cpuset: 0x00000004)
              PU#3 (cpuset: 0x00000008)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x00000030)
              PU#4 (cpuset: 0x00000010)
              PU#5 (cpuset: 0x00000020)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x000000c0)
              PU#6 (cpuset: 0x00000040)
              PU#7 (cpuset: 0x00000080)
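Reading the masks above: each Core cpuset spans two PUs (the two hyperthreads), and the lowest set bit is the lowest-numbered PU of that core -- i.e. the hwloc numbers 0, 2, 4, 6 you would use to pin one worker per physical core. A small illustrative snippet (not from the original post) that decodes them:
Code:
# Core cpuset masks copied from the hwloc output above
core_cpusets = [0x03, 0x0c, 0x30, 0xc0]

def first_pu(mask):
    return (mask & -mask).bit_length() - 1  # index of the lowest set bit

print([first_pu(m) for m in core_cpusets])  # [0, 2, 4, 6]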
Thanks again. I will report more after some observation.
Last fiddled with by kladner on 2017-03-14 at 02:55
#136
"Kieren"
Jul 2011
In My Own Galaxy!
2·3·1,693 Posts
29.1 absolutely nails the affinity without additional local.txt entries.
Running below 2.2 ms/it is as good as I've gotten under any previous manually set operation.
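To put that figure in perspective (a hypothetical illustration only -- the 80M exponent is assumed, not taken from the post): a Lucas-Lehmer test runs roughly p iterations for exponent p, so at 2.2 ms/iteration an 80M-range exponent works out to about two days:
Code:
p = 80_000_000        # assumed exponent, for illustration
secs = p * 0.0022     # 2.2 ms per iteration
print(secs / 86400)   # ~2.0 days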
#137
Oct 2008
n00bville
1340₈ Posts
I am wondering: how does the 8-core Ryzen perform in comparison to the Intel processors? I haven't seen any performance comparisons so far. For games, some performance optimizations are apparently needed; I presume it will be the same for Prime95?
#138
Feb 2016
UK
700₈ Posts
Based on my testing so far with the 1700, FMA3 IPC is roughly half that of a modern Intel core. I don't know if optimisations will change that significantly. If you want to do that kind of thing, stick with Intel for now. At a minimum, the Ryzen platform as a whole needs to mature, and it will probably help once software starts to explicitly optimise for it.
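A rough peak-throughput calculation consistent with that observation (assuming, as widely reported, that Zen 1 executes a 256-bit FMA as two 128-bit halves, while Haswell/Skylake cores have two full 256-bit FMA units):
Code:
# Approximate peak double-precision FLOP per clock per core (illustrative)
skylake = 2 * 4 * 2   # 2 FMA units * 4 doubles * (mul+add) = 16
zen1    = 1 * 4 * 2   # one 256-bit FMA's worth per clock    =  8
print(zen1 / skylake) # 0.5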
#139
Just call me Henry
"David"
Sep 2007
Liverpool (GMT/BST)
3·23·89 Posts
Agner Fog hasn't managed to get his hands on a Ryzen CPU yet to update his instruction tables.
If anyone has Linux running on their Ryzen CPU and is willing to let him remote-control it for a while, please email him: http://www.agner.org/contact/?e=0#0
This could give us vital clues for helping Prime95 performance catch up to Intel on Ryzen CPUs.
#140
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/
6160₈ Posts
Apparently Ryzen can do up to DDR4-2666 with quad stick, dual rank memory:
http://www.legitreviews.com/amd-ryze...ormance_192960
#141
Jan 2013
68₁₀ Posts
https://www.bit-tech.net/news/hardwa...update-april/1
"The update will be followed by AGESA 1.0.0.5 in May, Hallock continued, featuring improvements for overclocking DDR4 memory." -- from the linked article.
Anyone else heard any B2 stepping rumors or facts?
#142
Feb 2016
UK
700₈ Posts
Just noticed a BIOS update was made available for my mobo which includes AGESA 1.0.0.4. One of the fixes in it resolves the FMA3 bug where the system locks up or reboots on a particular sequence of instructions. I've not encountered it with Prime95, but I have replicated the original report with other software. Also of interest is a claimed reduction in memory latency. This does not appear to be the update that enables faster RAM support, which will presumably come later still.
Last fiddled with by mackerel on 2017-04-10 at 17:55 |
#143
Jan 2003
11001110₂ Posts
Below are the results from my Ryzen 1700 (non-X) with all cores set at 3.32GHz (stock rating 3GHz / Turbo 3.7GHz). Memory is running at 2933MHz CAS16, using AGESA 1.0.0.4a, which is the version with the 6ns latency improvement and the latest available at this time. The operating system is Windows 10 x64. I rearranged the benchmark results below for a bit easier reading / comparison.
Code:
AMD Ryzen 7 1700 Eight-Core Processor
CPU speed: 3318.72 MHz, 8 hyperthreaded cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
Prime95 64-bit version 29.1, RdtscTiming=1

Timings for 1024K FFT length (1 cpu, 1 worker): 7.83 ms. Throughput: 127.69 iter/sec.
Timings for 1280K FFT length (1 cpu, 1 worker): 9.88 ms. Throughput: 101.17 iter/sec.
Timings for 1536K FFT length (1 cpu, 1 worker): 11.97 ms. Throughput: 83.57 iter/sec.
Timings for 1792K FFT length (1 cpu, 1 worker): 14.58 ms. Throughput: 68.60 iter/sec.
Timings for 2048K FFT length (1 cpu, 1 worker): 16.05 ms. Throughput: 62.29 iter/sec.
Timings for 2560K FFT length (1 cpu, 1 worker): 20.60 ms. Throughput: 48.55 iter/sec.
Timings for 3072K FFT length (1 cpu, 1 worker): 24.87 ms. Throughput: 40.20 iter/sec.
Timings for 3584K FFT length (1 cpu, 1 worker): 29.90 ms. Throughput: 33.44 iter/sec.
Timings for 4096K FFT length (1 cpu, 1 worker): 34.18 ms. Throughput: 29.26 iter/sec.
Timings for 5120K FFT length (1 cpu, 1 worker): 42.60 ms. Throughput: 23.48 iter/sec.
Timings for 6144K FFT length (1 cpu, 1 worker): 50.67 ms. Throughput: 19.74 iter/sec.
Timings for 7168K FFT length (1 cpu, 1 worker): 60.12 ms. Throughput: 16.63 iter/sec.
Timings for 8192K FFT length (1 cpu, 1 worker): 68.76 ms. Throughput: 14.54 iter/sec.

Timings for 1024K FFT length (8 cpus, 1 worker): 1.13 ms. Throughput: 886.42 iter/sec.
Timings for 1280K FFT length (8 cpus, 1 worker): 1.42 ms. Throughput: 704.55 iter/sec.
Timings for 1536K FFT length (8 cpus, 1 worker): 1.71 ms. Throughput: 584.87 iter/sec.
Timings for 1792K FFT length (8 cpus, 1 worker): 2.10 ms. Throughput: 475.44 iter/sec.
Timings for 2048K FFT length (8 cpus, 1 worker): 2.39 ms. Throughput: 418.60 iter/sec.
Timings for 2560K FFT length (8 cpus, 1 worker): 3.96 ms. Throughput: 252.38 iter/sec.
Timings for 3072K FFT length (8 cpus, 1 worker): 4.97 ms. Throughput: 201.08 iter/sec.
Timings for 3584K FFT length (8 cpus, 1 worker): 5.97 ms. Throughput: 167.51 iter/sec.
Timings for 4096K FFT length (8 cpus, 1 worker): 6.92 ms. Throughput: 144.58 iter/sec.
Timings for 5120K FFT length (8 cpus, 1 worker): 7.32 ms. Throughput: 136.59 iter/sec.
Timings for 6144K FFT length (8 cpus, 1 worker): 9.37 ms. Throughput: 106.71 iter/sec.
Timings for 7168K FFT length (8 cpus, 1 worker): 10.96 ms. Throughput: 91.21 iter/sec.
Timings for 8192K FFT length (8 cpus, 1 worker): 12.69 ms. Throughput: 78.83 iter/sec.

Timings for 1024K FFT length (8 cpus, 8 workers): 11.30, 11.41, 11.28, 11.22, 11.18, 11.18, 11.21, 11.20 ms. Throughput: 711.26 iter/sec.
Timings for 1280K FFT length (8 cpus, 8 workers): 14.15, 14.51, 14.13, 14.15, 14.03, 14.05, 14.13, 14.16 ms. Throughput: 564.84 iter/sec.
Timings for 1536K FFT length (8 cpus, 8 workers): 16.81, 17.45, 16.96, 17.00, 16.84, 16.82, 16.91, 16.82 ms. Throughput: 472.01 iter/sec.
Timings for 1792K FFT length (8 cpus, 8 workers): 20.85, 21.81, 20.92, 21.12, 20.68, 20.92, 21.25, 20.77 ms. Throughput: 380.31 iter/sec.
Timings for 2048K FFT length (8 cpus, 8 workers): 22.60, 23.32, 22.76, 22.78, 22.54, 22.61, 22.61, 22.54 ms. Throughput: 352.17 iter/sec.
Timings for 2560K FFT length (8 cpus, 8 workers): 33.53, 34.97, 33.76, 34.34, 34.01, 33.93, 34.26, 33.98 ms. Throughput: 234.66 iter/sec.
Timings for 3072K FFT length (8 cpus, 8 workers): 41.23, 42.38, 41.51, 40.71, 40.84, 40.78, 40.87, 41.04 ms. Throughput: 194.34 iter/sec.
Timings for 3584K FFT length (8 cpus, 8 workers): 48.09, 49.43, 47.96, 48.77, 47.89, 47.32, 47.90, 47.23 ms. Throughput: 166.45 iter/sec.
Timings for 4096K FFT length (8 cpus, 8 workers): 56.27, 57.15, 55.09, 55.39, 55.64, 54.99, 54.88, 54.69 ms. Throughput: 144.14 iter/sec.
Timings for 5120K FFT length (8 cpus, 8 workers): 58.15, 60.30, 58.03, 57.82, 57.55, 57.00, 58.24, 57.01 ms. Throughput: 137.94 iter/sec.
Timings for 6144K FFT length (8 cpus, 8 workers): 70.59, 72.77, 71.30, 71.76, 70.77, 70.67, 70.83, 70.63 ms. Throughput: 112.43 iter/sec.
Timings for 7168K FFT length (8 cpus, 8 workers): 87.46, 87.18, 83.29, 83.81, 82.80, 83.61, 83.66, 83.11 ms. Throughput: 94.87 iter/sec.
Timings for 8192K FFT length (8 cpus, 8 workers): 99.83, 99.12, 96.13, 97.41, 96.20, 96.03, 96.76, 96.01 ms. Throughput: 82.33 iter/sec.
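One way to read the multi-worker lines: the reported Throughput appears to be the sum of the individual workers' iteration rates. A quick check against the 1024K / 8-worker line (illustrative only):
Code:
# per-worker timings (ms) from the "1024K FFT, 8 cpus, 8 workers" line above
timings_ms = [11.30, 11.41, 11.28, 11.22, 11.18, 11.18, 11.21, 11.20]
throughput = sum(1000.0 / t for t in timings_ms)
print(round(throughput, 2))   # ~711.3 -- matches the reported 711.26 iter/sec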