20170314, 00:18  #133  
"Kieren"
Jul 2011
In My Own Galaxy!
2·3·1,693 Posts 
Quote:


20170314, 01:33  #134  
P90 years forever!
Aug 2002
Yeehaw, FL
1E08_{16} Posts 
Quote:
BTW, when using the Affinity= option, the logical core # uses the numbering scheme returned by hwloc (which you can see in results.txt by doing a benchmark). It does not use the numbering scheme the OS uses.
Last fiddled with by Prime95 on 20170314 at 03:01 

20170314, 02:24  #135 
"Kieren"
Jul 2011
In My Own Galaxy!
2×3×1,693 Posts 
Thanks for all the information, and especially for the caution about the numbering scheme. I will let you know results when I get the new version running.
EDIT: Sorry that this is so far OT. Stable affinity is something I've not achieved on my own. I would welcome a move to a more general Prime95 thread. However, here is the hwloc output: Code:
Machine topology as determined by hwloc library:
 Machine#0 (total=13181240KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=13181240KB, total=13181240KB)
   Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=94, CPUModel="Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz", CPUStepping=3)
    L3 (size=8192KB, linesize=64, ways=16, Inclusive=1)
     L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
      L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
       Core (cpuset: 0x00000003)
        PU#0 (cpuset: 0x00000001)
        PU#1 (cpuset: 0x00000002)
     L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
      L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
       Core (cpuset: 0x0000000c)
        PU#2 (cpuset: 0x00000004)
        PU#3 (cpuset: 0x00000008)
     L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
      L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
       Core (cpuset: 0x00000030)
        PU#4 (cpuset: 0x00000010)
        PU#5 (cpuset: 0x00000020)
     L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
      L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
       Core (cpuset: 0x000000c0)
        PU#6 (cpuset: 0x00000040)
        PU#7 (cpuset: 0x00000080)
Thanks again. I will report more after some observation.
Last fiddled with by kladner on 20170314 at 02:55 
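For anyone reading the cpuset values in the hwloc output above: each mask appears to be a plain bitmask in which bit n stands for PU#n (PU#4 shows 0x00000010, for example). A minimal Python sketch of decoding them follows. The function name pus_in_cpuset is made up for illustration and is not part of Prime95 or hwloc, and these indexes are hwloc's numbering, which per the earlier post need not match the OS numbering.

```python
def pus_in_cpuset(mask: int) -> list[int]:
    """Return the PU indexes whose bits are set in an hwloc-style cpuset mask.

    Assumption: bit n of the mask corresponds to PU#n, as the output above suggests.
    """
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Core cpusets copied from the hwloc output above (i7-6700K, 4 cores / 8 PUs).
core_cpusets = [0x00000003, 0x0000000C, 0x00000030, 0x000000C0]

for core, mask in enumerate(core_cpusets):
    print(f"core {core}: PUs {pus_in_cpuset(mask)}")
```

On this machine each physical core owns two adjacent PUs, which is what you would want to know before pinning one worker per physical core.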
20170314, 04:20  #136 
"Kieren"
Jul 2011
In My Own Galaxy!
2×3×1,693 Posts 
29.1 absolutely nails the affinity without additional local.txt entries.
Running below 2.2 ms/it is as good as I've gotten under any previous manually set operation. 
20170316, 10:06  #137 
Oct 2008
n00bville
2D8_{16} Posts 
I am wondering: How is the performance of the Ryzen 8 Core in comparison to the Intel processors? I haven't seen performance comparisons so far. For games there need to be some performance optimizations. For Prime95 it will be the same I presume?

20170316, 12:25  #138 
Feb 2016
UK
110111000_{2} Posts 
Based on my testing so far with a 1700, IPC in FMA3 is roughly 1/2 of a modern Intel. I don't know if optimisations will change that significantly. If you want to do that kinda thing, stick with Intel for now. At a minimum, the Ryzen platform as a whole needs to mature, and it will probably help once software starts to explicitly optimise for it.

20170319, 20:09  #139 
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2·2,969 Posts 
Agner Fog hasn't managed to get his hands on a Ryzen cpu yet to update his instruction tables.
If anyone has Linux running on their Ryzen cpu and is willing to let him control it remotely for a while, please email him: http://www.agner.org/contact/?e=0#0
This could give us vital clues for helping Prime95 performance on Ryzen cpus catch up to Intel.
20170324, 18:10  #140 
"/X\(‘‘)/X\"
Jan 2013
37×79 Posts 
Apparently Ryzen can do up to DDR4-2666 with quad stick, dual rank memory:
http://www.legitreviews.com/amdryze...ormance_192960 
20170401, 00:13  #141  
Jan 2013
2^{2}×17 Posts 
Quote:
https://www.bittech.net/news/hardwa...updateapril/1 "The update will be followed by AGESA 1.0.0.5 in May, Hallock continued, featuring improvements for overclocking DDR4 memory." from the article linked. Anyone else heard any B2 stepping rumors or facts? 

20170410, 17:54  #142 
Feb 2016
UK
2^{3}·5·11 Posts 
Just noticed a BIOS update was made available for my mobo which includes AGESA 1.0.0.4. One of the fixes in it resolves the FMA3 bug where the system locks up or reboots on a particular sequence of instructions. I've not encountered it with Prime95, but have replicated the original report with other software. Also of interest is a claimed reduction in memory latency. This does not appear to be the update that enables faster RAM support, which will presumably come later still.
Last fiddled with by mackerel on 20170410 at 17:55 
20170411, 07:58  #143 
Jan 2003
7×29 Posts 
Ryzen 1700 benchmark results
Below are the results from my Ryzen 1700 (non-X) with all cores set at 3.32GHz (stock rating 3GHz / Turbo 3.7GHz). Memory is running at 2933MHz CAS16, using AGESA 1.0.0.4a, which is the version with the 6ns latency improvement and the latest available at this time. Operating system is Windows 10 x64. I rearranged the benchmark results below for somewhat easier reading / comparison.
AMD Ryzen 7 1700 Eight-Core Processor
CPU speed: 3318.72 MHz, 8 hyperthreaded cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
Prime95 64-bit version 29.1, RdtscTiming=1

Timings for 1024K FFT length (1 cpu, 1 worker): 7.83 ms. Throughput: 127.69 iter/sec.
Timings for 1280K FFT length (1 cpu, 1 worker): 9.88 ms. Throughput: 101.17 iter/sec.
Timings for 1536K FFT length (1 cpu, 1 worker): 11.97 ms. Throughput: 83.57 iter/sec.
Timings for 1792K FFT length (1 cpu, 1 worker): 14.58 ms. Throughput: 68.60 iter/sec.
Timings for 2048K FFT length (1 cpu, 1 worker): 16.05 ms. Throughput: 62.29 iter/sec.
Timings for 2560K FFT length (1 cpu, 1 worker): 20.60 ms. Throughput: 48.55 iter/sec.
Timings for 3072K FFT length (1 cpu, 1 worker): 24.87 ms. Throughput: 40.20 iter/sec.
Timings for 3584K FFT length (1 cpu, 1 worker): 29.90 ms. Throughput: 33.44 iter/sec.
Timings for 4096K FFT length (1 cpu, 1 worker): 34.18 ms. Throughput: 29.26 iter/sec.
Timings for 5120K FFT length (1 cpu, 1 worker): 42.60 ms. Throughput: 23.48 iter/sec.
Timings for 6144K FFT length (1 cpu, 1 worker): 50.67 ms. Throughput: 19.74 iter/sec.
Timings for 7168K FFT length (1 cpu, 1 worker): 60.12 ms. Throughput: 16.63 iter/sec.
Timings for 8192K FFT length (1 cpu, 1 worker): 68.76 ms. Throughput: 14.54 iter/sec.

Timings for 1024K FFT length (8 cpus, 1 worker): 1.13 ms. Throughput: 886.42 iter/sec.
Timings for 1280K FFT length (8 cpus, 1 worker): 1.42 ms. Throughput: 704.55 iter/sec.
Timings for 1536K FFT length (8 cpus, 1 worker): 1.71 ms. Throughput: 584.87 iter/sec.
Timings for 1792K FFT length (8 cpus, 1 worker): 2.10 ms. Throughput: 475.44 iter/sec.
Timings for 2048K FFT length (8 cpus, 1 worker): 2.39 ms. Throughput: 418.60 iter/sec.
Timings for 2560K FFT length (8 cpus, 1 worker): 3.96 ms. Throughput: 252.38 iter/sec.
Timings for 3072K FFT length (8 cpus, 1 worker): 4.97 ms. Throughput: 201.08 iter/sec.
Timings for 3584K FFT length (8 cpus, 1 worker): 5.97 ms. Throughput: 167.51 iter/sec.
Timings for 4096K FFT length (8 cpus, 1 worker): 6.92 ms. Throughput: 144.58 iter/sec.
Timings for 5120K FFT length (8 cpus, 1 worker): 7.32 ms. Throughput: 136.59 iter/sec.
Timings for 6144K FFT length (8 cpus, 1 worker): 9.37 ms. Throughput: 106.71 iter/sec.
Timings for 7168K FFT length (8 cpus, 1 worker): 10.96 ms. Throughput: 91.21 iter/sec.
Timings for 8192K FFT length (8 cpus, 1 worker): 12.69 ms. Throughput: 78.83 iter/sec.

Timings for 1024K FFT length (8 cpus, 8 workers): 11.30, 11.41, 11.28, 11.22, 11.18, 11.18, 11.21, 11.20 ms. Throughput: 711.26 iter/sec.
Timings for 1280K FFT length (8 cpus, 8 workers): 14.15, 14.51, 14.13, 14.15, 14.03, 14.05, 14.13, 14.16 ms. Throughput: 564.84 iter/sec.
Timings for 1536K FFT length (8 cpus, 8 workers): 16.81, 17.45, 16.96, 17.00, 16.84, 16.82, 16.91, 16.82 ms. Throughput: 472.01 iter/sec.
Timings for 1792K FFT length (8 cpus, 8 workers): 20.85, 21.81, 20.92, 21.12, 20.68, 20.92, 21.25, 20.77 ms. Throughput: 380.31 iter/sec.
Timings for 2048K FFT length (8 cpus, 8 workers): 22.60, 23.32, 22.76, 22.78, 22.54, 22.61, 22.61, 22.54 ms. Throughput: 352.17 iter/sec.
Timings for 2560K FFT length (8 cpus, 8 workers): 33.53, 34.97, 33.76, 34.34, 34.01, 33.93, 34.26, 33.98 ms. Throughput: 234.66 iter/sec.
Timings for 3072K FFT length (8 cpus, 8 workers): 41.23, 42.38, 41.51, 40.71, 40.84, 40.78, 40.87, 41.04 ms. Throughput: 194.34 iter/sec.
Timings for 3584K FFT length (8 cpus, 8 workers): 48.09, 49.43, 47.96, 48.77, 47.89, 47.32, 47.90, 47.23 ms. Throughput: 166.45 iter/sec.
Timings for 4096K FFT length (8 cpus, 8 workers): 56.27, 57.15, 55.09, 55.39, 55.64, 54.99, 54.88, 54.69 ms. Throughput: 144.14 iter/sec.
Timings for 5120K FFT length (8 cpus, 8 workers): 58.15, 60.30, 58.03, 57.82, 57.55, 57.00, 58.24, 57.01 ms. Throughput: 137.94 iter/sec.
Timings for 6144K FFT length (8 cpus, 8 workers): 70.59, 72.77, 71.30, 71.76, 70.77, 70.67, 70.83, 70.63 ms. Throughput: 112.43 iter/sec.
Timings for 7168K FFT length (8 cpus, 8 workers): 87.46, 87.18, 83.29, 83.81, 82.80, 83.61, 83.66, 83.11 ms. Throughput: 94.87 iter/sec.
Timings for 8192K FFT length (8 cpus, 8 workers): 99.83, 99.12, 96.13, 97.41, 96.20, 96.03, 96.76, 96.01 ms. Throughput: 82.33 iter/sec.
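For comparing these numbers, the reported throughput figures look consistent with summing 1000 / (ms per iteration) over all workers, within the rounding of the printed timings. That formula is inferred from the results above rather than taken from the Prime95 source, so treat the following Python as a rough sketch. The function name throughput is just illustrative.

```python
def throughput(timings_ms):
    """Combined iterations per second for workers running concurrently,
    given each worker's time per iteration in milliseconds.

    Assumption: throughput is the sum of 1000 / ms over the workers.
    """
    return sum(1000.0 / t for t in timings_ms)


# 1024K FFT timings copied from the benchmark above.
one_core = throughput([7.83])
eight_workers = throughput([11.30, 11.41, 11.28, 11.22, 11.18, 11.18, 11.21, 11.20])

# Scaling of 8 cpus on 1 worker versus 1 cpu, using the reported throughputs.
speedup = 886.42 / 127.69

print(f"1 cpu, 1 worker:    {one_core:.2f} iter/sec (reported 127.69)")
print(f"8 cpus, 8 workers:  {eight_workers:.2f} iter/sec (reported 711.26)")
print(f"8 cpus vs 1 cpu on one worker: {speedup:.2f}x")
```

The same calculation can be repeated per FFT length to see where one worker on 8 cpus beats 8 separate workers (the smaller FFTs above) and where the two are roughly level (4096K and up).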