mersenneforum.org  

2017-03-14, 00:18   #133
kladner
"Kieren"
Jul 2011
In My Own Galaxy!
26CB₁₆ Posts

Quote:
Where can I get the current version of 29.x?
Oops. Is there a Win64 version available?
2017-03-14, 01:33   #134
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2·5·19·37 Posts

Quote:
Originally Posted by kladner
Thanks for the clarification on syntax. Where can I get the current version of 29.x?
Go to post #93 in this thread

BTW, when using the Affinity= option, the logical core # uses the numbering scheme returned by hwloc (which you can see in results.txt by doing a benchmark). It does not use the numbering scheme the OS uses.

Quote:
I was only avoiding CPU 0 because of the possible interrupt issues. As I write this, I realize that 0 and 1 are essentially the same. My only real concern is getting affinity locked to 0 or 1, 2 or 3, 4 or 5, and 6 or 7.
This should be prime95 29.1's default behavior without using the Affinity= setting. If this is not the default behavior, please let me know.

Last fiddled with by Prime95 on 2017-03-14 at 03:01
2017-03-14, 02:24   #135
kladner
"Kieren"
Jul 2011
In My Own Galaxy!
26CB₁₆ Posts

Thanks for all the information, and especially for the caution about the numbering scheme. I will let you know results when I get the new version running.
EDIT:
Dropbox returns 404. CANCEL that. I went to the original post and the link worked.

Sorry that this is so far OT. Stable affinity is something I've not achieved on my own. I would welcome a move to a more general Prime95 thread. However, here is the hwloc output:
Code:
Machine topology as determined by hwloc library:
 Machine#0 (total=13181240KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=13181240KB, total=13181240KB)
    Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=94, CPUModel="Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz", CPUStepping=3)
      L3 (size=8192KB, linesize=64, ways=16, Inclusive=1)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x00000003)
              PU#0 (cpuset: 0x00000001)
              PU#1 (cpuset: 0x00000002)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x0000000c)
              PU#2 (cpuset: 0x00000004)
              PU#3 (cpuset: 0x00000008)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x00000030)
              PU#4 (cpuset: 0x00000010)
              PU#5 (cpuset: 0x00000020)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x000000c0)
              PU#6 (cpuset: 0x00000040)
              PU#7 (cpuset: 0x00000080)
Running. In Windows terms it has started on cores 0, 2, 4, 6.
Thanks again. I will report more after some observation.
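
For anyone reading the hwloc output above: each cpuset is a bitmask, and a Core's cpuset is the bitwise OR of the cpusets of its two PUs (for example 0x00000003 = 0x00000001 | 0x00000002). Here is a minimal Python sketch that decodes the four Core masks from this listing into PU numbers. The helper name `cpuset_to_pus` is my own and is not part of hwloc or Prime95, and the sketch assumes bit n of the mask corresponds to PU#n, which is what this particular listing shows.

```python
def cpuset_to_pus(mask: int) -> list[int]:
    """Return the bit positions that are set in a cpuset mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Core cpusets copied from the hwloc listing above (i7-6700K, 4 cores / 8 PUs).
core_masks = [0x00000003, 0x0000000C, 0x00000030, 0x000000C0]

for core, mask in enumerate(core_masks):
    print(f"core {core}: cpuset {mask:#010x} -> PUs {cpuset_to_pus(mask)}")
```

On this machine that pairs the PUs as 0/1, 2/3, 4/5 and 6/7, the same pairs mentioned earlier in the thread.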

Last fiddled with by kladner on 2017-03-14 at 02:55
2017-03-14, 04:20   #136
kladner
"Kieren"
Jul 2011
In My Own Galaxy!
9,931 Posts

29.1 absolutely nails the affinity without additional local.txt entries.
Running below 2.2 ms/it is as good as I've gotten under any previous manually set operation.
Attached image: P95_291_cleanboot.PNG
2017-03-16, 10:06   #137
joblack
Oct 2008
n00bville
5²×29 Posts

I am wondering: How is the performance of the Ryzen 8 Core in comparison to the Intel processors? I haven't seen performance comparisons so far. For games there need to be some performance optimizations. For Prime95 it will be the same I presume?
2017-03-16, 12:25   #138
mackerel
Feb 2016
UK
2×191 Posts

Based on my testing so far with a 1700, IPC in FMA3 is roughly 1/2 that of a modern Intel. I don't know if optimisations will change that significantly. If you want to do that kinda thing, stick with Intel for now. At a minimum, the Ryzen platform as a whole needs to mature, and it will probably help once software starts to explicitly optimise for it.
2017-03-19, 20:09   #139
henryzz
Just call me Henry
"David"
Sep 2007
Cambridge (GMT)
13100₈ Posts

Agner Fog hasn't managed to get his hands on a Ryzen cpu yet to update his instruction tables.
If anyone has Linux running on their Ryzen cpu and is willing to give him remote access for a while, please email him. http://www.agner.org/contact/?e=0#0
This could give us vital clues for helping Prime95 performance catch up to Intel on Ryzen cpus.
2017-03-24, 18:10   #140
Mark Rose
"/X\(‘-‘)/X\"
Jan 2013
Ͳօɾօղէօ
2⁸·11 Posts

Apparently Ryzen can do up to DDR4-2666 with quad stick, dual rank memory:

http://www.legitreviews.com/amd-ryze...ormance_192960
2017-04-01, 00:13   #141
tului
Jan 2013
2²×17 Posts

Quote:
Originally Posted by Mark Rose
Apparently Ryzen can do up to DDR4-2666 with quad stick, dual rank memory:

http://www.legitreviews.com/amd-ryze...ormance_192960
They're updating the AGESA and should have even better RAM support soon I hope. Be nice to see 3200+ be the norm rather than sort of a hobbled, "maybe it'll work".

https://www.bit-tech.net/news/hardwa...update-april/1

"The update will be followed by AGESA 1.0.0.5 in May, Hallock continued, featuring improvements for overclocking DDR4 memory." from the article linked. Anyone else heard any B2 stepping rumors or facts?
2017-04-10, 17:54   #142
mackerel
Feb 2016
UK
2·191 Posts

Just noticed a BIOS update was made available for my mobo which includes AGESA 1.0.0.4. One of the fixes in that is resolving the FMA3 bug where the system locks up or reboots with a particular sequence of instructions. I've not encountered it with Prime95, but have replicated the original report with other software. Also of interest is a claimed reduction in memory latency. This does not appear to be the update to enable faster RAM support, which will presumably come later still.

Last fiddled with by mackerel on 2017-04-10 at 17:55
2017-04-11, 07:58   #143
db597
Jan 2003
CB₁₆ Posts
Ryzen 1700 benchmark results

Below are the results from my Ryzen 1700 (non-X) with all cores set at 3.32GHz (stock rating 3GHz / Turbo 3.7GHz). Memory is running at 2933MHz CAS16, using AGESA 1.0.0.4a, which is the version with the 6ns latency improvement and the latest available at this time. Operating system is Windows 10 x64. I rearranged the benchmark results below for a bit easier reading / comparison.

AMD Ryzen 7 1700 Eight-Core Processor
CPU speed: 3318.72 MHz, 8 hyperthreaded cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
Prime95 64-bit version 29.1, RdtscTiming=1

Timings for 1024K FFT length (1 cpu, 1 worker): 7.83 ms. Throughput: 127.69 iter/sec.
Timings for 1280K FFT length (1 cpu, 1 worker): 9.88 ms. Throughput: 101.17 iter/sec.
Timings for 1536K FFT length (1 cpu, 1 worker): 11.97 ms. Throughput: 83.57 iter/sec.
Timings for 1792K FFT length (1 cpu, 1 worker): 14.58 ms. Throughput: 68.60 iter/sec.
Timings for 2048K FFT length (1 cpu, 1 worker): 16.05 ms. Throughput: 62.29 iter/sec.
Timings for 2560K FFT length (1 cpu, 1 worker): 20.60 ms. Throughput: 48.55 iter/sec.
Timings for 3072K FFT length (1 cpu, 1 worker): 24.87 ms. Throughput: 40.20 iter/sec.
Timings for 3584K FFT length (1 cpu, 1 worker): 29.90 ms. Throughput: 33.44 iter/sec.
Timings for 4096K FFT length (1 cpu, 1 worker): 34.18 ms. Throughput: 29.26 iter/sec.
Timings for 5120K FFT length (1 cpu, 1 worker): 42.60 ms. Throughput: 23.48 iter/sec.
Timings for 6144K FFT length (1 cpu, 1 worker): 50.67 ms. Throughput: 19.74 iter/sec.
Timings for 7168K FFT length (1 cpu, 1 worker): 60.12 ms. Throughput: 16.63 iter/sec.
Timings for 8192K FFT length (1 cpu, 1 worker): 68.76 ms. Throughput: 14.54 iter/sec.

Timings for 1024K FFT length (8 cpus, 1 worker): 1.13 ms. Throughput: 886.42 iter/sec.
Timings for 1280K FFT length (8 cpus, 1 worker): 1.42 ms. Throughput: 704.55 iter/sec.
Timings for 1536K FFT length (8 cpus, 1 worker): 1.71 ms. Throughput: 584.87 iter/sec.
Timings for 1792K FFT length (8 cpus, 1 worker): 2.10 ms. Throughput: 475.44 iter/sec.
Timings for 2048K FFT length (8 cpus, 1 worker): 2.39 ms. Throughput: 418.60 iter/sec.
Timings for 2560K FFT length (8 cpus, 1 worker): 3.96 ms. Throughput: 252.38 iter/sec.
Timings for 3072K FFT length (8 cpus, 1 worker): 4.97 ms. Throughput: 201.08 iter/sec.
Timings for 3584K FFT length (8 cpus, 1 worker): 5.97 ms. Throughput: 167.51 iter/sec.
Timings for 4096K FFT length (8 cpus, 1 worker): 6.92 ms. Throughput: 144.58 iter/sec.
Timings for 5120K FFT length (8 cpus, 1 worker): 7.32 ms. Throughput: 136.59 iter/sec.
Timings for 6144K FFT length (8 cpus, 1 worker): 9.37 ms. Throughput: 106.71 iter/sec.
Timings for 7168K FFT length (8 cpus, 1 worker): 10.96 ms. Throughput: 91.21 iter/sec.
Timings for 8192K FFT length (8 cpus, 1 worker): 12.69 ms. Throughput: 78.83 iter/sec.
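
To put the two blocks above side by side, here is a rough Python sketch that divides the (8 cpus, 1 worker) throughput by the (1 cpu, 1 worker) throughput for a few FFT lengths. The numbers are copied from the lines above; the `speedup` helper and the "ideal 8x" baseline are my own framing, not something Prime95 reports.

```python
# iter/sec copied from the benchmark lines above, keyed by FFT length in K.
one_cpu = {1024: 127.69, 2048: 62.29, 2560: 48.55, 4096: 29.26, 8192: 14.54}
eight_cpus = {1024: 886.42, 2048: 418.60, 2560: 252.38, 4096: 144.58, 8192: 78.83}


def speedup(fft_k: int) -> float:
    """Throughput ratio of 8 cpus / 1 worker over 1 cpu / 1 worker."""
    return eight_cpus[fft_k] / one_cpu[fft_k]


for fft_k in one_cpu:
    s = speedup(fft_k)
    print(f"{fft_k}K FFT: {s:.2f}x speedup, {s / 8:.0%} of an ideal 8x")
```

The ratio is close to 7x at 1024K and noticeably lower from 2560K upward, which matches the drop in the raw timings between the 2048K and 2560K lines.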

Timings for 1024K FFT length (8 cpus, 8 workers): 11.30, 11.41, 11.28, 11.22, 11.18, 11.18, 11.21, 11.20 ms. Throughput: 711.26 iter/sec.
Timings for 1280K FFT length (8 cpus, 8 workers): 14.15, 14.51, 14.13, 14.15, 14.03, 14.05, 14.13, 14.16 ms. Throughput: 564.84 iter/sec.
Timings for 1536K FFT length (8 cpus, 8 workers): 16.81, 17.45, 16.96, 17.00, 16.84, 16.82, 16.91, 16.82 ms. Throughput: 472.01 iter/sec.
Timings for 1792K FFT length (8 cpus, 8 workers): 20.85, 21.81, 20.92, 21.12, 20.68, 20.92, 21.25, 20.77 ms. Throughput: 380.31 iter/sec.
Timings for 2048K FFT length (8 cpus, 8 workers): 22.60, 23.32, 22.76, 22.78, 22.54, 22.61, 22.61, 22.54 ms. Throughput: 352.17 iter/sec.
Timings for 2560K FFT length (8 cpus, 8 workers): 33.53, 34.97, 33.76, 34.34, 34.01, 33.93, 34.26, 33.98 ms. Throughput: 234.66 iter/sec.
Timings for 3072K FFT length (8 cpus, 8 workers): 41.23, 42.38, 41.51, 40.71, 40.84, 40.78, 40.87, 41.04 ms. Throughput: 194.34 iter/sec.
Timings for 3584K FFT length (8 cpus, 8 workers): 48.09, 49.43, 47.96, 48.77, 47.89, 47.32, 47.90, 47.23 ms. Throughput: 166.45 iter/sec.
Timings for 4096K FFT length (8 cpus, 8 workers): 56.27, 57.15, 55.09, 55.39, 55.64, 54.99, 54.88, 54.69 ms. Throughput: 144.14 iter/sec.
Timings for 5120K FFT length (8 cpus, 8 workers): 58.15, 60.30, 58.03, 57.82, 57.55, 57.00, 58.24, 57.01 ms. Throughput: 137.94 iter/sec.
Timings for 6144K FFT length (8 cpus, 8 workers): 70.59, 72.77, 71.30, 71.76, 70.77, 70.67, 70.83, 70.63 ms. Throughput: 112.43 iter/sec.
Timings for 7168K FFT length (8 cpus, 8 workers): 87.46, 87.18, 83.29, 83.81, 82.80, 83.61, 83.66, 83.11 ms. Throughput: 94.87 iter/sec.
Timings for 8192K FFT length (8 cpus, 8 workers): 99.83, 99.12, 96.13, 97.41, 96.20, 96.03, 96.76, 96.01 ms. Throughput: 82.33 iter/sec.
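
A note on how the Throughput column seems to relate to the timings: it looks like the sum of 1000 / (ms per iteration) over all workers. That is my assumption from checking the printed figures, not something taken from the Prime95 source. A minimal Python sketch, with `total_throughput` as a made-up helper name:

```python
def total_throughput(ms_per_iter: list[float]) -> float:
    """Sum of per-worker iterations per second, given ms per iteration."""
    return sum(1000.0 / ms for ms in ms_per_iter)


# 1024K FFT, 8 cpus, 8 workers: timings copied from the results above.
workers_1024k = [11.30, 11.41, 11.28, 11.22, 11.18, 11.18, 11.21, 11.20]
print(f"8 workers: {total_throughput(workers_1024k):.2f} iter/sec (reported 711.26)")

# 1024K FFT, 1 cpu, 1 worker: a single timing of 7.83 ms.
print(f"1 worker:  {total_throughput([7.83]):.2f} iter/sec (reported 127.69)")
```

The small differences from the reported values are presumably because the printed timings are rounded to two decimals.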

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.