mersenneforum.org  

Old 2013-07-16, 23:49   #188
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

19·397 Posts

Quote:
Originally Posted by pepi37
And what will be performance increase with AVX2? ( if it will be any)

AVX2 is all about integer operations. Prime95 FFTs use floating point operations.

I could double the TF speed using AVX2 - but one should really be using a Haswell system for LL instead.
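
For readers wondering why trial factoring (TF) is integer work while the LL test's FFTs are floating point, here is a minimal Python sketch. It is not Prime95's code: the function name `trial_factor` and the `max_k` bound are made up for illustration. It relies only on the standard facts that any factor q of 2^p - 1 (p an odd prime) has the form 2kp + 1 and satisfies q ≡ ±1 (mod 8).

```python
def trial_factor(p, max_k):
    """Search for a factor of 2**p - 1 among candidates q = 2*k*p + 1.

    Hypothetical helper for illustration only; real TF code sieves the
    candidates and uses hand-tuned modular arithmetic.
    """
    for k in range(1, max_k + 1):
        q = 2 * k * p + 1
        # Factors of a Mersenne number are always 1 or 7 mod 8.
        if q % 8 not in (1, 7):
            continue
        # q divides 2**p - 1 exactly when 2**p leaves remainder 1 mod q.
        # This is pure integer modular exponentiation, no floating point.
        if pow(2, p, q) == 1:
            return q
    return None


print(trial_factor(11, 10))  # 23, since 2**11 - 1 = 2047 = 23 * 89
print(trial_factor(13, 10))  # None, 2**13 - 1 = 8191 is prime
```

The inner loop is all integer multiply and reduce, which is the kind of work wider integer SIMD could speed up.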
Old 2013-07-17, 00:27   #189
ewmayer
2ω=0
 
Sep 2002
República de California

19·613 Posts

Quote:
Originally Posted by Prime95
AVX2 is all about integer operations.

Aside from the FMA3 support, you mean? ;)

[I know, strictly speaking these are separate additions to the ISA, but since they appeared together in the same chip release I usually think of both the 256-bit-wide SIMD-ints and the FMA3 as "AVX2".]

Of course for AMD, FMA support is "SSE5", and it's FMA4, not FMA3. Clear as mud.

Last fiddled with by ewmayer on 2013-07-17 at 00:30
Old 2013-07-17, 00:48   #190
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

19×397 Posts

and there are separate CPUID flags for FMA and AVX2. Yes, clear as mud.
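
As a rough illustration of the two flags being separate: FMA is reported in CPUID leaf 01H (ECX bit 12), while AVX2 is reported in leaf 07H, sub-leaf 0 (EBX bit 5). Python has no direct CPUID access, so the sketch below assumes a Linux system, where the kernel exposes them as the `fma` and `avx2` entries on the `flags` line of /proc/cpuinfo. The helper names `cpu_flags` and `report` are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux lists features on a line whose key is "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report(cpuinfo_text):
    """Report FMA and AVX2 support independently of each other."""
    flags = cpu_flags(cpuinfo_text)
    return {"fma": "fma" in flags, "avx2": "avx2" in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(report(f.read()))
    except OSError:
        print("/proc/cpuinfo not available on this system")
```

A program that wants both features has to test both flags; having one does not guarantee the other.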
Old 2013-07-17, 03:11   #191
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

2⁶·151 Posts

Quote:
Originally Posted by Prime95
Notice the temp increase on the small FFT torture test! These small FFTs operate out of the L2 cache. I'm going to try a really small torture test that operates out of the L1 cache to see if I can get the temps even higher.

Wouldn't it be much simpler to disconnect your fans?
Old 2013-07-17, 03:38   #192
TheMawn
 
May 2013
East. Always East.

11010111111₂ Posts

But shouldn't there be a massive slowdown on four workers if the memory is bottlenecked even at two workers?

Like I said in my previous post, my test had 11, 11, 11, and then finally 14 milliseconds once the fourth worker was added.

In George's test, they are getting progressively longer, yes, but the bottleneck begins right at two workers.
Old 2013-07-17, 12:10   #193
lycorn
 
"GIMFS"
Sep 2002
Oeiras, Portugal

5C2₁₆ Posts

@George,

Was the system stable enough @4.2 GHz to be considered fit for day-to-day crunching? (day-to-day meaning 24/7 LL testing)
And did you take any power consumption measurement at that speed?
Thx
Old 2013-07-17, 14:11   #194
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

19×397 Posts

Quote:
Originally Posted by lycorn
Was the system stable enough @4.2 GHz to be considered fit for day-to-day crunching? And did you take any power consumption measurement at that speed?

Yes, I successfully completed about 10 doublechecks at that speed before switching to first time tests (version 27.9).

I have not taken any power consumption measurements.
Old 2013-07-17, 14:24   #195
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

19×397 Posts

Quote:
Originally Posted by TheMawn
But shouldn't there be a massive slowdown on four workers if the memory is bottlenecked even at two workers?

Let's do a "what if".

What if each iteration did no FPU operations? That is, all it did was read and write memory. One worker would take about 5ms. Now when you start the second worker, *every* time it wants to read or write from memory it must wait. Each worker will now take 10ms. Third worker 15ms. Fourth worker 20ms.

Prime95 isn't as bad as this "what if" case. Instead of worker two waiting *every* time it accesses memory, it only has to wait, say, 5 percent of the time -- a partial slowdown.

The key here is that the faster the CPU portion of prime95, the bigger the penalty you'll see as you add another worker.
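
A toy model may make the "what if" concrete. The Python sketch below is an idealized assumption, not how Prime95 or the memory controller actually behaves: each iteration needs `cpu_ms` of compute plus `mem_ms` of memory traffic, and the memory bus serves only one worker at a time, so with n workers nobody can finish an iteration faster than n * mem_ms. The function names and the 7.5/3.5 and 2.5/3.5 splits are hypothetical values picked for illustration, not measurements.

```python
def iteration_ms(workers, cpu_ms, mem_ms):
    """Per-iteration time for each worker in an idealized shared-bus model."""
    # Alone, a worker needs cpu_ms + mem_ms. Once the bus is saturated,
    # the serialized memory traffic (workers * mem_ms) sets the pace.
    return max(cpu_ms + mem_ms, workers * mem_ms)


def scaling(cpu_ms, mem_ms, max_workers=4):
    """Iteration times for 1..max_workers workers."""
    return [iteration_ms(n, cpu_ms, mem_ms) for n in range(1, max_workers + 1)]


print(scaling(0, 5))      # pure memory "what if": [5, 10, 15, 20]
print(scaling(7.5, 3.5))  # slower CPU portion:    [11.0, 11.0, 11.0, 14.0]
print(scaling(2.5, 3.5))  # faster CPU portion:    [6.0, 7.0, 10.5, 14.0]
```

With the slower CPU portion the penalty only shows up at the fourth worker; with the faster CPU portion it starts at the second. Real timings degrade more gradually than this hard cutoff, because workers only collide on memory part of the time.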

Last fiddled with by Prime95 on 2013-07-17 at 14:25
Old 2013-07-17, 14:41   #196
firejuggler
 
Apr 2010
Over the rainbow

2×1,303 Posts

i7-4960 tested
http://www.tomshardware.com/reviews/...rk,3557-5.html
Old 2013-07-18, 02:22   #197
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7543₁₀ Posts

I created a torture test that uses really small FFTs that fit in the L1 data cache. Alas, it runs cooler than the FFTs that fit in the L2 data cache.
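
For a rough sense of what "fits in the L1 data cache" means, here is a back-of-the-envelope Python sketch. It assumes 8 bytes per FFT element (one double) and Haswell's per-core cache sizes of 32 KB L1 data and 256 KB L2. It ignores sin/cos tables and other working data, so the FFT lengths the actual torture test can use will be smaller. The helper name `largest_fft_len` is hypothetical.

```python
def largest_fft_len(cache_bytes, bytes_per_element=8):
    """Largest power-of-two element count whose data alone fits in the cache.

    Assumes the cache can hold at least one element.
    """
    n = 1
    # Keep doubling while the doubled length would still fit.
    while 2 * n * bytes_per_element <= cache_bytes:
        n *= 2
    return n


L1D_BYTES = 32 * 1024   # assumed per-core L1 data cache
L2_BYTES = 256 * 1024   # assumed per-core L2 cache

print(largest_fft_len(L1D_BYTES))  # 4096 elements
print(largest_fft_len(L2_BYTES))   # 32768 elements
```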
Old 2013-07-18, 04:58   #198
ewmayer
2ω=0
 
Sep 2002
República de California

19×613 Posts

L2 caches are so much larger that the overall die heating is probably more when one has all levels of the on-chip memory hierarchy busy.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.