mersenneforum.org  

2011-07-14, 18:44   #1
Dubslow
Basketry That Evening!

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1C1D₁₆ Posts
Different Speed in different OS's

Hi guys,
I recently set up a common working directory for Prime95 shared by my Windows and Linux partitions. Whenever I run MPrime in Linux on a first-time LL test with an exponent in the 53 million range, I get around .086 seconds per iteration. Right now I'm running exactly the same job in Windows and am getting around .069-.072 seconds per iteration. Why is the Windows version faster than the Linux one? Both are 64-bit versions. The processor is an AMD Phenom II X6 1055T, six cores at 2.8 GHz apiece. In Linux I got my numbers after shutting just about everything down, including the graphical user interface (which didn't actually improve performance much, but it shows that nothing besides MPrime is using the CPUs). Is there anything I can do to speed up Linux, or is the code simply not optimized as heavily for Linux?

Thanks,
Dubslow
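
To put the two timings in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes an LL test of exponent p takes p - 2 iterations, and it uses a placeholder exponent of exactly 53,000,000, since the actual exponent isn't given above. The function name is made up for illustration.

```python
# Rough estimate of total LL test time from the per-iteration timings quoted above.
# Assumption: an LL test of exponent p runs p - 2 iterations. The exponent used
# here is a placeholder "in the 53 million range", not the real assignment.

SECONDS_PER_DAY = 86_400


def ll_test_days(exponent: int, seconds_per_iteration: float) -> float:
    """Return the estimated wall-clock days for one Lucas-Lehmer test."""
    iterations = exponent - 2
    return iterations * seconds_per_iteration / SECONDS_PER_DAY


if __name__ == "__main__":
    exponent = 53_000_000  # hypothetical exponent
    for label, timing in (("Linux", 0.086), ("Windows", 0.069), ("Windows", 0.072)):
        days = ll_test_days(exponent, timing)
        print(f"{label}: {timing:.3f} s/iter -> {days:.1f} days")
```

Under those assumptions the gap works out to roughly 53 days per test in Linux versus 42 to 44 days in Windows, so the difference is worth chasing.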
2011-07-14, 19:31   #2
imwithid

Apr 2009
Venice, Chased by Jaws

79 Posts

That may depend on how 'fresh' your installation is so far and what you've installed.

Admittedly, my shared-folder setup is the same as yours and I get a similar result; however, the difference is not significant.

When I run Ubuntu 10.04.2 x64 and mprime 26.6 on my Intel Core 2 6300 on 10,000 iteration intervals, I get about .059. This system is fairly 'clean'. I don't install many applications; however, I use Ubuntu rather than Windows 95% of the time.

When I run Windows XP Professional x64 and prime95 26.6 on the same system, it can get as low as .057.

These are times derived from idle periods. If your times vary greatly, that may be the result of how resources are set to be utilized (whether by the OS or mprime/prime95). I leave all settings at their default on all that apply.

This is contrary to what I would expect; however, having installed many systems, I find that, overall, Linux consumes more processing resources after a clean install than Windows [XP] does, even after the drivers and some applications are installed.

A curious side note: I've used a power meter to check the power draw under both OSes after a clean install, and that Core 2 uses 88W at idle under Windows and 92W at idle under Ubuntu. I repeated this twice and allowed a full five minutes to reach an idle state. It's not much of a difference, but I thought it would lend some credence to the comparison between System Monitor (Ubuntu) and Task Manager (Windows) in reporting CPU usage.

As for Vista and later, I have little to no experience so I cannot say.

Last fiddled with by imwithid on 2011-07-14 at 19:32 Reason: Grammar
2011-07-14, 21:20   #3
markr

"Mark"
Feb 2003
Sydney

1057₈ Posts

Have you tried assigning each worker to a specific core? It may not help you as much as it did with my core2quad with its cores assembled in pairs, but it's worth a try.

Rather than the software not being optimised for Linux, it may be more that Linux is not as optimised for the hardware. In general, hardware drivers, eg for graphics cards, for Windows have far more resources poured into them by the hardware companies than Linux ones do.
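
For anyone wondering what "assigning to a specific core" means at the OS level, here is a minimal Python sketch of CPU affinity on Linux. It only illustrates the mechanism: mprime has its own affinity settings for its workers, which are not shown here, and the helper name `pin_to_core` is made up for this example. `os.sched_setaffinity` and `os.sched_getaffinity` are real standard-library calls but exist only on Linux.

```python
import os


def pin_to_core(pid: int, core: int) -> set:
    """Restrict process `pid` (0 = the caller) to a single logical core.

    Returns the affinity set reported by the OS after the change.
    Hypothetical helper for illustration; Linux only.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity is not exposed by Python on this platform")
    allowed = os.sched_getaffinity(pid)
    if core not in allowed:
        raise ValueError(f"core {core} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(pid, {core})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Pin this script itself to the lowest-numbered core it may run on.
    first_core = min(os.sched_getaffinity(0))
    print(pin_to_core(0, first_core))
```

Once a process is pinned, the scheduler stops migrating it between cores, which is the effect the suggestion above is after.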
2011-07-15, 00:19   #4
Christenson

Dec 2010
Monticello

2⁴×107 Posts

Also worth thinking about: P95/mprime gets faster over time (GW is still putting effort into optimisations, like better FFT sizes and the latest instructions), so you should ensure you are comparing the same version....
2011-07-15, 01:47   #5
Uncwilly
6809 > 6502

"""""""""""""""""""
Aug 2003
101×103 Posts

8,861 Posts

Quote:
Originally Posted by markr
Have you tried assigning each worker to a specific core? It may not help you as much as it did with my core2quad with its cores assembled in pairs, but it's worth a try.
As a follow up to that, make sure that both have the same number of workers working on the same number of assignments.
2011-07-24, 14:04   #6
Dubslow
Basketry That Evening!

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·2,399 Posts
Huh...

Having been away at camp for a week, I am now back with some interesting observations. When I ran the Minecraft server I host under Windows, my iteration times suddenly rose to around .086 seconds for some workers and up to ~.115 for one thread, which is worse than what I get in Linux, even with the server up. In Linux the server is set to start on boot, so when I tested there by killing the GUI and the server, my guess is that something about starting the server at boot and then stopping it put all the threads at .086 seconds.
Now for the most interesting part: when I got back, out of some minor sort of OCD, I assigned some TF work to workers 2-6 because (understandably) worker 1 was falling behind. As soon as I did that, the iteration time for worker 1 dropped to .063 seconds for a couple of hours, went up to .071 for an hour, and then settled at ~.066 seconds per iteration, which is better than I ever got in Windows. Since last night, one of the TF workers has found a factor and so has resumed LL testing. That worker (4, I think) now has iteration times around .063-.065 seconds, while worker 1 is at .076 seconds per iteration. And this is all with people connected to my server and my GUI running as normal. Any thoughts?
2011-07-25, 04:47   #7
Dubslow
Basketry That Evening!

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1110000011101₂ Posts

Well, I can't find a way to edit my post above, so I'll put this link here: a thread on why specific CPU assignment isn't working for me. I think getting that sorted will help, given the observations in the post above.
http://mersenneforum.org/showthread....455#post267455


Edit: HAHA!!! Now that I've got the assignments worked out, it works wonders! Now all my threads average .060-.065 seconds per iteration. :D I'll need to pretty much reexamine the difference between the OSes now that I've got this worked out.

Last fiddled with by Dubslow on 2011-07-25 at 05:05 Reason: I'm a genius (or at least decent at solving my own problems with help from others)
2011-07-28, 14:46   #8
Dubslow
Basketry That Evening!

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×2,399 Posts

Even with the workers assigned to CPUs, I'm still finding that putting one or more workers on trial factoring speeds up the remaining LL workers by 2-4 ms per iteration per TF worker. (So with three TF and three LL workers, the three LL workers run ~8 ms per iteration faster than when all six do LL: 71 ms - 63 ms.)
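
As a rough illustration of what those numbers mean for total LL output, here is a small Python sketch. It assumes every LL worker runs at exactly the quoted timing (71 ms with six LL workers, 63 ms with three) and it ignores whatever the TF workers produce. The function name is made up for this example.

```python
def ll_iterations_per_second(ll_workers: int, seconds_per_iteration: float) -> float:
    """Combined LL iterations per second across all LL workers.

    Assumes every worker runs at the same per-iteration time.
    """
    return ll_workers / seconds_per_iteration


all_ll = ll_iterations_per_second(6, 0.071)  # six workers, all on LL
mixed = ll_iterations_per_second(3, 0.063)   # three on LL, three on TF

print(f"6 LL workers:        {all_ll:.1f} LL iter/s")
print(f"3 LL + 3 TF workers: {mixed:.1f} LL iter/s")
```

Under these assumptions each LL worker is about 11% faster in the mixed setup, but combined LL throughput is lower because only half the cores are doing LL. The trade-off is the TF work done by the other three.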
2011-07-29, 22:44   #9
Dubslow
Basketry That Evening!

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

16035₈ Posts

Remarkable... A couple of days ago, my iteration times spiked to .1 s in Linux and .09 s in Windows, a significant hit. I can think of no reason under the sun why that might have happened. So on a hunch just now I switched one thread to TF, and now the other 5 LL workers are back to their normal .065 s iteration times. How weird.
2011-07-30, 12:49   #10
joblack

Oct 2008
n00bville

5²·29 Posts

Quote:
Originally Posted by markr
Rather than the software not being optimised for Linux, it may be more that Linux is not as optimised for the hardware. In general, hardware drivers, eg for graphics cards, for Windows have far more resources poured into them by the hardware companies than Linux ones do.
Apart from the graphics card drivers, everything else in the Linux kernel is highly optimized. Countless highly skilled programmers have worked, and still work, on ways to improve kernel performance. I would rather think the mprime binary isn't compiled with the right options. Maybe optimization wasn't given much thought when building the Linux version.
2011-08-01, 14:11   #11
kladner

"Kieren"
Jul 2011
In My Own Galaxy!

2³×5×251 Posts

Here is an example of the same situation, but with Win XP 32 bit versus Win 7 64 bit. The 64 bit version seems to run .005 sec. faster than the 32 bit version.

This is on a dual-boot, Phenom II x6 1090T system. It is currently running at 17x205=~3.5GHz. It has 8GB of RAM running at PC1600. The mobo is an Asus M4A89GTD Pro/USB3.

Note that P95 is running the same exponents on both OS's. The exponents are arranged in ascending order for easy comparison.
Attached thumbnail: Prime95_64+32bit.jpg

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.