mersenneforum.org  

Old 2011-04-21, 05:59   #749
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

23×149 Posts
Default

Quote:
Originally Posted by Xyzzy View Post
PrimeNet keeps giving us work like "foo,68,69" with an occasional "bar,69,70". How do we get work that takes more time?
Hop on over to http://v5www.mersenne.org/manual_assignment/ and ask for TF work on exponents between 332198357 and 332245261 (Uncwilly's pet project). If you want to give it something to chew on for a while, just change your "foo,68,69" to "foo,68,79" (or however much you want to chew on for a single exponent; PrimeNet will currently assign this range up to 2^77; Uncwilly wants to eventually take it up to 2^82).

I've given up doing TF that high, since it takes a week to do just 2^78-2^79 on my 8800GT; your boxes should fare much better.
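To put numbers on how much extra chewing "foo,68,79" means, here is a rough sketch. It assumes TF effort doubles with each bit level (twice as many candidate factors between 2^(b+1) and 2^(b+2) as between 2^b and 2^(b+1)); real timings will drift from that, and `relative_effort` is just a name made up for this illustration.

```python
from fractions import Fraction

def relative_effort(start_bit: int, end_bit: int, unit_bit: int) -> Fraction:
    """Effort to trial factor from 2^start_bit to 2^end_bit, measured in
    units of the single level 2^unit_bit -> 2^(unit_bit + 1).

    Assumption: each bit level costs twice as much as the one before it.
    """
    return sum(
        (Fraction(2) ** (b - unit_bit) for b in range(start_bit, end_bit)),
        Fraction(0),
    )

# Whole range 68 -> 79, in units of the 68 -> 69 level.
print(relative_effort(68, 79, 68))
# Same range, in units of the 78 -> 79 level (the "one week" level above).
print(float(relative_effort(68, 79, 78)))
```

Under that assumption the last bit level accounts for about half of the whole 68-to-79 range, so the full range costs roughly twice as much as 2^78-2^79 alone.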
Old 2011-04-21, 07:25   #750
Ralf Recker
 
 
Oct 2010

191 Posts
Default New Linux drivers released...

FYI: The new Linux 270.41.06 drivers were released yesterday. One funny(?) entry from the readme:
  • Fixed a bug causing the X server to hang every 49.7 days on 32-bit platforms.
Looks like the classic 32-bit unsigned int millisecond counter overflow.
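The 49.7 days fall straight out of the arithmetic. A quick check, assuming an unsigned 32-bit counter that ticks once per millisecond (the readme does not say that; it is only the usual explanation):

```python
# Assumption: an unsigned 32-bit counter incremented once per millisecond.
MS_PER_DAY = 24 * 60 * 60 * 1000

def days_until_wrap(bits: int = 32) -> float:
    """Days until a millisecond counter of the given width wraps to zero."""
    return (2 ** bits) / MS_PER_DAY

print(f"{days_until_wrap():.1f} days")
```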
Old 2011-04-21, 09:01   #751
Xyzzy
 
 
"Mike"
Aug 2002

5·17·97 Posts
Default

Remember this?

Old 2011-04-21, 09:10   #752
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

110101100011₂ Posts
Default

Quote:
Originally Posted by Xyzzy View Post
Remember this?
Vividly.
It amused me no end at the time that nobody discovered that Win95 would crash for a known, specific, explainable reason after 7 weeks of uptime until 3.5 years after its release. Spoke volumes about its stability.
And yet NT4, which was out at the same time, was marvelously stable.
Old 2011-04-21, 10:16   #753
xilman
Bamboozled!
 
 
"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

2A21₁₆ Posts
Default

Quote:
Originally Posted by James Heinrich View Post
And yet NT4, which was out at the same time, was marvelously stable.
There's an old joke about the stability of NT4 which I won't post here because of the family-friendly constraint. Mail me for a copy if you wish.

NT4 was stable as long as you didn't want to change anything, otherwise it had to be rebooted. Even something as simple as changing the IP address required a reboot.

NT4 was the standard operating system running at MSR when I joined them as a sysadmin. It wasn't too bad, but it wasn't anywhere near as good as some would claim. It most certainly wasn't suitable for use by the great majority of Microsoft's customers.

Paul
Old 2011-04-21, 12:35   #754
Christenson
 
 
Dec 2010
Monticello

3403₈ Posts
Default

Quote:
Originally Posted by James Heinrich View Post
Vividly.
It amused me no end at the time that nobody discovered that Win95 would crash for a known, specific, explainable reason after 7 weeks of uptime until 3.5 years after its release. Spoke volumes about its stability.
And yet NT4, which was out at the same time, was marvelously stable.
32-bit timer overflows causing system crashes are still with us; Minix had one late last year, though it was scaled out to 2 years or so, and it was easy to find by inspection: the scheduler didn't realize that 2^32-1 ticks is followed by 0 ticks. The same kind of overflow also took down WinCE boxes running air traffic control. And both of these are supposed to be reliable!
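For anyone who hasn't been bitten yet, a generic sketch of the failure and the usual fix. This is not the Minix code or anything from the WinCE case; the function names are made up for illustration, and the 32-bit counter is emulated with a mask.

```python
MASK = 0xFFFFFFFF   # emulate an unsigned 32-bit tick counter
HALF = 0x80000000   # half the counter range

def deadline_passed_naive(now: int, deadline: int) -> bool:
    """Plain comparison: wrong as soon as the counter wraps."""
    return now >= deadline

def deadline_passed(now: int, deadline: int) -> bool:
    """Wrap-safe: look at the difference modulo 2^32.

    Valid as long as deadlines are less than 2^31 ticks in the future.
    """
    return ((now - deadline) & MASK) < HALF

# A timer started 16 ticks before the wrap, set to expire 32 ticks later.
start = 0xFFFFFFF0
deadline = (start + 0x20) & MASK        # wraps around to 0x10

early = 0xFFFFFFF8                       # only 8 ticks after start
late = 0x18                              # 40 ticks after start, past the wrap

print(deadline_passed_naive(early, deadline), deadline_passed(early, deadline))
print(deadline_passed_naive(late, deadline), deadline_passed(late, deadline))
```

The naive check fires the timer early because the huge pre-wrap tick value compares greater than the small post-wrap deadline; the masked difference gets both cases right.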

I still regard myself lucky to get more than a week out of any Windows box before having to take it down for memory leaks or other problems. Xubuntu is on day 38 right now, and will go down for software upgrade when I get one of those famously rare round tuits.
Old 2011-04-21, 19:03   #755
Xyzzy
 
 
"Mike"
Aug 2002

5·17·97 Posts
Default

Here is a performance benchmark test. Our methodology is most certainly flawed!
  • We used 4 "sequential" exponents from the 57,xxx,xxx range.
  • We allowed the box to return to idle temperature (CPU/GPU) before each test.
  • We did not use a checkpoint file so each run is identical, other than the number of instances and which combination of exponents are being used.
  • We ran the test until we saw "1000/4620" in the "class" column. (Well, it never hit 1000 exactly so we stopped the test when it crossed 1000.)
  • We used "NumStreams=3" and "CPUStreams=4".
  • The "time" (ETA) column is from the last line of output we saw. We think that at the stopping point we used the exponent is ~22% done.
We forgot how much fun making ASCII boxes is!

Code:
╔═════════╤════════╤════════╤════════╤════════╤════════╤══════╗
║instances│cpu_load│gpu_load│ave_rate│cpu_temp│gpu_temp│ time ║
╟─────────┼────────┼────────┼────────┼────────┼────────┼──────╢
║        0│      0%│      0%│     n/a│    29°C│    32°C│   n/a║
║        1│     26%│     52%│  190M/s│    54°C│    52°C│ 9m10s║
║        2│     51%│     92%│  173M/s│    63°C│    62°C│10m08s║
║        3│     76%│     95%│  121M/s│    68°C│    65°C│12m29s║
║        4│    100%│     97%│   97M/s│    71°C│    66°C│14m44s║
╚═════════╧════════╧════════╧════════╧════════╧════════╧══════╝
Is it sane to use the average rate to determine overall throughput?

Code:
╔═════════╤════════════════╗
║instances│   throughput   ║
╟─────────┼────────────────╢
║        1│190 × 1 = 190M/s║
║        2│173 × 2 = 346M/s║
║        3│121 × 3 = 363M/s║
║        4│ 97 × 4 = 388M/s║
╚═════════╧════════════════╝
We interpret the data above to mean that the CPU is filling the GPU "bucket" faster than the GPU can empty the "bucket". With 2 or more instances the GPU load is nearly topped out.
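The same arithmetic as a small script, which also shows how little each extra instance adds. It is only a sketch, and it leans on the same assumption as the table: every instance runs at the reported average rate, so the total is simply rate × instances. The names are invented for the example.

```python
# Per-instance average rates in M/s, copied from the first table above.
AVE_RATE = {1: 190, 2: 173, 3: 121, 4: 97}

def total_throughput(rates):
    """Aggregate M/s, assuming every instance runs at the reported average."""
    return {n: n * rate for n, rate in rates.items()}

def marginal_gain(totals):
    """Extra M/s gained by adding one more instance."""
    counts = sorted(totals)
    return {n: totals[n] - totals[prev] for prev, n in zip(counts, counts[1:])}

totals = total_throughput(AVE_RATE)
gains = marginal_gain(totals)
for n in sorted(totals):
    extra = f"+{gains[n]}M/s" if n in gains else "baseline"
    print(f"{n} instance(s): {totals[n]}M/s ({extra})")
```

Going from 1 to 2 instances adds 156M/s; the third and fourth add only 17M/s and 25M/s, which fits the GPU load being nearly topped out from 2 instances on.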

We must now decide whether to run 3 or 4 instances. Or maybe 2 instances and 2 trial factoring threads of Prime95?

We are sure there are a lot of things we have overlooked.

Old 2011-04-21, 19:18   #756
firejuggler
 
 
Apr 2010
Over the rainbow

2606₁₀ Posts
Default

I would say you should use 2 instances of mfaktc; after that, you don't get much speed improvement. In addition, you could keep using the computer as usual instead of losing responsiveness.
It seems that, after 2 instances, you hit a GPU bottleneck, so get a GTX 680 or whatever the next generation is ;p. After that, you will be CPU bound.

Last fiddled with by firejuggler on 2011-04-21 at 19:31
Old 2011-04-21, 19:47   #757
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

23·149 Posts
Default

Agreed. Run 2 instances, and use the remaining two CPU cores for other work, but not TF. (The small extra throughput (42M/s) from running 4 instances is still far more work than the CPUs could do by themselves on TF, so if TF is all you want, just run 4 instances.) But perhaps it would be more useful to do some P-1 or L-L with the remaining two cores instead.
Old 2011-04-21, 20:08   #758
Ralf Recker
 
 
Oct 2010

191 Posts
Default

I experienced a significant slowdown of both apps (mfaktc and mprime) when I ran a combination of two mfaktc instances on a GTX 470 and two P-1 tasks on a Core 2 Quad in the last few days. The memory interface of my old CPU is possibly a bottleneck.
Old 2011-04-21, 20:45   #759
TheJudger
 
 
"Oliver"
Mar 2005
Germany

11×101 Posts
Default

Sorry for my late reply, I was on a business trip.

Quote:
Originally Posted by Xyzzy View Post
Running one instance of the self test uses around 215W and 25-26% processor resources with an i5 2500. We did modify the ini file to set it from 3 to 10 NumStreams. We have no clue what that means. With 3 NumStreams the processor was nearly idle.
Don't try to maximize the utilisation during the selftest.
NumStreams should be OK.

Quote:
Originally Posted by Xyzzy View Post
We spent all night and today trying to get the development system running under Linux. We followed every HOWTO and tried every distribution recommended by Nvidia. We were unsuccessful but we will soldier on.
...
Compiling mfaktc in Linux went perfectly. We think the problem we are experiencing is that we cannot "talk" to the GPU. We have /dev populated and all of the environment variables set and all of the libraries in the right places and stuff but we got very weird errors. The GPU shows up with 'lspci' and the Nvidia module shows up with 'lsmod'.
Did you try to run 'nvidia-smi -a' as normal user and as root? If there is no X on the GPU it is common that the device is not created properly. This can be fixed with some udev fun...

Quote:
Originally Posted by Xyzzy View Post
Anyways, it is good to know they work but it is distressing that we are having so many issues with the Linux install. In an ideal world, we would like to install with Debian but most likely we will try an older version (11.1) of OpenSUSE since that is what Oliver is using. We are fairly familiar with SUSE because we used to use the for-pay SUSE Enterprise Desktop deal.
If you choose openSUSE I would recommend 11.2, which is officially supported by Nvidia CUDA... 11.3 *should* work once you install gcc-4.3.

Quote:
Originally Posted by Xyzzy View Post
Attached is a copy of a self test. Perhaps it is useful. The Windows install is a clean install with no modifications other than the Nvidia stuff.
Looks fine to me.

Oliver

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.