mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

James Heinrich 2011-04-21 05:59

[QUOTE=Xyzzy;259157]PrimeNet keeps giving us work like "foo,68,69" with an occasional "bar,69,70". How do we get work that takes more time?[/QUOTE]Hop on over to [url]http://v5www.mersenne.org/manual_assignment/[/url] and ask for TF work on exponents between [url=http://www.mersenneforum.org/showthread.php?t=10693]332198357 and 332245261[/url] (Uncwilly's pet project). If you want to give it something to chew on for a while, just change your "foo,68,69" to "foo,68,79" (or however much you want to chew on a single exponent; PrimeNet will currently assign this range up to 2^77; Uncwilly wants to eventually [url=http://www.mersenneforum.org/showpost.php?p=259154&postcount=285]take it up to 2^82[/url]).

I've given up doing TF that high, since it takes a week to do just 2^78-2^79 on my 8800GT; your boxes should fare much better.
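
For a sense of scale, here is a rough sketch of how much extra work a wider bit range is. It assumes the usual rule of thumb that each additional bit level costs about twice as much as the one below it (the number of candidate factors doubles). The function name is made up for illustration, and the estimate ignores per-class overhead and anything PrimeNet-specific.

```python
def relative_tf_cost(bit_min, bit_max):
    """Work for TF from 2^bit_min to 2^bit_max, in units of the first bit level.

    Assumption: the level 2^k..2^(k+1) costs twice as much as 2^(k-1)..2^k.
    """
    if bit_max <= bit_min:
        raise ValueError("bit_max must be greater than bit_min")
    # Levels are bit_min -> bit_min+1, ..., bit_max-1 -> bit_max.
    return sum(2 ** (k - bit_min) for k in range(bit_min, bit_max))

print(relative_tf_cost(68, 69))  # 1
print(relative_tf_cost(68, 79))  # 2047
```

Under that assumption "foo,68,79" is roughly 2047 times the work of "foo,68,69", and the last level alone (2^78 to 2^79) is about half of the total.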

Ralf Recker 2011-04-21 07:25

New Linux drivers released...
 
[LEFT]FYI: The new Linux 270.41.06 drivers were released yesterday. One funny(?) entry from the readme
[LIST][*]Fixed a bug causing the X server to hang every 49.7 days on 32-bit platforms.[/LIST]Looks like the classic 32 bit unsigned int millisecond counter overflow.
[/LEFT]
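
The 49.7 figure is what you get if a millisecond counter is kept in a 32-bit unsigned integer. A quick check of the arithmetic (just a sketch; the readme does not say which counter was actually at fault):

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 ms in a day

# A 32-bit unsigned counter holds 2**32 distinct values before wrapping to 0.
wrap_days = 2 ** 32 / MS_PER_DAY
print(f"{wrap_days:.1f} days")  # 49.7 days
```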

Xyzzy 2011-04-21 09:01

Remember [URL="http://news.cnet.com/Windows-may-crash-after-49.7-days/2100-1040_3-222391.html"]this[/URL]?

:max:

James Heinrich 2011-04-21 09:10

[QUOTE=Xyzzy;259178]Remember [URL="http://news.cnet.com/Windows-may-crash-after-49.7-days/2100-1040_3-222391.html"]this[/URL]?[/QUOTE]Vividly. :smile:
It amused me no end at the time that nobody discovered that Win95 would crash [i]for a known, specific, explainable reason[/i] after 7 weeks of uptime until 3.5 years after its release. Spoke volumes about its stability :razz:
And yet NT4, which was out at the same time, was marvelously stable.

xilman 2011-04-21 10:16

[QUOTE=James Heinrich;259179]And yet NT4, which was out at the same time, was marvelously stable.[/QUOTE]There's an old joke about the stability of NT4 which I won't post here because of the family-friendly constraint. Mail me for a copy if you wish.

NT4 was stable as long as you didn't want to change anything, otherwise it had to be rebooted. Even something as simple as changing the IP address required a reboot.

NT4 was the standard operating system running at MSR when I joined them as a sysadmin. It wasn't too bad, but it wasn't anywhere near as good as some would claim. It most certainly wasn't suitable for use by the great majority of Microsoft's customers.

Paul

Christenson 2011-04-21 12:35

[QUOTE=James Heinrich;259179]Vividly. :smile:
It amused me no end at the time that nobody discovered that Win95 would crash [i]for a known, specific, explainable reason[/i] after 7 weeks of uptime until 3.5 years after its release. Spoke volumes about its stability :razz:
And yet NT4, which was out at the same time, was marvelously stable.[/QUOTE]

32-bit timer overflows causing system crashes are still with us; MINIX had one late last year, though it was scaled out to 2 years or so, and it was easy to find by inspection -- the scheduler didn't realize that 2^32-1 ticks was followed by 0 ticks. The same kind of bug also caused WinCE boxes running air traffic control to go down. And both of these are supposed to be reliable!
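
The usual way around this is to work with the difference between two tick counts, taken modulo 2^32, rather than comparing the raw values. A minimal sketch of that idea follows; it is not the minix or WinCE code, the function name is invented for illustration, and it is only valid when the real elapsed time is less than one full wrap.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned tick counter

def ticks_elapsed(start, now):
    """Ticks from start to now, correct across a single wrap to 0."""
    return (now - start) & MASK32

# The counter was 5 ticks short of wrapping and has since wrapped round to 10.
start = 0xFFFFFFFB
now = 10
print(now > start)                # False: the naive comparison thinks time went backwards
print(ticks_elapsed(start, now))  # 15
```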

I still consider myself lucky to get more than a week out of any Windows box before having to take it down for memory leaks or other problems. Xubuntu is on day 38 right now, and will go down for a software upgrade when I get one of those famously rare round tuits.

Xyzzy 2011-04-21 19:03

Here is a performance benchmark test. Our methodology is most certainly flawed!
[LIST]
[*]We used 4 "sequential" exponents from the 57,xxx,xxx range.
[*]We allowed the box to return to idle temperature (CPU/GPU) before each test.
[*]We did not use a checkpoint file so each run is identical, other than the number of instances and which combination of exponents is being used.
[*]We ran the test until we saw "1000/4620" in the "class" column. (Well, it never hit 1000 exactly so we stopped the test when it crossed 1000.)
[*]We used "NumStreams=3" and "CPUStreams=4".
[*]The "time" (ETA) column is from the last line of output we saw. We think that at our stopping point the exponent is ~22% done.
[/LIST]
We forgot how much fun making ASCII boxes is!

[code]╔═════════╤════════╤════════╤════════╤════════╤════════╤══════╗
║instances│cpu_load│gpu_load│ave_rate│cpu_temp│gpu_temp│ time ║
╟─────────┼────────┼────────┼────────┼────────┼────────┼──────╢
║        0│      0%│      0%│     n/a│    29°C│    32°C│   n/a║
║        1│     26%│     52%│  190M/s│    54°C│    52°C│ 9m10s║
║        2│     51%│     92%│  173M/s│    63°C│    62°C│10m08s║
║        3│     76%│     95%│  121M/s│    68°C│    65°C│12m29s║
║        4│    100%│     97%│   97M/s│    71°C│    66°C│14m44s║
╚═════════╧════════╧════════╧════════╧════════╧════════╧══════╝[/code]

Is it sane to use the average rate to determine overall throughput?

[code]╔═════════╤════════════════╗
║instances│   throughput   ║
╟─────────┼────────────────╢
║        1│190 × 1 = 190M/s║
║        2│173 × 2 = 346M/s║
║        3│121 × 3 = 363M/s║
║        4│ 97 × 4 = 388M/s║
╚═════════╧════════════════╝[/code]

We interpret the data above to mean that the CPU is filling the GPU "bucket" faster than the GPU can empty the "bucket". With 2 or more instances the GPU load is nearly topped out.
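
As a rough cross-check of the tables, the same numbers can be run through a few lines of Python. This assumes, as the throughput table does, that every instance runs at the reported average rate; the variable names are made up for this sketch.

```python
# Average per-instance rate (M/s) and GPU load (%) from the benchmark table.
rate = {1: 190, 2: 173, 3: 121, 4: 97}
gpu_load = {1: 52, 2: 92, 3: 95, 4: 97}

# Combined throughput if each of the n instances runs at the average rate.
total = {n: r * n for n, r in rate.items()}

prev = None
for n in sorted(total):
    gain = "" if prev is None else f", +{total[n] - prev}M/s over {n - 1}"
    print(f"{n} instance(s): {total[n]}M/s at {gpu_load[n]}% GPU load{gain}")
    prev = total[n]
```

Going from 1 to 2 instances adds 156M/s; going from 2 to 4 adds only 42M/s more while tying up two more CPU cores.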

We must now decide whether to run 3 or 4 instances. Or maybe 2 instances and 2 trial factoring threads of Prime95?

We are sure there are a lot of things we have overlooked.

:max:

firejuggler 2011-04-21 19:18

I would say you should use 2 instances of mfaktc. After that, you don't get much speed improvement. In addition, you could use the computer as usual instead of losing responsiveness.
It seems that, after 2 instances, you encounter a GPU bottleneck, so get a GTX 680 or whatever the next generation is ;p. And after that, you will be CPU bound.

James Heinrich 2011-04-21 19:47

Agreed. Run 2 instances, and use the remaining two CPU cores for other work, but [i]not[/i] TF (the small (42M/s) extra throughput is still plenty more work than the CPUs could do by themselves on TF, so if TF is all you want just run 4 instances). But perhaps it would be more useful to do some P-1 or L-L with the remaining two cores instead.

Ralf Recker 2011-04-21 20:08

I experienced a significant slowdown of both apps (mfaktc and mprime) when I ran a combination of two mfaktc instances on a GTX 470 and two P-1 tasks on a Core 2 Quad in the last few days. The memory interface of my old CPU is possibly a bottleneck.

TheJudger 2011-04-21 20:45

Sorry for my late reply, I was on a business trip.

[QUOTE=Xyzzy;259039]
Running one instance of the self test uses around 215W and 25-26% processor resources with an i5 2500. We did modify the ini file to set it from 3 to 10 NumStreams. We have no clue what that means. With 3 NumStreams the processor was nearly idle.
[/QUOTE]
Don't try to maximize the utilisation during the selftest. :wink:
NumStreams should be OK.

[QUOTE=Xyzzy;259039]
We spent all night and today trying to get the development system running under Linux. We followed every HOWTO and tried every distribution recommended by Nvidia. We were unsuccessful but we will soldier on.
...
Compiling mfaktc in Linux went perfectly. We think the problem we are experiencing is that we cannot "talk" to the GPU. We have /dev populated and all of the environment variables set and all of the libraries in the right places and stuff but we got very weird errors. The GPU shows up with 'lspci' and the Nvidia module shows up with 'lsmod'.
[/QUOTE]

Did you try to run 'nvidia-smi -a' as normal user and as root? If there is no X on the GPU it is common that the device is not created properly. This can be fixed with some udev fun...
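
A quick way to see whether the device nodes exist and are usable by the current user is something like the sketch below. It assumes the driver's usual node names (/dev/nvidiactl, /dev/nvidia0, ...); the helper name is made up, and it only reports what it finds, it does not create anything.

```python
import glob
import os

def check_nvidia_nodes(dev_dir="/dev"):
    """Return {path: readable_and_writable} for nvidia* entries in dev_dir."""
    nodes = sorted(glob.glob(os.path.join(dev_dir, "nvidia*")))
    return {p: os.access(p, os.R_OK | os.W_OK) for p in nodes}

if __name__ == "__main__":
    found = check_nvidia_nodes()
    if not found:
        print("no /dev/nvidia* nodes found: the device files were not created")
    for path, ok in found.items():
        print(f"{path}: {'ok' if ok else 'no read/write access for this user'}")
```

Running it once as a normal user and once as root shows whether it is a missing-node problem or just a permissions problem.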

[QUOTE=Xyzzy;259039]
Anyways, it is good to know they work but it is distressing that we are having so many issues with the Linux install. In an ideal world, we would like to install with Debian but most likely we will try an older version (11.1) of OpenSUSE since that is what Oliver is using. We are fairly familiar with SUSE because we used to use the for-pay SUSE Enterprise Desktop deal.
[/QUOTE]
If you choose openSUSE I would recommend 11.2, which is officially supported by Nvidia CUDA... 11.3 *should* work once you install gcc-4.3.

[QUOTE=Xyzzy;259039]
Attached is a copy of a self test. Perhaps it is useful. The Windows install is a clean install with no modifications other than the Nvidia stuff.
[/QUOTE]
Looks fine to me. :smile:

Oliver

