|
|
#1 |
|
Jul 2003
2×5 Posts |
Dear list
I'm trying to run Mlucas on an old Digital Alpha Server 800 5/500 running Linux Red Hat 7.2 (for Alpha). I use the ev4 pre-compiled binary from Ernst Mayer's source code timings page: http://hogranch.com/mayer/gimps_timings.html

According to the homepage the following steps are needed:

1) Download the programs you need
1a) The Mlucas configuration file
2) Run self-tests

Everything up to and including the self-test works fine, even though the screen output is a bit messy. The prime calculations are working and I get exact matches for the control strings.

3) Get exponents from PrimeNet
1) CONNECT TO THE PRIMENET SERVER
1a) CREATE AN ACCOUNT
I already have an account for another computer...
2) SELECT MANUAL TESTS.
3) CHECK OUT EXPONENTS.
I obtained the following tasks and placed them in the worktodo.ini file like this:

--- worktodo.ini ---
DoubleCheck=9819503,64
DoubleCheck=9819533,64
--- end of worktodo.ini ---

4) FACTORING
This is where things start to go wrong... Whether I do the suggested test "Mlucas-2.7b-ev4 2202517 0 80000000000000" or just start the work on the worktodo.ini file, I get the same message. I guess the existence of the worktodo.ini file has priority over the command-line arguments?!?

Anyway, when I run it, it looks like this:

[mhv@DmuAxel selftest]$ ../Mlucas-2.7b-ev4
looking for worktodo.ini file...
worktodo.ini file found...checking next exponent in range...
forrtl: info: Fortran error message number is 59.
forrtl: warning: Could not open message catalog: for_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/for_msg.cat.
forrtl: severe (59): Message not found
[mhv@DmuAxel selftest]$

Obviously something is missing from my computer, namely for_msg.cat. I guess it's Fortran related, but I have no further clue.

Please, how do I get on with employing this wonderful piece of hardware in the search???

:-) Martin@Hvidberg.net |
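For anyone hitting the same forrtl output: the runtime is really reporting two things, error number 59 and the fact that it cannot look up the text for that number. The sketch below only checks the second part, i.e. whether the catalog named in the message exists and what NLSPATH is set to. The function name and its parameters are made up for illustration; the path is the one printed by forrtl above. As far as I know, number 59 in the Compaq/Intel Fortran run-time message list is the list-directed I/O syntax error, which would point at the contents of worktodo.ini rather than at the missing catalog, but treat that as an assumption to verify.

```python
import os


def check_message_catalog(catalog="/usr/lib/for_msg.cat", env=None):
    """Report whether the Fortran runtime message catalog looks reachable.

    'catalog' defaults to the path printed in the forrtl message;
    'env' is a mapping of environment variables (defaults to os.environ).
    """
    env = os.environ if env is None else env
    return {
        "catalog": catalog,
        "exists": os.path.isfile(catalog),
        "readable": os.access(catalog, os.R_OK),
        "NLSPATH": env.get("NLSPATH"),  # None means the variable is unset
    }


if __name__ == "__main__":
    for key, value in check_message_catalog().items():
        print(f"{key}: {value}")
```

If the catalog turns out to be missing, that only explains the "Message not found" line; the underlying error 59 still needs to be dealt with separately.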
|
|
|
|
|
#2 |
|
Jul 2003
2×5 Posts |
Oops: I see just now that I have been using the EV4 pre-compiled binary, where I of course should have used the EV5 one, since the DIGITAL Alpha Server 800 5/500 is based on an Alpha EV5 CPU. This does not remove or change the problem, so please feel free to reply anyhow... Martin@Hvidberg.net |
|
|
|
|
|
#3 | |||||
|
∂2ω=0
Sep 2002
República de California
103·113 Posts |
Hi, Martin:
Quote:
Quote:
9819503 9819533 Quote:
Quote:
Let me know if you have any further problems. I plan to release an (LL-test only) version of Mlucas.c by the end of the month, so once you get that you can expect your runtimes to drop dramatically vs. version 2.7b. Or, if you don't mind doing a quick build on your Alpha, do anon-ftp to hogranch.com, then

cd pub/mayer/src/C
mget *.h *.c

and answer 'y' to all the individual file prompts of mget. Here are the alpha build instructions from the comments at the top of the Mlucas.c file:
Quote:
Mlucas
15060013
768 <== FFT length (in K)
1 <== '1' here means 'do a short timing test, rather than launching a full LL test'
99 <== If comparing the result to a Fortran-version Res64, use 99 iterations, rather than 100
0 <== Index of the FFT radix set to try - also try with 1,2,3... and use the one that gives the best timing for your full-length LL tests at this FFT length, by adding it to your mlucas.cfg file as described by the README instructions.
1 <== 1 here to turn on per-iteration roundoff error checking, 0 to turn it off

For the above run, I get
[code:1]
Mlucas 2.7c
ftp://hogranch.com/pub/mayer/README.html#2
INFO: Using prefetch.
looking for worktodo.ini file...
no worktodo.ini file found...switching to interactive mode.
Enter exponent >15060013
Enter FFT length in K (set K = 0 for default FFT length) >768
Enter 0 to run a full LL test, any other integer for a self-test >1
Enter number of iterations for timing test >99
Enter index of radix set to be used for the FFT: (See file fft_radix.txt for a list of available choices; enter -1 to get the default) >0
Enter 1 to enable per-iteration error checking, 0 for no error checking >1
p is prime...proceeding with Lucas-Lehmer test...
M15060013: using FFT length 768K = 786432 8-byte floats.
this gives an average 19.149796803792317 bits per digit
INFO: Using real*16 for FFT sincos and DWT weights tables inits.
Using complex FFT radices 6 16 16 16 16
99 iterations of M15060013 with FFT length 786432
Res64: B7BECF87319A0EB8. AvgMaxErr = 0.210626132. Program: E2.7c
Clocks = 00:00:08.233
[/code:1]
and note that the Res64 matches that of the README self-test table.

Good luck,
-Ernst |
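For readers new to the terminology in that output, here is a minimal sketch of the two quantities it mentions: the Lucas-Lehmer test itself and the "bits per digit" figure. It is plain big-integer Python, not the FFT-based arithmetic Mlucas uses, so it is only practical for small exponents; the function names are made up for illustration. It does not attempt to reproduce the Res64 value, since the exact residue convention Mlucas uses is not spelled out in this thread.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for an odd prime exponent p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):          # a full test takes p - 2 squarings
        s = (s * s - 2) % m
    return s == 0


def bits_per_digit(p, fft_length_k):
    """Average number of bits of the p-bit number stored per FFT element."""
    return p / (fft_length_k * 1024)


if __name__ == "__main__":
    for p in (7, 11, 13):
        print(p, lucas_lehmer(p))
    # 768K = 786432 floats, as in the run above
    print(bits_per_digit(15060013, 768))
```

The second function is just the exponent divided by the number of 8-byte floats, which is how the 19.15 bits per digit in the log comes about.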
|||||
|
|
|
|
|
#4 |
|
Jul 2003
2·5 Posts |
Hi Ernst
Thanks for your thorough and helpful reply. It seems that it's now working, after I edited the worktodo.ini according to your instructions.

I would like to go for Mlucas 2.7c since you promise better performance. I have downloaded all the .c and .h files and am planning a compile. I'm kind of new to this platform, but I would like to use the Compaq/HP compiler ccc, as they claim that it's better than cc and far better than gcc.

I was just looking at your suggested compiler options and comparing them with the ccc man page. It seems to make sense with something like:

ccc -o Mlucas-2.7c-ev56 -O4 -inline speed -assume accuracy_sensitive -unroll 1 -arch ev56 -tune ev56 -Olimit 100000 *.c -lm

Only I can't find the "-assume accuracy_sensitive" option in the ccc man page? Even "man -K accuracy_sensitive" comes out blank.

When compiling with the above statement, I get a lot (14) of messages all saying: "In this statement, type long double has the same representation as type double on this platform. (longdoublenyi)" I don't know if this means trouble, or can just be ignored?

When I run the example you suggest in your comments I get the following:

---8<---
[mhv@DmuAxel ver27c]$ ./Mlucas-2.7c-ev56
Mlucas 2.8x
http://hogranch.com//mayer/README.html#2
INFO: Using prefetch.
looking for worktodo.ini file...
no worktodo.ini file found...switching to interactive mode.
Enter exponent >15060013
Enter FFT length in K (set K = 0 for default FFT length) >768
Enter 0 to run a full LL test, any other integer for a self-test >1
Enter number of iterations for timing test >99
Enter index of radix set to be used for the FFT: (See file fft_radix.txt for a list of available choices; enter -1 to get the default) >0
Enter 1 to enable per-iteration error checking, 0 for no error checking >1
p is prime...proceeding with Lucas-Lehmer test...
M15060013: using FFT length 768K = 786432 8-byte floats.
this gives an average 19.149796803792317 bits per digit
INFO: Using real* 8 for FFT sincos and DWT weights tables inits.
Using complex FFT radices 6 16 16 16 16
99 iterations of M15060013 with FFT length 786432
Res64: B7BECF87319A0EB8. AvgMaxErr = 0.210626132. Program: E2.8x
Clocks = 00:01:06.288
Done ...
---8<---

It seems to be the same Res64 as you get, but the performance sucks... 1 min 6 sec compared to your 8 sec. What platform were you using?

It should be noted that another instance of Mlucas was running on the same machine at the same time.

[mhv@DmuAxel ver27c]$ ps -A | grep 'Mlucas'
21689 pts/5 00:52:34 Mlucas-2.7b-gen
9850 pts/4 00:00:36 Mlucas-2.7c-ev5

It also says 2.8x? Is it 2.7c or 2.8x, and should I care?

I'll not try to compile with the -fast option, since the ccc man page says that float operations may give different results! But other ideas are of course very welcome...

Best Regards & Thanks again
Martin@Hvidberg.net |
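To put the two "Clocks" figures on a common footing, a small sketch like the following converts them to seconds and to time per iteration. The helper name is made up; the two clock strings and the 99-iteration count are the ones quoted in this thread. Keep in mind that the slower figure was measured while another Mlucas instance was running, so the ratio overstates the real difference between the machines.

```python
def clocks_to_seconds(clocks):
    """Convert an 'HH:MM:SS.mmm' string, as printed after 'Clocks =', to seconds."""
    hours, minutes, seconds = clocks.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


ITERATIONS = 99  # length of the timing test used in both runs

ernst = clocks_to_seconds("00:00:08.233")    # 1st run quoted in the thread
martin = clocks_to_seconds("00:01:06.288")   # 2nd run, on a loaded machine

if __name__ == "__main__":
    print(f"Ernst:  {1000 * ernst / ITERATIONS:.1f} ms/iter")
    print(f"Martin: {1000 * martin / ITERATIONS:.1f} ms/iter")
    print(f"ratio:  {martin / ernst:.2f}x")
```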
|
|
|
|
|
#5 |
|
Jul 2003
2×5 Posts |
Hi Ernst
I have tried some different compiler options; see below. As you can see, I can get as low as <42 sec by stripping irrelevant options and inserting "-inline speed". I can get nowhere near your 8 sec, but I assume you are using a fast machine? I'll probably have a closer look at the -inline options...

:-) Martin

--- Different compiler options ---

ccc-ernst: The one you suggested, slightly changed to fit ccc and ev56.
ccc -o Mlucas-2.7c-ev56 -O4 -inline speed -assume accuracy_sensitive -unroll 1 -arch ev56 -tune ev56 -Olimit 100000 *.c -lm
> Clocks = 00:00:42.271

ccc-plain: Stripped -inline, -assume and -Olimit options
ccc -o Mlucas-2.7c-ev56plain -O4 -unroll 1 -arch ev56 -tune ev56 *.c -lm
> Clocks = 00:00:45.975

ccc-emmh: As ccc-ernst, but with the -assume option stripped
ccc -o Mlucas-2.7c-ev56emmh -O4 -inline speed -unroll 1 -arch ev56 -tune ev56 -Olimit 100000 *.c -lm
> Clocks = 00:00:42.016

ccc-emmh2: As ccc-emmh, but also with the -Olimit option stripped
ccc -o Mlucas-2.7c-ev56emmh2 -O4 -inline speed -unroll 1 -arch ev56 -tune ev56 *.c -lm
> Clocks = 00:00:41.898 |
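The four builds above can be ranked with a few lines of Python. This is only a sketch for reading the numbers: the build labels and clock values are copied from the post, the helper names are made up, and with a single run per build (and possibly other load on the machine) differences well under one percent are probably noise rather than a real effect.

```python
def to_seconds(clocks):
    """Convert an 'HH:MM:SS.mmm' Clocks string to seconds."""
    hours, minutes, seconds = clocks.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Timings as posted, one 99-iteration run per build
timings = {
    "ccc-ernst": "00:00:42.271",
    "ccc-plain": "00:00:45.975",
    "ccc-emmh": "00:00:42.016",
    "ccc-emmh2": "00:00:41.898",
}


def fastest(results):
    """Return the label of the build with the shortest run time."""
    return min(results, key=lambda name: to_seconds(results[name]))


def gain_over(results, name, baseline="ccc-plain"):
    """Percentage of run time saved by 'name' relative to 'baseline'."""
    base = to_seconds(results[baseline])
    return 100.0 * (base - to_seconds(results[name])) / base


if __name__ == "__main__":
    for name in sorted(timings, key=lambda n: to_seconds(timings[n])):
        print(f"{name:10s} {to_seconds(timings[name]):7.3f} s "
              f"{gain_over(timings, name):5.1f}% vs ccc-plain")
```

Read this way, the three builds with "-inline speed" sit within about 1% of each other, and all of them are roughly 8 to 9% ahead of the plain build.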
|
|
|
|
|
#6 |
|
∂2ω=0
Sep 2002
República de California
103·113 Posts |
Hi, Martin:
Glad you got it to compile. Yes, the long double warnings are ignorable.

I did my timing run on a 1 GHz ev68, which should be 2 to 2.5x faster per-cycle than an ev56. And yes, having another instance of Mlucas running on your machine might well throw off your timings, especially on a small-cache, relatively low-bandwidth machine like the ev56. I suggest you do your timing tests on an otherwise idle system.

I hadn't played with the -inline flag much recently, since I didn't recall it making any appreciable difference when I tried it on the ev6, but based on your results it may be worth trying it out. In hindsight, I *would* expect the degree of inlining to make more of a difference on the ev56, due to its tiny (8 kB L1 data and 8 kB L1 instruction, 96 kB mixed D/I L2) caches. If you see a similar speedup from -inline speed on an idle system, I'll modify my compile tips appropriately. I'm going to also play with this on the ev6.

Also note that Mlucas.c savefiles are not compatible with Fortran-version ones, so the sooner you can deploy the C binary on your system, the fewer wasted cycles you'll have (since the C version will likely be sufficiently faster that it'll make sense to simply rerun your current exponent using the newer code.) |
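As a rough sanity check on those numbers, the 8.233 s timing can be scaled by the clock ratio and by the quoted 2 to 2.5x per-cycle factor to see what an ev56 might be expected to deliver. The 500 MHz figure below is an assumption read off the "5/500" in the machine's model name, and the function name is made up; everything else is taken from the thread.

```python
ERNST_SECONDS = 8.233          # 99 iterations on the 1 GHz ev68 (from the thread)
EV68_MHZ = 1000                # from the thread
EV56_MHZ = 500                 # ASSUMPTION: inferred from the "5/500" model name
PER_CYCLE_RANGE = (2.0, 2.5)   # ev68 vs. ev56 per-cycle advantage (from the thread)


def expected_ev56_seconds(per_cycle_factor):
    """Scale the ev68 timing by clock ratio and per-cycle factor."""
    return ERNST_SECONDS * (EV68_MHZ / EV56_MHZ) * per_cycle_factor


low, high = (expected_ev56_seconds(f) for f in PER_CYCLE_RANGE)

if __name__ == "__main__":
    print(f"expected ev56 time for 99 iterations: {low:.1f} to {high:.1f} s")
```

Under that assumption the expected range works out to roughly 33 to 41 seconds, so the best timing of just under 42 seconds reported earlier in the thread is close to the slow end of what this estimate predicts.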
|
|
|
|
|
#7 |
|
∂2ω=0
Sep 2002
República de California
103×113 Posts |
I see no performance difference using -inline speed on either ev56 or ev6 under Tru64 Unix. According to the compiler manpages, -inline speed is the default, so this makes sense. Now I'm curious as to why Martin would see a timing difference when building with this flag, since it's supposed to be invoked by default anyway.
Ah, I just tried the same option (using ccc) on an Alpha/Linux ev6 system, and there I do see a small (~4%) speedup. Alas, I don't have access to an ev56 running Linux, but as soon as Martin redoes his timings without any other jobs running on his system, we'll know what kind of speedup to expect on that platform. |
|
|
|
|
|
#8 | |
|
Jul 2003
2·5 Posts |
Quote:
I most likely had a web browser and an xterm open, but they were doing nothing. The system is rather limited on RAM and uses 90%+ just having Linux running, so maybe that explains the bad performance. I can't reach the system from here, but if it serves a purpose I'll try to rerun the earlier quoted test on an "as ideal as possible" system on Monday.

How bad is this performance anyway? I have only Ernst's 8 sec as a reference, but what about other systems? Is the system so slow that I should consider unplugging it for good? Or maybe reinstall Linux in a no-GUI version?

:-) Martin |
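One way to judge "how bad" is to extrapolate the 99-iteration timing to a full Lucas-Lehmer test, which needs p - 2 squarings for exponent p. The sketch below does that for the self-test exponent 15060013 at the 768K FFT length, using the two timings quoted in this thread; the function name is made up. It says nothing directly about the exponents near 9.8M that were actually checked out, since those would normally run at a shorter FFT length for which no timing appears here.

```python
SECONDS_PER_DAY = 86400


def full_test_days(exponent, seconds_per_99_iterations):
    """Extrapolate a 99-iteration timing to a full LL test of 'exponent'."""
    per_iteration = seconds_per_99_iterations / 99
    return (exponent - 2) * per_iteration / SECONDS_PER_DAY


if __name__ == "__main__":
    # 41.898 s: best ev56 build reported above; 8.233 s: the 1 GHz ev68 run
    print(f"ev56: {full_test_days(15060013, 41.898):.0f} days")
    print(f"ev68: {full_test_days(15060013, 8.233):.0f} days")
```

So at that FFT length the slower machine would need on the order of ten weeks per exponent versus about two weeks for the faster one, ignoring savefile writes and any other load.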
|
|
|
|
|
|
#9 | |
|
∂2ω=0
Sep 2002
República de California
103·113 Posts |
Quote:
|
|
|
|
|
|
|
#10 |
|
∂2ω=0
Sep 2002
República de California
103×113 Posts |
Neglected to mention in my previous post about building and self-testing: If your system is unloaded (or has a constant load running on it), you can use the automated self-test feature of Mlucas.c to run through all available radix combos for each FFT length in a decently wide range and determine which is best at each length - just parse the timings for the various radix combos at each FFT length and modify your mlucas.cfg accordingly.
Mlucas -s {s|m|l|x} (pick one of the latter letters) runs through lengths 128-512K, 576-2048K, 2304-4096K and 4608-8192K, respectively. 'Mlucas -s a' runs through all of these. Haven't had time to pull it all together in an updated readme file, unfortunately, hence this dribs-and-drabs trickle of information, for which I apologize. If only I could get paid to do this full-time... |
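The "parse the timings and pick the best radix combo" step could be sketched as follows. The letter-to-range table is taken from the post above; the function names and the sample timings are hypothetical, and the sketch deliberately stops short of writing mlucas.cfg, because the layout of that file is not described in this thread.

```python
# FFT length ranges (in K) covered by each self-test letter, per the post above
SELF_TEST_RANGES_K = {
    "s": (128, 512),
    "m": (576, 2048),
    "l": (2304, 4096),
    "x": (4608, 8192),
}


def range_letter(fft_length_k):
    """Return the -s letter whose range contains this FFT length, or None."""
    for letter, (low, high) in SELF_TEST_RANGES_K.items():
        if low <= fft_length_k <= high:
            return letter
    return None


def best_radix_sets(timings):
    """Map each FFT length to the radix-set index with the shortest time.

    'timings' maps (fft_length_k, radix_index) -> seconds.
    """
    best = {}
    for (fft_k, radix), seconds in timings.items():
        if fft_k not in best or seconds < best[fft_k][1]:
            best[fft_k] = (radix, seconds)
    return {fft_k: radix for fft_k, (radix, _) in best.items()}


# HYPOTHETICAL timings, for illustration only (not measured on any machine)
sample = {
    (768, 0): 41.9, (768, 1): 40.2, (768, 2): 43.0,
    (1024, 0): 58.1, (1024, 1): 60.4,
}

if __name__ == "__main__":
    print(range_letter(768))
    print(best_radix_sets(sample))
```

The resulting FFT-length-to-radix-index pairs are what would then go into mlucas.cfg, following the README instructions.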
|
|
|