MLucas, submit results?
Perhaps I'm being really dumb here, and forgive me if I am, but the readme page gives instructions for submitting results, yet the link/button to press isn't there, so to speak.
Anyone know what page/link I use to submit results obtained running Mlucas? Thanks in advance. |
[URL]http://www.mersenne.org/manual_result/[/URL] accepts Mlucas inputs.
(admittedly, it's not at all visible that it accepts things other than Prime95/Mprime lines, but it lists "Mlucas lines found" after you give it results) Remember to log in to the site first so the credit goes to you and not ANONYMOUS. |
Gratitude
Ah, thanks for that. For whatever reason, as I looked around the site, I just didn't see that one.
Cheers. |
Cannot submit Mlucas results.txt
I have just finished my 1st LL test with Mlucas but it seems the primenet manual check-in form does not recognize the results.
Does anyone know why? Here is my results.txt:
[QUOTE]INFO: primary restart file p67773569 not found...looking for secondary...
INFO: no restart file found...starting run from scratch.
INFO: primary restart file p67773569 not found...looking for secondary...
INFO: no restart file found...starting run from scratch.
INFO: primary restart file p67773569 not found...looking for secondary...
INFO: no restart file found...starting run from scratch.
INFO: primary restart file p67773569 not found...looking for secondary...
INFO: no restart file found...starting run from scratch.
INFO: primary restart file p67773569 not found...looking for secondary...
INFO: no restart file found...starting run from scratch.
M67773569 Roundoff warning on iteration 5175394, maxerr = 0.421875000000
M67773569 Roundoff warning on iteration 8635898, maxerr = 0.437500000000
M67773569 Roundoff warning on iteration 16462455, maxerr = 0.437500000000
M67773569 Roundoff warning on iteration 18912472, maxerr = 0.437500000000
M67773569 Roundoff warning on iteration 28470196, maxerr = 0.468750000000
Retrying iteration interval to see if roundoff error is reproducible.
M67773569 Roundoff warning on iteration 28470008, maxerr = 0.500000000000
Retrying iteration interval to see if roundoff error is reproducible.
M67773569 Roundoff warning on iteration 29200282, maxerr = 0.437500000000
M67773569 Roundoff warning on iteration 40912707, maxerr = 0.437500000000
M67773569 Roundoff warning on iteration 45488771, maxerr = 0.437500000000
M67773569 Roundoff warning on iteration 50502214, maxerr = 0.437500000000
M67773569 Roundoff warning on iteration 61959752, maxerr = 0.421875000000
M67773569 is not prime. Res64: 0A85E13CEFD7xxxx. Program: E14.1
M67773569 mod 2^36 = 5556347xxxx
M67773569 mod 2^35 - 1 = 18548321243
M67773569 mod 2^36 - 1 = 13658393030[/QUOTE] |
[QUOTE=alexvong1995;402162]I have just finished my 1st LL test with Mlucas but it seems the primenet manual check-in form does not recognize the results.
Does anyone know why?[/QUOTE] Is that quoted text what you tried to paste into the results-submit form? Also, please PM me your p67773569.stat file - I want to look into that fatal-error retry around midway through, to see whether the code handled it properly (this is a recently-added feature). The exponent is close to the default breakover point between the 3584K and 3840K FFT lengths, so the first dangerous [0.46875] roundoff error is not surprising, but the 0.500 one on the ensuing interval-retry needs looking into. (Your result should still be fine if the code switched to the larger 3840K FFT length at that point.) What kind of hardware (and on how many cores) did you run this on? |
1 Attachment(s)
Thanks author.
Yes, the quoted text is what I submitted. The .stat file is in the attachment; I cannot PM it because PMs do not support attachments and the file is too large to paste in-line.
[CODE]$ uname -a
Linux debian 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt9-3~deb8u1 (2015-04-24) x86_64 GNU/Linux[/CODE]
[CODE]$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 4
model name : AMD Phenom(tm) II X4 945 Processor
stepping : 3
microcode : 0x10000c8
cpu MHz : 3012.706
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate npt lbrv svm_lock nrip_save vmmcall
bogomips : 6025.41
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor : 1
vendor_id : AuthenticAMD
cpu family : 16
model : 4
model name : AMD Phenom(tm) II X4 945 Processor
stepping : 3
microcode : 0x10000c8
cpu MHz : 3012.706
cache size : 512 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate npt lbrv svm_lock nrip_save vmmcall
bogomips : 6025.41
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor : 2
vendor_id : AuthenticAMD
cpu family : 16
model : 4
model name : AMD Phenom(tm) II X4 945 Processor
stepping : 3
microcode : 0x10000c8
cpu MHz : 3012.706
cache size : 512 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate npt lbrv svm_lock nrip_save vmmcall
bogomips : 6025.41
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor : 3
vendor_id : AuthenticAMD
cpu family : 16
model : 4
model name : AMD Phenom(tm) II X4 945 Processor
stepping : 3
microcode : 0x10000c8
cpu MHz : 3012.706
cache size : 512 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate npt lbrv svm_lock nrip_save vmmcall
bogomips : 6025.41
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate[/CODE] |
[QUOTE=alexvong1995;402162]I have just finished my 1st LL test with Mlucas but it seems the primenet manual check-in form does not recognize the results.
Does anyone know why?[/QUOTE]I had not seen a result line with that Mlucas version number. I have adjusted the manual results form to accept a wider range of Mlucas version numbers. I also re-submitted your result for you: [url]http://www.mersenne.org/M67773569[/url] |
Alex, thanks - I am rerunning your exponent on my Haswell quad; run 4-cored it needs ~9 ms/iter. Currently around iter 6M, no roundoff warnings yet, but note that the FMA-enabled build I'm running has slightly lower roundoff errors than your non-FMA one. (I can't tell at a glance from your CPU diagnostics whether this model supports AMD's FMA4, but in any event I have no intention of coding for that, i.e. my code supports only the Intel-introduced FMA3.)
Now, re. the history captured in your stat file:

o I see lots of restarts - not in itself unusual, depending on system usage and stability - but also some one-restart-after-another intervals like this:

[quote][Mar 29 03:04:09] M67773569 Iter# = 390000 clocks = 00:06:46.013 [ 0.0406 sec/iter] Res64: 756756DA0FC6E775. AvgMaxErr = 0.239570977. MaxErr = 0.343750000
Restarting M67773569 at iteration = 390000. Res64: 756756DA0FC6E775
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
Restarting M67773569 at iteration = 390000. Res64: 756756DA0FC6E775
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
[Mar 29 03:16:17] M67773569 Iter# = 400000 clocks = 00:06:14.689 [ 0.0375 sec/iter] Res64: E3228230BA5452AB. AvgMaxErr = 0.239515427. MaxErr = 0.343750000
Restarting M67773569 at iteration = 400000. Res64: E3228230BA5452AB
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
...
[i][dozen more such 4-line restart msgs snipped][/i]
...
Restarting M67773569 at iteration = 400000. Res64: E3228230BA5452AB
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
[Mar 29 03:43:11] M67773569 Iter# = 410000 clocks = 00:06:24.857 [ 0.0385 sec/iter] Res64: 689B7E4ABD60FE46. AvgMaxErr = 0.239397778. MaxErr = 0.343750000[/quote]

Were you playing with overclocking or some other system-setup stuff here?

o The fatal-error occurrence about midway through is different from what I expected. Instead of being in the midst of some ongoing run (e.g. due to some kind of system/power/cosmic-ray glitch), in your case it happens almost immediately upon restart from the checkpoint file:

[quote][Apr 18 06:38:39] M67773569 Iter# = 28470000 clocks = 00:06:17.968 [ 0.0378 sec/iter] Res64: 5FA073737FB6FF79. AvgMaxErr = 0.239484814. MaxErr = 0.375000000
Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
M67773569 Roundoff warning on iteration 28470196, maxerr = 0.468750000000 <*** 196 iters into restart ***
Retrying iteration interval to see if roundoff error is reproducible.
Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79 <*** I need to look into why the restart message gets printed twice ***
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
M67773569 Roundoff warning on iteration 28470008, maxerr = 0.500000000000 <*** On retry, get a different fatal error (both error level and iteration #) ***
Retrying iteration interval to see if roundoff error is reproducible.
Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
[Apr 18 15:52:10] M67773569 Iter# = 28480000 clocks = 00:06:42.782 [ 0.0403 sec/iter] Res64: 74D0B0E5536739D7. AvgMaxErr = 0.239356785. MaxErr = 0.375000000[/quote]

According to the notes-to-self in the code logic I use here, after the code hits the 0.50 error on the first retry, it should 'throw up its hands' and exit, as this sort of scenario is usually a sign of data corruption and requires at least a code restart, and possibly a system reboot. According to the above, the code instead did one more retry - again with the duplicate messaging - and this one was successful (last line of the quoted snip). I shall investigate - we'll know (based on my DC run) in around 48 hours whether your run got past the above glitch OK.

BTW, the code should have deposited persistent savefiles with custom suffixes every 10 Miters, so if my DC indicates the above fatal error somehow hosed your run, you can still rerun from the 20-Miter checkpoint file - but first let's wait for my DC to get to iter 28480000.

James, thanks for the server-side update. Note that Mlucas version numbers will henceforth have a major index reflecting the calendar year, i.e. 14.*** means a 2014 release. (14.0 was the first 2014 release; 14.1 was the second, last December.) |
Thanks for the primenet submission!
Yes, some of the things you mentioned did happen. I had data corruption back then (or maybe not, since ext4 has journaling) because of a power failure. I remember that during fsck several lines of 'clearing orphan inodes blablabla' were printed, but nothing bad happened afterwards. My computer was over-clocked during winter from 3000MHz to 3300MHz. I switched it back to 3000MHz when summer arrived because the summer in HK is hot and humid. I don't think that is the cause of the round-off errors, since I double-checked some of my mprime results and they appear to be correct. (The only bad result was when I over-clocked my cpu to 3600MHz.) One day my computer failed to boot and I found out it was due to a ram slot failure. I transferred the ram from ram slots 1, 3 to ram slots 2, 4 and my computer booted again and passed memtest. |
Alex, our runs match through iter 19M ... still a day from seeing whether your run made it through the fatal-error-and-interval-retry at 28.47M intact, but that does mean the first three 0.4375-level roundoff errors you hit were all benign. (I allow this as the largest acceptable maxerr based on a fair bit of statistics, but still treat it as "threat level yellow" in terms of keeping an eye on such runs.)
My run has encountered no warning-level roundoffs so far, i.e. all ROEs have been < 0.40. [QUOTE=alexvong1995;402235]Yes, some of the things you mentioned did happen. I had data corruption back (or not since ext4 has journaling) then because of power failure. I remembered during fsck, serveral lines of 'clearing orphan inodes blablabla' was printed, but nothing bad happen afterwards. My computer was over-clocked during winter from 3000MHz to 3300MHz. I switched it back to 3000MHz when summer arrived because the summer in HK is hot and humid. I don't think it is the cause of the round-off errors since I double-checked some of my mprime results and they appears to be correct. (The only bad result was when I over-clocked my cpu to 3600MHz)[/QUOTE] I wasn't concerned about the ROEs <= 0.4375 (except if there were more than 10 or so of the largest of these); those all look consistent ... just the anomalously large fatal-ROE-almost-immediately-after-restart at iter 28.47M. Haven't had time yet to set up a debug simulation of that scenario - I've been working on a couple of must-fix bugs in my newly parallelized TF code. Found/fixed the last of those a few hours ago, so I will focus on the fatal-ROE/retry stuff tomorrow. |
Alex, I inserted some debug code into the mers_mod_square.c function (which has the key roundoff-error handling logic) to make it easy to simulate the issue which I thought you had encountered in your run-restart at iter = 28.47M. Here are the highlights of that - first I did 10000 iterations (with no debug code yet inserted) of the exponent used for the 2304K self-test (unimportant, I just picked one that was visible in my build/test xterm):

[i]M44207087: using FFT length 2304K = 2359296 8-byte floats.
this gives an average 18.737405988905167 bits per digit
Using complex FFT radices 288 16 16 16
[May 14 11:47:28] M44207087 Iter# = 10000 clocks = 00:01:09.224 [ 0.0069 sec/iter] Res64: C2FA6D57AA3DF3F1. AvgMaxErr = 0.276980321. MaxErr = 0.375000000[/i]

I killed the run at this point and added some debug code which made the control logic 'think' a ROE = 0.46 was hit on [restart iter + 78], then simulated a different error on the ensuing retry of the iteration interval. Then I restarted under gdb (allowing me to step through the relevant code) - everything behaved as designed; here is the stat-file output:

[i]Restarting M44207087 at iteration = 10000. Res64: C2FA6D57AA3DF3F1
M44207087: using FFT length 2304K = 2359296 8-byte floats.
this gives an average 18.737405988905167 bits per digit
Using complex FFT radices 288 16 16 16
M44207087 Roundoff warning on iteration 10078, maxerr = 0.460000000000
Retrying iteration interval to see if roundoff error is reproducible.
Restarting M44207087 at iteration = 10000. Res64: C2FA6D57AA3DF3F1
M44207087: using FFT length 2304K = 2359296 8-byte floats.
this gives an average 18.737405988905167 bits per digit
M44207087 Roundoff warning on iteration 10026, maxerr = 0.500000000000[/i]

Now the only (minor) bug I found was that the final informational message-to-user that gets printed after the 2nd roundoff error only got printed to stdout. I fixed that in my local code; here is the message that should have gotten printed at this point (followed by exit):

[i]The error is not reproducible, but encountered a different ROE in the retry of the interval ... as this is an indicator of likely data corruption, quitting. Please restart the program at your earliest convenience.[/i]

Notice one thing about the above diagnostics: the "Using complex FFT radices" line gets printed just once, because the retries in question all use the original FFT length. If the ROE encountered is fatal (> 0.4375) but "normal", i.e. simply a result of the FFT length and the data in question leading to a too-high error level, then you would see the same error (both magnitude and iteration #) on the retry attempt, and then you would see an added diagnostic indicating the program has switched to the next-higher FFT length and restarted from the checkpoint in question. In your data we don't see any next-larger-FFT-length stuff:

[i]Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
M67773569 Roundoff warning on iteration 28470196, maxerr = 0.468750000000
Retrying iteration interval to see if roundoff error is reproducible.
Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit[/i]

I now see that I was parsing this wrong - what I thought was duplicate printing of the last 6 lines above (the same 3 lines repeated twice) should instead most probably (can't be 100% sure since I wasn't there) be read as follows:

[i]M67773569 Roundoff warning on iteration 28470196, maxerr = 0.468750000000
Retrying iteration interval to see if roundoff error is reproducible.
Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
[*** above run got killed ***]
[*** start of a new run: ***]
Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32
M67773569 Roundoff warning on iteration 28470008, maxerr = 0.500000000000
Retrying iteration interval to see if roundoff error is reproducible.[/i]

In other words, you hit almost-instant fatal errors - but not of the reproducible kind - on 2 successive restart attempts. Perhaps you were playing with OCing and had the clock speed jacked up higher than safe here. We then see what appears to be the standard "retry to see if reproducible" messaging, but again with 2x duplication of a 2-line sequence, and I don't have a ready explanation for that 2x-ing - do you pipe stdout to the stat file, perhaps?

[i]Restarting M67773569 at iteration = 28470000. Res64: 5FA073737FB6FF79
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit[/i]

The next few lines don't clear up my puzzlement:

[i]Using complex FFT radices 224 16 16 32
M67773569: using FFT length 3584K = 3670016 8-byte floats.
this gives an average 18.466832024710520 bits per digit
Using complex FFT radices 224 16 16 32[/i]

The 2 "Using complex FFT radices" lines make it look like the code got restarted while the above interval-retry was still in progress. Anyhoo, my run just hit iter 28480000, and the result matches yours, so we can rest easy about the above funny business possibly corrupting your run data. "The integrity of our customers' run data is our #1 priority." :) |
Yeah! So I have stepped through that fatal error and retry safely. I really need to learn gdb so that I can also debug my code like you did. Pointers and segfaults are the most confusing things to newbie C programmers. [URL="http://c2.com/cgi/wiki?ThreeStarProgrammer"]Three Star Programmer[/URL] is a good old joke. Regarding the duplicated lines, I think I should not be piping stdout to the stat file. Actually, I run Mlucas with this command in tty1:[CODE]$ nice ./Mlucas &> /dev/null[/CODE]My .bashrc is set up as follows: if it is tty1, run Mlucas with the above command. The ttys are set up to auto-login. Will piping to /dev/null affect anything?
|
Building Mlucas for i386 on amd64
1 Attachment(s)
[QUOTE=ewmayer]
Again, I would be interested in seeing the specific error message(s) you get - this sounds like it might be the ran-out-of-registers issue I mentioned, since unoptimized builds need more registers for the C code surrounding the inline-asm.[/QUOTE] (The message became too long, so I have to post it here.) Yes, I think you are right: optimization makes better use of registers, thus eliminating the out-of-registers error. About the i386 build, I have managed to build after applying a patch to the source code (found by trial-and-error), building with [CODE]$ gcc -m32 -Di386_build -o mlucas *.c -lm[/CODE]I tried to build without -DUSE_SSE2 and -DUSE_THREADS because that is the case where Mlucas almost gets built (90/95 source files compile, vs 5x/95 in the -DUSE_SSE2, -DUSE_THREADS or both attempts). I found problems in radix44_ditN_cy_dif1.c and radix176_ditN_cy_dif1.c: the compiler complains about undeclared a0, a1 ... a9, b0, b1 ... b9 in the expansion of radix44_main_carry_loop.h and radix176_main_carry_loop.h respectively. So what I did was to copy the declaration inside the #ifdef USE_SSE2 ... #endif to in front of the #include of radix44_main_carry_loop.h, and it seems to work, almost magically. The problems and the solutions for radix44_ditN_cy_dif1.c and radix176_ditN_cy_dif1.c are identical. Besides that, there is also a problem in get_fft_radices.c. When I try to compile, there are two case 7 labels:
[CODE]case 7 :
	numrad = 6; rvec[0] = 16; rvec[1] = 8; rvec[2] = 8; rvec[3] = 8; rvec[4] = 8; rvec[5] = 16; break;
#ifndef USE_ONLY_LARGE_LEAD_RADICES
case 7 :
	numrad = 5; rvec[0] = 8; rvec[1] = 16; rvec[2] = 16; rvec[3] = 32; rvec[4] = 16; break;[/CODE]
So when USE_ONLY_LARGE_LEAD_RADICES is defined, there will be two case 7 labels, causing a compiler error. I tried to add something like #ifdef USE_ONLY_LARGE_LEAD_RADICES. It does work, but Mlucas will segfault after it finishes self-testing via [CODE]$ ./mlucas -s m[/CODE]So instead I tried adding -DUSE_SSE2 when compiling this particular source file, and that works just fine. But then I saw that -DUSE_SSE2 does nothing here but #define USE_ONLY_LARGE_LEAD_RADICES, so I added
[CODE]#if ... || defined(i386_build)
#define USE_ONLY_LARGE_LEAD_RADICES
#endif[/CODE]
instead, to simplify things. Finally, when I do the self-testing mentioned above, Mlucas exits after a fatal error when using 144 as one of its radices, and the error message comes from
[CODE]default :
	sprintf(cbuf,"FATAL: radix %d not available for ditN_cy_dif1. Halting...\n",radix_vec0);
	fprintf(stderr,"%s", cbuf);
	ASSERT(HERE, 0,cbuf);[/CODE]
in mers_mod_square.c. The solution I attempted is to add a case 144 by copying the case 288 below it. This prevents Mlucas from exiting after the fatal error occurs; instead, a fatal error caused by an insane ROE occurs, and Mlucas halts testing and tries another radix set instead of 144. This is what I found out last night. The patch is included in the attachment. I define the new feature-test macro i386_build just to prevent the new changes from breaking existing code. Maybe I will post the self-testing results after I am done with it. |
1 Attachment(s)
Alex, I reran the first ~30M iterations of your test on my Haswell quad and the rest on my new Broadwell NUC; the final result matches yours. The max roundoff error I encountered using an AVX2-mode (that is, an FMA-using) build was 0.375 (3 such during the run). Zipped .stat file attached.
|
[QUOTE=ewmayer;403315]Alex, I reran the first ~30M iterations of your test on my Haswell quad and the rest on my new Broadwell NUC, final result matches yours. Max roundoff error I encountered using an AVX2-mode (that is an FMA-using) build was 0.375 (3 such during the run). Zipped .stat file attached.[/QUOTE]
I'm doing a full double-check of M67773569 on Prime95 so we should see how that does in ~58 hours. |
[QUOTE=Madpoo;403735]I'm doing a full double-check of M67773569 on Prime95 so we should see how that does in ~58 hours.[/QUOTE]
Done, it matched. [URL="http://www.mersenne.org/M67773569"]M67773569[/URL] |
Thanks for the verify - I had no doubts after my DC, but because my code isn't officially allowed to do both the 1st and 2nd runs (no power-of-2 residue shift is used), you just made it official.
Does your logfile indicate what the max ROE for your run was? |
[QUOTE=ewmayer;403940]Thanks for the verify - I had no doubts after my DC, but because my code isn't officially allowed to do both the 1st and 2nd runs (no power-of-2 residue shift is used), you just made it official.
Does your logfile indicate what the max ROE for your run was?[/QUOTE] I'm just running Prime95, so it would only have been noted if it went over 0.4 at some point. I don't have that saved, but it doesn't happen often and I tend to remember when it does - well, not a specific exponent, but whether I've seen one at all in the past few days. And I don't think I saw any the day I checked that one in. I *think* it used the 3584K FFT size. Again, I kind of remember paying attention to that since it came up, and I made a mental note of it. |