mersenneforum.org (https://www.mersenneforum.org/index.php)
-   NFS@Home (https://www.mersenneforum.org/forumdisplay.php?f=98)
-   -   Corrupt data files discovered during post-processing (https://www.mersenneforum.org/showthread.php?t=23575)

fivemack 2018-08-15 17:27

Corrupt data files discovered during post-processing
 
When I run

[code]
gunzip -c L3655A.dat.gz | nl | grep -a "000000[^0-9a-f,]" | tee every-millionth
[/code]

it suggests that there is a large block of corrupt data in the middle of L3655A.dat.gz, meaning that there are only 378M or so lines in the file despite the log reporting 455M relations.
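A complementary check is to count how many lines actually look like relations, rather than sampling every millionth line. The following is a minimal sketch, assuming the usual GGNFS-style layout `a,b:hex,hex,...:hex,hex,...`; the names `RELATION`, `count_relations` and `count_file` are hypothetical, and anything that does not match the pattern (including comment lines, if present) is counted as bad.

```python
import gzip
import re

# Assumed relation layout: "a,b:" then two comma-separated lists of hex factors.
RELATION = re.compile(rb"-?\d+,\d+:[0-9a-fA-F,]*:[0-9a-fA-F,]*")


def count_relations(lines):
    """Return (good, bad) counts for an iterable of byte-string lines."""
    good = bad = 0
    for raw in lines:
        # fullmatch requires the whole line to fit the assumed layout.
        if RELATION.fullmatch(raw.rstrip(b"\r\n")):
            good += 1
        else:
            bad += 1
    return good, bad


def count_file(path):
    """Count good and bad lines in a gzip-compressed relation file."""
    with gzip.open(path, "rb") as fh:
        return count_relations(fh)
```

Calling `count_file("L3655A.dat.gz")` would give a usable-relation count to compare against the figure in the log. If the gzip stream itself is damaged, `gzip` will raise an exception partway through instead of returning counts.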

I'm having a bit of difficulty working out which Q-values are affected by this: some versions of the sieving client put the special-Q at the end of the list of factors and some sort the factors numerically, so given a line it's not trivial to find the special-Q it came from.

My inclination is to write a better special-Q-finder, figure out the corrupt ranges, and sieve them myself with 16e; I'm not sure I'll get that done before I go to Australia in eight days.
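One possible shape for such a special-Q finder is sketched below. It does not rely on where the special-Q sits in the factor list. Instead it assumes (and this is only an assumption) that relations from the same special-Q are written on consecutive lines, so the special-Q is a factor in the sieved Q range that a line shares with one of its neighbours. The function names, the `side` argument and the `qmin`/`qmax` bounds are hypothetical.

```python
def parse_factors(line, side):
    """Return the set of factors (as ints) on one side of a relation line.

    side=1 is the first hex list after "a,b", side=2 the second.
    """
    fields = line.strip().split(":")
    return {int(tok, 16) for tok in fields[side].split(",") if tok}


def guess_special_q(lines, side, qmin, qmax):
    """Guess the special-Q of each line; None where the heuristic cannot decide."""
    # Keep only factors that could be a special-Q at all.
    cands = [
        {p for p in parse_factors(ln, side) if qmin <= p <= qmax}
        for ln in lines
    ]
    guesses = []
    for i, cand in enumerate(cands):
        before = cands[i - 1] if i > 0 else set()
        after = cands[i + 1] if i + 1 < len(cands) else set()
        shared = cand & (before | after)
        if len(shared) == 1:
            guesses.append(next(iter(shared)))   # unique factor shared with a neighbour
        elif len(cand) == 1:
            guesses.append(next(iter(cand)))     # only one factor in range
        else:
            guesses.append(None)                 # ambiguous or nothing in range
    return guesses
```

Lines that come back as `None` would need a closer look; a large prime that happens to fall in the Q range and recur on the next line could also fool this heuristic.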

pinhodecarlos 2018-08-15 17:41

Is it possible to cheat on sieving?

fivemack 2018-08-15 21:38

On a first analysis, there are no special-Q larger than 966392000 in the output file, despite sieving supposedly having been done to 1200M.

I don't see any suspiciously large gaps before Q=880M or so.
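Once a special-Q has been recovered for each relation, spotting missing ranges comes down to looking for large jumps between consecutive Q values. A minimal sketch follows, assuming the Q values are already available as integers; `find_gaps` and the `min_gap` threshold are hypothetical, and a sensible threshold depends on how densely special-Q values normally yield relations.

```python
def find_gaps(qs, q_start, q_end, min_gap):
    """Return (lo, hi) intervals within [q_start, q_end] where no special-Q
    was seen and hi - lo is at least min_gap."""
    gaps = []
    prev = q_start
    for q in sorted(set(qs)):
        if q < q_start or q > q_end:
            continue  # ignore values outside the range being checked
        if q - prev >= min_gap:
            gaps.append((prev, q))
        prev = q
    # The stretch after the last seen Q can itself be a gap.
    if q_end - prev >= min_gap:
        gaps.append((prev, q_end))
    return gaps
```

With `q_end=1200000000` and the largest Q in the file being 966392000, the final interval reported would be `(966392000, 1200000000)`, matching the missing tail described above.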

fivemack 2018-08-15 21:46

With my embarrassed hat on, I also need to point out that I forgot to put an lss: 0 directive in the polynomial file, so all the sieving was done on the lower-yielding side. I wondered why the yield was so low.

I will fix this, it will take a few tens of thousands of core hours at my side but that seems a reasonable penalty for me to pay.

RichD 2018-08-15 21:51

[QUOTE=fivemack;493978]I will fix this, it will take a few tens of thousands of core hours at my side but that seems a reasonable penalty for me to pay.[/QUOTE]

You are only human like the rest of us. I have no problem letting the grid help this number out...

swellman 2018-08-16 00:20

Agreed. Penance is for sins, not simple mistakes. Use the grid.

No stones being thrown here!

VBCurtis 2018-08-16 01:11

On my home copies of lasieve4e, Qmax is near 1060M. I get errors above that and the sieve exits without trying any Q's. Perhaps someone might test-sieve Q=1100M or 1200M to see if that's even possible on the BOINC-ified sievers?

frmky 2018-08-16 07:04

[QUOTE=VBCurtis;493990]Perhaps someone might test-sieve Q=1100M or 1200M to see if that's even possible on the BOINC-ified sievers?[/QUOTE]
It's not possible on all 14e and 15e versions. IIRC the 64-bit Linux and FreeBSD clients will work, but the Mac and Windows ones will fail.

fivemack 2018-08-16 07:24

OK: my computers are doing the small gaps and the grid job L3655Ab is taking the strain. I will have to manage some of this on an iPad from the other side of the planet, which might be less efficient than the most efficient protocol but should work.

fivemack 2018-08-17 09:57

I have replaced L3655Ab with L3655Ac, which has a more sensible sieving region and for which, importantly, I remembered to put the alim: and rlim: lines in the polynomial file.

fivemack 2018-10-13 10:57

Another data quality issue: 5009_73m1 only has 345131720 usable relations despite a claim of 450653495, and so I can't yet build a matrix. I will investigate which regions are missing and put in a new sieving job.


All times are UTC.