mersenneforum.org > Data Error rate plot

2015-12-26, 23:17   #78
Madpoo
Serpentine Vermin Jar

Jul 2014

3,313 Posts

Quote:
 Originally Posted by henryzz If I had the data I could persuade R to produce a graph displaying this information.
What kind of data do you need specifically?

Something like a count of how many bad tests out of the total for each FFT size? I could do that, I guess, although Primenet doesn't track the FFT size that a test was done at. It would have to use the default FFT size that it would pick, but at those boundaries that wouldn't really be the size it actually used.

I'd probably have to dig up a list of what the FFT boundaries are. I'm sure there's one already out there somewhere?

2015-12-27, 00:34   #79
Dubslow
Basketry That Evening!

"Bunslow the Bold"
Jun 2011

40
2015-12-27, 19:39   #80
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

1720₁₆ Posts

Quote:
 Originally Posted by Madpoo What kind of data do you need specifically? Something like a count of how many bad tests out of the total for each FFT size? I could do that I guess although Primenet doesn't track the FFT size that a test was done at. It would have to use the default FFT size that it would pick, but at those boundaries that wouldn't really be the size it actually used. I'd probably have to dig up a list of what the FFT boundaries are. I'm sure there's one already out there somewhere?
For starters I would need the data used to generate patrik's graphs, in order to add vertical lines where fft boundaries lie. I assume that the exponents have been divided into chunks (by exponent range or by number of exponents) in order to work out the percentage with errors in each chunk. If you were to provide the data in fairly small chunks, I can experiment with combining chunks to make a slightly smoother graph as appropriate.

In terms of fft lengths there are several different lists for different architectures. I would probably produce a graph for each architecture. Lots will have been done for each architecture so hopefully any patterns won't be masked too badly.
Approximate endpoints for each fft length would be needed to plot them.

What I suspect we might see is a spike just before some of the fft length boundaries. A graph like this might show that sort of thing up.

Something like csv or xls would be easy to import into R.

2015-12-28, 00:42   #81
Madpoo
Serpentine Vermin Jar

Jul 2014

CF1₁₆ Posts

I'll toss out some random stats if it helps. As of right now, there are a total of 66,089 known bad results. Here's the count broken down by 1e6 increments:

Code:
Range      Bad Count
0                145
1000000          689
2000000         1181
3000000         1454
4000000         1839
5000000         1959
6000000         1835
7000000         2076
8000000         1942
9000000         1617
10000000        1725
11000000        1871
12000000        2064
13000000        2081
14000000        2174
15000000        2587
16000000        2291
17000000        2223
18000000        2101
19000000        2000
20000000        1857
21000000        1822
22000000        1892
23000000        1796
24000000        1704
25000000        1703
26000000        1845
27000000        1720
28000000        1550
29000000        1418
30000000        1444
31000000        1497
32000000        1392
33000000        2455
34000000        1821
35000000        1183
36000000         740
37000000         526
38000000         486
39000000         320
40000000         221
41000000         183
42000000         112
43000000          92
44000000          44
45000000          46
46000000          53
47000000          49
48000000          30
49000000          33
50000000          17
51000000          11
52000000          17
53000000          20
54000000          14
55000000          22
56000000           8
57000000          11
58000000          18
59000000           9
60000000           5
61000000           3
62000000           7
63000000           2
64000000           2
65000000           2
66000000           5
67000000           2
68000000           2
69000000           2
71000000           1
72000000           2
73000000           4
76000000           1
78000000           5
79000000           2
83000000           1
89000000           1
100000000          4
101000000          1

And the *known* good counts by the same range (sorry I didn't combine them into one table, I was being lazy).
Recall that the # of known good results will be roughly double the # of actual exponents, since they involved at least 2 matching tests (occasionally 3 matches when we did an independent 3rd check, but close enough to double):

Code:
Range      Good Count
0               70168
1000000         62046
2000000         46320
3000000         45971
4000000         45905
5000000         45835
6000000         45965
7000000         44384
8000000         44549
9000000         44392
10000000        44168
11000000        44729
12000000        44542
13000000        44804
14000000        44836
15000000        45225
16000000        44981
17000000        44543
18000000        44848
19000000        43571
20000000        44136
21000000        44833
22000000        43910
23000000        43620
24000000        44025
25000000        43529
26000000        44084
27000000        43513
28000000        43598
29000000        42894
30000000        42772
31000000        42598
32000000        42296
33000000        41575
34000000        41730
35000000        27383
36000000        22151
37000000        15622
38000000        15559
39000000        13603
40000000         7398
41000000         5051
42000000         1912
43000000         1163
44000000          823
45000000          776
46000000          765
47000000          773
48000000          769
49000000         2231
50000000          446
51000000          570
52000000          314
53000000          588
54000000          505
55000000          580
56000000          275
57000000          389
58000000         1080
59000000          273
60000000          319
61000000          407
62000000          410
63000000          425
64000000          382
65000000          447
66000000          187
67000000          135
68000000           84
69000000          234
70000000           98
71000000          185
72000000          177
73000000           96
74000000           57
75000000           11
76000000           17
77000000           30
78000000           10
79000000            8
80000000            2
82000000            2
83000000            4
88000000            5
89000000            2
90000000            3
91000000            5
99000000            2
100000000          13
101000000           4
102000000           3
111000000           3
150000000           3
191000000           3
194000000           3
195000000           3
196000000           3
332000000           2
345000000           2
383000000           3

Last fiddled with by Madpoo on 2015-12-28 at 00:43
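Purely as an illustration of what these two tables allow: the per-test error rate in each range is bad / (bad + good), which gives roughly 0.2%, 1.1% and 2.5% for the first three ranges. A minimal sketch, with only the first three rows of each table copied in by hand:

```python
# Error rate per 1M range from Madpoo's two tables (first three rows copied
# in from the post above; extend the dicts with the remaining rows).
bad_counts  = {0: 145, 1_000_000: 689, 2_000_000: 1181}
good_counts = {0: 70168, 1_000_000: 62046, 2_000_000: 46320}

def rate_per_range(bad, good):
    # Per-test rate. Since each verified exponent contributes ~2 good
    # results, a per-exponent error rate would be roughly double this.
    return {r: 100.0 * bad[r] / (bad[r] + good[r]) for r in bad if r in good}

rates = rate_per_range(bad_counts, good_counts)
```

Note this is a rate per result, not per exponent, because of the roughly-doubled good counts Madpoo mentions.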
2015-12-28, 13:00   #82
patrik

"Patrik Johansson"
Aug 2002
Uppsala, Sweden

5²×17 Posts

I updated the files that the links in posts #28 and #29 point to when I made the new plot. The C program I use is unchanged, so if you have access to a Linux system (with awk and gcc) you can use that program.

Instructions:

Download the zip files and the source. Compile the source.
Code:
pjoh@kappa:~/Error_rate/test> cc error_rate_v6.c -o error_rate_v6
Unzip the data files.
Code:
pjoh@kappa:~/Error_rate/test> unzip Nbad.zip
Archive:  Nbad.zip
  inflating: nbad.txt
pjoh@kappa:~/Error_rate/test> unzip Nhrf3.zip
Archive:  Nhrf3.zip
  inflating: nhrf3.txt
pjoh@kappa:~/Error_rate/test> unzip Nlucas_v.zip
Archive:  Nlucas_v.zip
  inflating: nlucas_v.txt
Start the program from the directory where the data files are, possibly redirecting the output to a file of your choice. If you put the binary in the same directory, it will look like
Code:
pjoh@kappa:~/Error_rate/test> ./error_rate_v6 > error_rates_50k_zero.txt
You can also change the width of the classes by giving it as the only argument.
Code:
pjoh@kappa:~/Error_rate/test> ./error_rate_v6 10000 | head
1     1 281 0 0 1 0
10001 1 397 0 0 1 0
20001 0 384 0 0 0 0
30001 0 536 0 0 0 0
40001 0 508 0 0 0 0
50001 0 560 0 0 0 0
60001 0 464 0 0 0 0
70001 0 432 0 0 0 0
80001 0 532 0 0 0 0
90001 0 572 0 0 0 0
The columns mean:
- Start of range.
- # of bad tests.
- # of verified tests.
- # of unverified tests exceeding one per exponent.
- # of unverified tests.
- # of bad tests with zero error code.
- # of unverified tests with zero error code exceeding one per exponent.

Before I make the plot I then use a text editor to manually remove the end of the file (to which I redirect the output), where there are so few tests that the scatter is large.
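A minimal parser for those rows, assuming the seven-column order patrik lists above; it computes the share of bad tests among bad + verified in each class:

```python
def class_error_rate(line):
    """line: one row of error_rate_v6 output (seven whitespace-separated
    integers, in the column order patrik lists above).
    Returns (range_start, percent of bad among bad + verified tests)."""
    start, bad, verified, _unv_extra, _unv, _bad_zero, _unv_zero_extra = map(int, line.split())
    tests = bad + verified
    return start, (100.0 * bad / tests if tests else 0.0)

# First row of the example output above: 1 bad out of 282 (1 + 281) tests.
start, rate = class_error_rate("1 1 281 0 0 1 0")
```

Mapping this over the redirected output file gives exactly the (x, y) series the plots are drawn from, and makes the trailing sparse classes easy to filter numerically instead of by hand.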
2015-12-30, 06:51   #83
Madpoo
Serpentine Vermin Jar

Jul 2014

3,313 Posts

By the way, in case you wanted this info to help track things... I posted tables that had the good/bad counts broken down by 1M increments, so here are the current counts of unknown and suspect.

Unknown:
Code:
1M_Range   UnkCount
34000000         41
35000000       7227
36000000       9643
37000000      13156
38000000      13290
39000000      14065
40000000      16954
41000000      18103
42000000      19453
43000000      20046
44000000      19815
45000000      20065
46000000      20009
47000000      19946
48000000      19983
49000000      19299
50000000      19870
51000000      19587
52000000      19928
53000000      19752
54000000      19640
55000000      19757
56000000      20041
57000000      19681
58000000      19465
59000000      19787
60000000      19638
61000000      19524
62000000      19324
63000000      19349
64000000      19239
65000000      19066
66000000      18754
67000000      13506
68000000       4993
69000000       9261
70000000       5105
71000000       5758
72000000      11667
73000000       4350
74000000       3635
75000000       1843
76000000       2073
77000000       1362
78000000        824
79000000        478
80000000         60
81000000         20
82000000          6
83000000          4
85000000          2
86000000          8
88000000         24
89000000          3
90000000          2
91000000          4
92000000          8
93000000          2
95000000         10
98000000          3
99000000          1
100000000        30
101000000         1
102000000         1
103000000         6
109000000         2
111000000         2
112000000         1
113000000         1
116000000         1
120000000         1
121000000         1
122000000         2
123000000         2
125000000         2
128000000         1
130000000         1
131000000         1
147000000         1
150000000         2
165000000         1
167000000         2
177000000         1
179000000         1
222000000         1
265000000         1
270000000         1
322000000         1
332000000        61
333000000         6
399000000         1

Suspect:
Code:
1M_Range   SusCount
35000000        120
36000000        210
37000000        295
38000000        300
39000000        297
40000000        361
41000000        328
42000000        369
43000000        293
44000000        314
45000000        256
46000000        275
47000000        239
48000000        228
49000000        190
50000000        185
51000000        128
52000000        125
53000000        106
54000000         67
55000000         58
56000000         66
57000000         76
58000000         58
59000000         67
60000000         66
61000000         66
62000000         53
63000000         57
64000000         46
65000000         44
66000000         53
67000000         74
68000000         60
69000000         75
70000000         47
71000000         22
72000000         58
73000000         31
74000000         31
75000000         17
76000000         11
77000000         16
78000000         11
79000000          2
88000000          1
123000000         1
332000000         3
340000000         1
595000000         1
2016-01-03, 23:55   #84
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2⁵×5×37 Posts

I have managed to get the C program to compile under Cygwin. I ran into a couple of problems:
- Discovered my PATH variable was null. After a bit of googling I discovered that this was because it had exceeded 2047 characters.
- The index function is no longer available. Added #define index(a,b) strchr((a),(b)) to make it compile.

Will get back to this when I have time. I have an exam coming up that takes priority (which probably means that this will get done quickly).

Still need a list of fft lengths and their approximate maximum n.
2016-01-04, 00:22   #85
chalsall
If I May

"Chris Halsall"
Sep 2002

37·269 Posts

Quote:
 Originally Posted by henryzz The index function is no longer available. Added #define index(a,b) strchr((a),(b)) to make it compile.
Wow! Really?

Are arrays really that difficult under C?

2016-01-04, 01:37   #86
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2⁵·5·37 Posts

Quote:
 Originally Posted by chalsall Wow! Really? Are arrays really that difficult under C?
???

Also discovered that an escape character (^) was needed before > in the awk commands.

2016-01-04, 16:14   #87
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2⁵×5×37 Posts

I found fft lengths within http://www.mersenneforum.org/showpos...2&postcount=35. This is only the old x86 and SSE2 boundaries, but it is better than nothing. Anything after that is probably going to be less clear anyway, as I think there are more fft lengths these days. If anyone has further suggestions/ideas for graphs, give me a shout.
Attached Thumbnails
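Given such a list, one way to use it is a lookup from exponent to the FFT length whose boundary covers it; the list's maxima are then also the x positions for the vertical lines on the graph. The boundary values below are placeholders, not the real x86/SSE2 tables, so substitute the pairs from the linked post:

```python
import bisect

# Hypothetical exponent -> FFT-length lookup. BOUNDARIES holds
# (max_exponent, fft_length) pairs sorted by max_exponent; the values here
# are PLACEHOLDERS, not the real x86/SSE2 table from the linked post.
BOUNDARIES = [(1_000_000, 48), (2_000_000, 96), (4_000_000, 192)]  # placeholder

def fft_length(exponent):
    """Smallest FFT length whose maximum exponent covers this exponent,
    or None if the exponent is beyond the table."""
    maxima = [m for m, _ in BOUNDARIES]
    i = bisect.bisect_left(maxima, exponent)
    return BOUNDARIES[i][1] if i < len(BOUNDARIES) else None
```

With this in place, tagging each error-rate chunk with its FFT length (or drawing a vertical line at each maximum) is a one-liner in whichever plotting tool is used.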
2016-01-04, 18:06   #88
chalsall
If I May

"Chris Halsall"
Sep 2002

37·269 Posts

Quote:
 Originally Posted by henryzz ???
Sorry... I was trying to make a joke. As has been previously noted, I often fail...

Arrays are a pain in the ass under C. Even one-dimensional arrays have a tendency to overrun. Multi-dimensional arrays require some creativity.
