mersenneforum.org  

2019-04-12, 20:00   #12
ATH
Einyen
Dec 2003
Denmark
5347₈ Posts

I think you made a mistake, because I do not get any spikes when plotting the number of factors with k=1 in bins of 1M. Spikes would also not fit the heuristic estimate for the number of Sophie Germain primes below n: 1.32032 * n / (ln n)^2
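For anyone who wants to check the heuristic numerically, here is a minimal Python sketch. It is not the script used for the attached plot; the function names are made up for illustration. It compares 1.32032 * n / (ln n)^2 against a brute-force count of primes p < n for which 2p + 1 is also prime. The heuristic is asymptotic, so expect only rough agreement for small n.

```python
import math


def sophie_germain_estimate(n):
    """Heuristic count of Sophie Germain primes below n: 1.32032 * n / (ln n)^2."""
    return 1.32032 * n / math.log(n) ** 2


def prime_sieve(limit):
    """Return a bytearray where sieve[i] == 1 iff i is prime, for 0 <= i <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Zero out every multiple of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return sieve


def count_sophie_germain(n):
    """Brute-force count of primes p < n for which 2p + 1 is also prime (n >= 2)."""
    sieve = prime_sieve(2 * n)  # 2p + 1 <= 2n - 1 for every p < n
    return sum(1 for p in range(2, n) if sieve[p] and sieve[2 * p + 1])


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, count_sophie_germain(n), round(sophie_germain_estimate(n), 1))
```

The estimate varies smoothly with n, so counts taken in equal-width bins should not show sudden jumps.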
Attached thumbnail: k=1.png
2019-04-13, 17:59   #13
neomacdev
Apr 2019
2·3² Posts

Quote:
Originally Posted by ATH View Post
I think you made a mistake because I do not get any spikes when plotting number of factors with k=1 in bins of 1M...
Yes. The plot certainly seemed odd, and I suspected there might be a problem. It turned out to be the line endings in the file. Your file must have come from Windows: it had CR+LF (\r\n) line endings, and that caused a problem when I read it into Sage w/ Python on my Mac. Once I stripped the line endings, I got a plot very similar to yours. Thanks AGAIN.
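A minimal Python sketch of this kind of pitfall, not the actual Sage code: splitting Windows-style text on "\n" alone leaves a stray "\r" on every record, while `str.splitlines()` plus `strip()` copes with both LF and CR+LF. The "exponent,k" column layout and the helper name `parse_lines` are assumptions made up for the example; the real file's format is not shown in this thread.

```python
def parse_lines(raw):
    """Split raw text into non-empty records, tolerating LF and CR+LF line endings."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


# Hypothetical "exponent,k" records with Windows (CR+LF) line endings.
windows_text = "11,1\r\n23,1\r\n83,1\r\n"

# Naive split: every record keeps a trailing "\r", and there is an empty last entry.
naive = windows_text.split("\n")

# Tolerant parse: clean records regardless of the line-ending style.
clean = parse_lines(windows_text)

if __name__ == "__main__":
    print(naive)
    print(clean)
```

A leftover "\r" can make later field comparisons or conversions misbehave without raising an obvious error, which may be why the symptom showed up in the plot rather than as a crash.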
Attached thumbnail: CountsOfFactorsForK=1PerMillionExponents.png
2019-04-14, 00:06   #14
Madpoo
Serpentine Vermin Jar
Jul 2014
3·5·7·31 Posts

Quote:
Originally Posted by axn View Post
Scraping the website for large data extraction is strongly discouraged as it puts unnecessary load on the servers. You can contact user madpoo who is the sysadmin with your exact request, and he will be able to either create a custom extract or point to an existing extract that will fulfill your need.
If the request won't take too much time, and you're willing to be patient until I can get to it, I can generally create a customized extract of whatever data you need. Odds are you'll still get it before you could crawl it all from the website, and it'll be in some manageable form (CSV, maybe XML, and I should play with exporting data as JSON at some point). It also keeps the server from getting hammered for days on end.
2019-04-23, 07:23   #15
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
110110000001₂ Posts

Quote:
Originally Posted by neomacdev View Post
ATH,

Looking at the factors where k=1 is actually something I wanted to do. When you pointed out that ~1,655,600 out of ~41,933,619 known factors were generated from k=1, that piqued my interest, and I wanted to look at how those factors were distributed by exponent range. So I made a little plot and right away noticed something interesting...

I grouped the exponents into ranges from 1-999999, 1000000-1999999, ..., 900000000-999999999, and then counted the number of occurrences of factors for k=1 amongst the exponents in each range, and made a plot. (attached)

There is a really interesting transition at exponents around 5 million, 50 million, and 500 million, where the frequency of factors for k=1 sharply increases before starting to slowly decrease again. I can't think of any logical reason for this behavior, but it will be something fun to look at more closely & try to explain.
This has to do with how you chose the bins. You apparently used one bin per leading decimal digit of the exponent, repeated across several powers of ten. Replot with bins of equal ratio instead: for n bins covering a power of ten, make the ratio of each bin's upper limit to its lower limit the nth root of ten. I think the curve will then be rather smooth and flat, with some statistical fluctuation. This is another example of the Newcomb-Benford law, https://en.wikipedia.org/wiki/Benford%27s_law. You're plotting the number of k=1 factors in the chosen intervals, for f = 2kp+1, so naturally the transitions (discontinuities) in frequency per leading-digit bin are at p ~ 10^n/2. The chosen bins have different widths on a log scale: log10(1.9999.../1.0000...) ≈ 0.30103, while log10(9.9999.../9.0000...) is only ≈ 0.04576, a ratio of about 6.58:1.
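The equal-ratio binning described above could be sketched in Python roughly as follows. This is an illustration only: the helper names `equal_ratio_edges`, `log_width` and `histogram` are invented here, and the exponent range in the demo is an assumption. Each edge is the previous one multiplied by 10^(1/n), so every bin has the same width on a log scale.

```python
import math
from bisect import bisect_right


def equal_ratio_edges(low, high, bins_per_decade):
    """Bin edges from low to high; each edge is 10**(1/bins_per_decade) times the previous."""
    n_bins = round(math.log10(high / low) * bins_per_decade)
    return [low * 10 ** (i / bins_per_decade) for i in range(n_bins + 1)]


def log_width(lo, hi):
    """Width of the interval [lo, hi) on a log10 scale."""
    return math.log10(hi / lo)


def histogram(values, edges):
    """Count values falling in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


if __name__ == "__main__":
    # Leading-digit bins are unequal on a log scale: about 0.30103 vs about 0.04576.
    print(log_width(1, 2), log_width(9, 10), log_width(1, 2) / log_width(9, 10))
    # Assumed range for the demo: 10 equal-ratio bins per decade from 1e6 to 1e9.
    print(equal_ratio_edges(1e6, 1e9, 10)[:4])
```

The exponents with a k=1 factor would be passed to `histogram` as `values`, with the edges from `equal_ratio_edges`.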

Last fiddled with by kriesel on 2019-04-23 at 07:53
2019-04-23, 14:12   #16
Batalov
"Serge"
Mar 2008
Phi(3,3^1118781+1)/3
11·19·43 Posts

Quote:
Originally Posted by neomacdev View Post
I can't think of any logical reason for this behavior, but it will be something fun to look at more closely & try to explain.
Your sampling size at these boundaries changes by a factor of 10, so the curve has sharp bumps by a factor of 10. (The curve of expected densities is roughly logarithmic. And if you remove sampling bias, it will be.)
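One way to take out that kind of sampling bias, sketched in Python under the assumption that the plot uses raw counts per bin, is to divide each count by its bin width. The helper name `counts_to_density` and the numbers in the demo are synthetic, chosen only to show a 10x jump in raw counts disappearing after normalization.

```python
def counts_to_density(counts, edges):
    """Convert per-bin counts to counts per unit of exponent range."""
    if len(edges) != len(counts) + 1:
        raise ValueError("need exactly one more edge than counts")
    return [c / (hi - lo) for c, lo, hi in zip(counts, edges, edges[1:])]


# Synthetic illustration: the bin width grows from 1M to 10M at 10,000,000.
edges = [8_000_000, 9_000_000, 10_000_000, 20_000_000, 30_000_000]
counts = [10, 10, 100, 100]  # made-up counts with a 10x jump at the boundary

densities = counts_to_density(counts, edges)

if __name__ == "__main__":
    print(counts)     # raw counts jump by a factor of 10
    print(densities)  # per-unit densities are flat
```

With real data the densities would not be exactly flat, but any step caused purely by a change in bin width should vanish.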
2019-04-23, 14:48   #17
Uncwilly
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
3·2,591 Posts

Quote:
Originally Posted by Batalov View Post
Your sampling size at these boundaries changes by a factor of 10, so the curve has sharp bumps by a factor of 10. (The curve of expected densities is roughly logarithmic. And if you remove sampling bias, it will be.)
Quote:
Originally Posted by neomacdev View Post
Your file must have come from Windows: it had CR+LF (\r\n) line endings, and that caused a problem when I read it into Sage w/ Python on my Mac. Once I stripped the line endings, I got a plot very similar to yours. Thanks AGAIN.
Serge, you are late to the party. The problem was already found out.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.