mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   FactorDB (https://www.mersenneforum.org/forumdisplay.php?f=94)
-   -   downloading primes? (https://www.mersenneforum.org/showthread.php?t=20043)

mnh001 2015-02-06 00:04

downloading primes?
 
I've noticed on factordb that you can download primes but only 5000 at a time. It would
take forever to get them all that way. Is there a way to download all 100M+ at once? And in text format?

Uncwilly 2015-02-06 00:15

[QUOTE=mnh001;394615]I've noticed on factordb that you can download primes but only 5000 at a time. It would
take forever to get them all that way. Is there a way to download all 100M+ at once? And in text format?[/QUOTE]
To what end? The first few million can be rapidly generated when needed.

mnh001 2015-02-06 09:10

Just an experiment I'd like to try where I need all of them. I know I can generate the first few million easily but
to generate them all myself would also take forever.

Mini-Geek 2015-02-06 14:52

[QUOTE=mnh001;394680]Just an experiment I'd like to try where I need all of them.[/QUOTE]

There are an infinite number of primes (as you may know), and factorDB's are generally only of interest because they are known as factors of numbers. Why are you interested in factorDB's primes in particular? I suspect that you're not really, and are just asking the wrong question, which should be something like "how can I quickly generate a list of x primes with y-z digits?"

By the way, all of their primes are [URL="http://factordb.com/status.php"]126GB[/URL]. Are you really prepared to deal with that much data?

LaurV 2015-02-06 15:18

[QUOTE=Mini-Geek;394701]By the way, all of their primes are [URL="http://factordb.com/status.php"]126GB[/URL]. Are you really prepared to deal with that much data?[/QUOTE]
Huh? Only? :w00t:
You can generate 200GB of primes in a night**, with a pari line like "n=1; while(1,write("primefile.txt", n=nextprime(n+1)))" or something equivalent. In the morning your HDD will be full and the windoze stuck with "low disk space". :razz:

edit **well, joking apart, you would need SSD or a ramdisk program, I just tried and can't write more than 5-6MB/minute, so this would mean about 3-4GB all night, :redface:

danaj 2015-02-06 20:52

I used [FONT="Courier New"]forprime(p=2,10^8,write("primefile.txt",p))[/FONT] to an SSD and only get about 1.2 MB/s (5.8M primes in 42 seconds).

Using [FONT="Courier New"]perl -Mntheory=forprimes -E 'forprimes { say } 10**8' >primefile.txt[/FONT] completes in 2 seconds for 23.3 MB/s.

Using [FONT="Courier New"]primesieve -p1 1e8 >primefile.txt[/FONT] takes 4.7 seconds for 10.4 MB/s.

Using a trivial odd-only SoE in C plus buffered I/O takes only 0.51 seconds for 95.4 MB/s. About 73 MB/2 for 10^9.

The slowest prime generator is the fastest to output, while prime sieve (the fastest sieve by far) isn't that great for printing. Printing takes more time than sieving as long as it's reasonable.

mnh001 2015-02-06 23:41

Actually, my idea was a pre-test for factoring difficult composites.

A number may have a factor in common with another, which may have already been
done and entered in factordb. I did a speed test to try it out with the
numbers 2 to 100M in a text file. I chose a large prime (~30 digit) and
performed a mod test using the numbers in the text file.

It completed the test in 9 min 21 sec. Granted the primes in factordb are
a lot larger than my test numbers but I was just looking for a general
time frame.

If the computer is going to be spending weeks or months trying to factor
a difficult composite then doing this pre-test for 10 min or so could be
well worth it.

And since factordb already has primes over a very large range... who knows.

danaj 2015-02-07 00:01

Just for fun, I ran some more tests, this time to 10^10 (ASCII output is 4,948,214,537 bytes). Clearly this would be faster in binary, but the OP wanted text output, so why not. On a 1 month old Macbook Pro.

[code] 202.91 MB/s primegen + my buffered I/O
171.14 MB/s Perl/ntheory segmented sieve + my buffered I/O
62.24 MB/s Simple SoE + my buffered I/O
23.66 MB/s Perl/ntheory from Perl
18.27 MB/s primegen 0.97 from Atkin
10.94 MB/s primesieve 5.4.1[/code]Bernstein's code is slightly faster than my segmented SoE, and both are much faster than the trivial odd-only SoE. All those with same I/O routines. Standard Perl or Bernstein's putchar stuff is slower, and primesieve using standard printf really cuts down the speed since it is the fastest at actually generating.

Anyway, 200GB of primes shouldn't take too long to generate, albeit they're consecutive primes rather than the ones from factordb. For actual use in a factoring or primality routine, of course you wouldn't want them printed in an ASCII file, so all these numbers are moot (but it was fun to look at).


For pretest, I suggest using mpz_gcd with a primorial and/or a tree sieve. Jens Kruse Andersen has a nice description of the latter. It is quite a bit faster than using mpz_divisible_ui_p once above a certain size. I'm sure others will also have thoughts of time spent in trial division vs. time with Rho, p-1, and ECM -- those are all quite good at finding small factors, and for numbers you're spending months on, you've probably run them already, making small factors that slipped through very unlikely. For checking random large primes, unless there is some reason to suggest those numbers have a good chance of occurring, it seems quite wasteful.

mnh001 2015-02-07 00:08

True it probably is wasteful but since it will only take a very short time I figure it'll be worth it. I've spent
hours running yafu only to have it tell me that yes the number is composite.

Mini-Geek 2015-02-07 00:32

[QUOTE=mnh001;394759]Actually, my idea was a pre-test for factoring difficult composites.

A number may have a factor in common with another, which may have already been
done and entered in factordb. I did a speed test to try it out with the
numbers 2 to 100M in a text file. I chose a large prime (~30 digit) and
performed a mod test using the numbers in the text file.[/QUOTE]
A number is more likely to have smaller primes as factors. Thus, the most profitable primes to try are not "what's in factordb", but "the next smallest prime I haven't tried yet.
This is usually known as "trial factoring" (TF). As an early stage in finding primality or finding factors, it is the best. But you would not use it for large (e.g. >2^32) numbers.
[QUOTE=mnh001;394759]If the computer is going to be spending weeks or months trying to factor
a difficult composite then doing this pre-test for 10 min or so could be
well worth it.

And since factordb already has primes over a very large range... who knows.[/QUOTE]
Factordb has some large-ish primes, which are not useful for TF. It does not necessarily have all small primes.
With the right program, you can generate small primes faster than you can [I]read them from disk[/I], so besides being unwieldy to download or store such large files, it is not useful.

mnh001 2015-02-07 00:43

Yes, I know. But like I said this was just going to be an experiment. If I'm trying to factor HP49(119) (which has 2 small
factors 73 and 16249 plus a C251) I'm pretty sure the smallest factor of the c251 has quite a few digits.

What I'm gathering is that there isn't a way to get the factordb primes in one chunk. No problem. I'll just come up with
another experiment. :smile:


All times are UTC. The time now is 12:20.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.