#265

Mar 2003
New Zealand
I have updated the remove-tpfactors awk script. It can now accept multiple sieve files as input, so it can be used to merge single-n NewPGen files into one multi-n file for tpsieve:
Code:
remove-tpfactors [FACTORFILE ...] INFILE [INFILE] > OUTFILE

Also, tpsieve doesn't double-check factors as they are found, but you can use the check-factors program to do that.
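For anyone without the awk script handy, the same merge can be sketched by hand in a couple of shell lines: keep the first file's header and append the candidate lines from every file. The file names and header contents below are invented for illustration; this is not geoff's script.

```shell
# Toy single-n NewPGen files (contents made up for this sketch)
cat > n504623.txt <<'EOF'
10000000:P:1:2:504623
998451 504623
999123 504623
EOF
cat > n504624.txt <<'EOF'
10000000:P:1:2:504624
997000 504624
EOF

# Keep the first file's header, then append every file's candidate lines
head -n 1 n504623.txt > merged.txt
for f in n504623.txt n504624.txt; do
    tail -n +2 "$f" >> merged.txt   # skip each file's own header line
done
cat merged.txt
```

merged.txt then holds one header followed by all three candidate lines, which is the shape the later posts in this thread describe feeding to tpsieve.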
#266

Mar 2005
Internet; Ukraine, Kiev
I use some shell, Perl and PARI magic:
sort_factors.sh
Code:
#!/bin/bash
set -e
sort --ignore-leading-blanks --buffer-size=5% -n -t ' ' -k 1,1 "$@" | uniq

./sort_factors.sh tpfactors1.txt tpfactors2.txt ... > factors.txt

This script expects tpsieve-format factor files as input. It sorts all factor files by p (the first field) and removes duplicate factors.

sort_numbers.sh
Code:
#!/bin/bash
set -e
sort --ignore-leading-blanks --buffer-size=5% -n -t ' ' -k 2,2 -k 1,1 "$@"

./sort_numbers.sh num1.txt num2.txt > num.txt

This script expects newpgen-format number files as input. It sorts all files by n, then by k. It leaves multiple header lines in place if they differ between the source files; if that happens, just use tail -n +N to remove the extras.

get_p_range.pl
Code:
#!/usr/bin/perl
use warnings;
use strict;

our $max_p_delta = 15000000;
our $prev_p = 0;
our $in_gap = 1;

while (<>)
{
    $_ =~ tr/\r//d;
    chomp($_);
    if (m/^(\d+) /)
    {
        if ($prev_p)
        {
            if ($in_gap)
            {
                printf("range start: %17s\n", $prev_p);
                $in_gap = 0;
            }
            if (($1 - $prev_p) >= $max_p_delta)
            {
                my $delta = $1 - $prev_p;
                printf("range end: %17s\n", $prev_p);
                printf("delta: %17s\n\n", $delta);
                $in_gap = 1;
            }
        }
        $prev_p = $1;
    }
    else
    {
        print STDERR "What's \"$_\"";
    }
}
printf("range end: %17s\n", $prev_p);
./get_p_range.pl tpfactors.txt

This script expects sorted tpsieve-format factor files as input. It prints the ranges of p found in the file. If the gap between consecutive p's is greater than $max_p_delta, the script treats it as a gap between ranges, so you should tweak that variable (because as p gets larger, factors are found less frequently). This script is useful for making sure we haven't missed anything.

check_divisors.pl
Code:
#!/usr/bin/perl
use warnings;
use strict;

while (<>)
{
    $_ =~ tr/\r//d;
    chomp($_);
    if (m/^(\d+) \| (\d+)\*2\^(\d+)(-1|\+1)$/)
    {
        print("if((lift($2*Mod(2,$1)^$3$4)!=0),print(\"$_ bad\"))\n");
    }
    else
    {
        print STDERR "What's \"$_\"";
    }
}
./check_divisors.pl tpfactors.txt | gp

This script expects tpsieve-format factor files as input. It outputs a PARI script to check the factors.

gen_delete_file.sh
Code:
#!/bin/bash
set -e
cut -d '|' -f 2 "$@" | tr -d '\t\r ' | tr '*^+-' ' ' | cut -d ' ' -f 1,3 | sort --ignore-leading-blanks --buffer-size=5% -n -t ' ' -k 2,2 -k 1,1 | uniq

./gen_delete_file.sh tpfactors.txt > delete.txt

This script expects tpsieve-format factor files as input. It outputs the k,n pairs of the factored numbers, sorted by n, then by k (like the number files).

remove_factors.pl
Code:
#!/usr/bin/perl
use warnings;
use strict;

die("usage: $0 numbers.txt factors.txt") if $#ARGV != 1;

open(NUMBERS, '<', $ARGV[0]) or die('can\'t open ' . $ARGV[0] . ': ' . $!);
open(FACTORS, '<', $ARGV[1]) or die('can\'t open ' . $ARGV[1] . ': ' . $!);

# leave header intact
print scalar <NUMBERS>;

my $factor_line = <FACTORS>;
my $number_line;
while ($number_line = <NUMBERS>)
{
    # if there are no more factors, just print the numbers
    if (!defined($factor_line))
    {
        print $number_line;
        print <NUMBERS>;
        last;
    }
    my ($number_k, $number_n) = split(' ', $number_line);
    my ($factor_k, $factor_n) = split(' ', $factor_line);
    if (($number_n < $factor_n) ||
        (($number_n == $factor_n) && ($number_k < $factor_k)))
    {
        print $number_line;
    }
    elsif (($number_n == $factor_n) && ($number_k == $factor_k))
    {
        $factor_line = <FACTORS>;
    }
    else
    {
        # the current factor sorts before the current number,
        # so its k,n pair is missing from the numbers file
        die("factor \"$factor_line\" not found in numbers file");
    }
}
close(NUMBERS);
close(FACTORS);
./remove_factors.pl numbers.txt delete.txt > new-numbers.txt

This script expects as input a sorted newpgen-format number file and a delete.txt (from gen_delete_file.sh). It outputs a new number file with the factored numbers removed. It works in one pass and uses a small, constant amount of memory, so it is fast even with huge files.

In all the shell scripts above you can drop the "--buffer-size=5%" argument to sort if you have plenty of free RAM (you may get some speedup in that case).

Last fiddled with by gribozavr on 2009-08-18 at 23:47 Reason: last-minute fix
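For illustration, the effect of the delete-file step can be emulated on toy data with awk. This is not gribozavr's one-pass Perl (it loads the whole delete list into memory), and all file contents below are invented:

```shell
# Toy sorted number file: one header line, then "k n" candidate lines
cat > numbers.txt <<'EOF'
10000000:P:1:2:100
5 100
7 100
9 101
EOF
# Toy delete file: k,n pairs that have been factored
cat > delete.txt <<'EOF'
7 100
EOF

# First pass (NR==FNR) loads delete.txt into a hash; second pass prints
# the header and every candidate whose "k n" pair is not in the hash.
awk 'NR == FNR { del[$1 " " $2] = 1; next }
     FNR == 1  { print; next }
     !(($1 " " $2) in del)' delete.txt numbers.txt > new-numbers.txt
cat new-numbers.txt
```

new-numbers.txt keeps the header plus "5 100" and "9 101"; the factored pair "7 100" is gone, which is exactly what remove_factors.pl produces on the same input.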
#268

Jun 2003
Quote:

* There will be some loss of efficiency.
** It is still advisable to split the file -- faster load-up times.

Last fiddled with by axn on 2009-08-19 at 03:11
#270

I quite division it
"Chris"
Feb 2005
England
Is it expected behaviour for the tpfactors.txt file to contain more than one entry for some k/n pairs?
e.g.
Code:
10000554877 | 998451*2^504623-1
23822842763 | 998451*2^504623+1
92136264541 | 998451*2^504623+1
(Windows, TPS 0.2.4 110809)

edit: Maybe my fault. See below.

Last fiddled with by Flatlander on 2009-08-19 at 18:34
#271

Jun 2003
Were these from a single sieve session or from multiple sieve sessions? We know that tpsieve doesn't modify the sieve file for found factors. I _believe_ it does remove them from the in-memory bitmap, in which case, for the same sieve session, this shouldn't happen. Geoff/Ken_g6?
Last fiddled with by axn on 2009-08-19 at 18:15 |
#272

"Chris"
Feb 2005
England
A single sieve: "tpsieve -i allc.txt -p 10e9 -P 100e9"
edit: There are other double entries; not sure about more triple entries.
edit: Yes, there are other triples.
edit: Maybe I messed up with the sieving? I'd better run the range again. Sorry.
edit: I don't see how I messed up. I stitched the files together from NewPGen, removed all the headers except the first, then took out the blank lines. That's how I ended up with allc.txt.

Last fiddled with by Flatlander on 2009-08-19 at 18:40
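A stitched file like that can be sanity-checked before sieving: it should contain exactly one header line and no blank lines. A sketch, assuming (as in the invented toy data here) that NewPGen header lines contain ':' while candidate lines are plain "k n" pairs:

```shell
# Toy stitched file with two deliberate mistakes:
# a blank line and a leftover second header
cat > allc.txt <<'EOF'
10000000:P:1:2:504623
998451 504623

997000 504624
10000000:P:1:2:504624
EOF

# Count leftover header-looking lines after line 1, and blank lines
stray_headers=$(tail -n +2 allc.txt | grep -c ':')
blank_lines=$(grep -c '^$' allc.txt)
echo "stray headers: $stray_headers, blank lines: $blank_lines"
```

Both counts should be zero for a correctly merged file; here each is 1, flagging exactly the two mistakes planted in the toy data.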
#273

Jan 2005
Caught in a sieve
Quote:

Edit: Though I should mention, this could lead to different sieve-file results depending on whether you interrupt the sieve, or how you break it up. Maybe that's why Geoff didn't do it before?

Last fiddled with by Ken_g6 on 2009-08-19 at 19:29
#274

Jun 2003
Quote:

BTW, if the speed loss is less than 0.1% @ p = 10G, then I think it is worth it. Otherwise, not.

Last fiddled with by axn on 2009-08-19 at 19:23
#275

Mar 2005
Internet; Ukraine, Kiev
(Just for statistics) We even have a number with 7 known factors:
Code:
23884746253 | 6257667*2^483102+1
41690700607 | 6257667*2^483102+1
109867225619 | 6257667*2^483102+1
292122145397 | 6257667*2^483102+1
381821327087 | 6257667*2^483102+1
540756549527 | 6257667*2^483102+1
1505506019099 | 6257667*2^483102+1

Last fiddled with by gribozavr on 2009-08-20 at 00:25