
 EdH 2021-03-26 23:47

A Tool for Studying Aliquot Sequence Data

I've written a program that uses Jean-Luc Garambois' "regina_file" to display various compilations of data, such as terminating primes, cycles and open-ended merges. The program is amateur C++ that should compile with g++ under any Linux OS. Feel free to modify it in any manner and to offer critique.

Edit: Forgot to add the link to the page with the regina_file (Thanks Henry): [URL]http://www.aliquotes.com/aliquote_base.htm#alibasefonda[/URL]

[B]Edit2:[/B] I had to fix a bug that mishandled out-of-bounds entries. (New source provided at: [I]2021-03-27 at 08:38[/I])

An example of a run with the program named seqinfo (entries are highlighted):
[code]$ ./seqinfo
Data available for sequences 2 through 14000000
Enter sequence (q to quit): [B]13456987[/B]
13456987 terminates with prime 13456987.
List all sequences that terminate with 13456987? (y/n/f): [B]y[/B]
13456987
1 sequence found.
Enter sequence (q to quit): [B]13930332[/B]
13930332 is open ended. It merges with 4788.
Enter sequence (q to quit): [B]284[/B]
284 ends in a cycle. Display cycle? (starts at entry point) (y/n/f): [B]y[/B]
284
220
Display all sequences that end in this cycle? (y/n/f): [B]y[/B]
220 284 562 80089 236883 438311 516551 543309
710637 850967 880847 1361207 1502809 1571111 1634599 1932199
2584945 2961887 3075271 3079297 3196247 3293689 3434375 3583153
4045895 4229785 4388329 5201945 5361407 5681929 7119911 8078687
8172965 8210735 8450785 8717711 8867495 9238099 9246265 9544409
9588729 9660965 9822335 9849451 10080601 10322491 10519607 10628569
10749529 10770475 11071655 11288185 11910911 12137673 12710149 13280375
13347719 13366297 13647655
59 sequences found.
Enter sequence (q to quit): [B]q[/B]
$
[/code]The source code:
[code]
//////////////////////////////////////////////////////////////////////////////////
// This program is designed to display cumulative data for Aliquot sequences. //
// The program relies on a file named "regina_file" which must be retrieved and //
// unpacked from this file: regina_file.tar.lzma. The file can be retrieved //
// from this site: http://www.aliquotes.com/aliquote_base.htm#alibasefonda //
// The accuracy of this program is dependent on the currency of regina_file. //
// //
// To retrieve cycle information, the program has to access factordb.com, so a //
// network connection is also necessary. If no network is available, cycle //
// data will show as blank. wget must also be installed, which is commonly //
// available on most linux platforms. //
// //
// A "results.txt" file can be used to capture the output of various data sets. //
// This file is never deleted or overwritten by the program. It only appends //
// new data to its end. Therefore, it is necessary to manually remove the file //
// if old data is no longer wanted. //
// //
// Most functions are self-explanatory. When a (y/n/f) prompt is offered, the //
// "f" means to send data to the results.txt file as well as the screen. The //
// formatting is slightly different between them. The file output is displayed //
// with one element per line, while the screen output is displayed in columns. //
// The screen display is based on a screen width of 0 (mod 10). Any variation //
// of columns not equal to 0 (mod 10) will not be aligned properly. //
// //
// Compile with "g++ <filename> -o <program name>" //
//////////////////////////////////////////////////////////////////////////////////
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    ifstream in, cyc;
    ofstream out;
    // The large lookup tables are static so that they do not live on the stack.
    static string stype[15000000], extra1[15000000], cycles[100];
    string buff, numt, temp, seqq, yn;
    size_t found, foundc1, foundc2, foundc3, foundc4, founde, foundp;
    int c, i=2, j, k, rcount, seqn;

    in.open("regina_file");
    if(in.is_open()){
        cout << " Reading regina_file . . .\r" << flush;
        // Each line read is taken to describe sequence i, starting at 2.
        // Stop before the tables overflow.
        while(i<15000000 && getline(in,buff)){
            found=buff.find("[");
            if(found!=string::npos){
                // The second field starts 3 characters past the sequence number.
                numt=to_string(i);
                j=numt.length()+3;
                temp.assign(buff.substr(j,1));
                if(temp=="1")
                    stype[i].assign("p");
                else if(temp=="0")
                    stype[i].assign("o");
                else
                    stype[i].assign("c");

                // Skip past three commas and store the fourth field in extra1.
                foundc1=buff.find(",");
                if(foundc1!=string::npos){
                    temp.assign(buff.substr(foundc1+2));
                    foundc2=temp.find(",");
                    if(foundc2!=string::npos){
                        temp.assign(temp.substr(foundc2+2));
                        foundc3=temp.find(",");
                        if(foundc3!=string::npos){
                            temp.assign(temp.substr(foundc3+2));
                            foundc4=temp.find(",");
                            extra1[i].assign(temp.substr(0,foundc4));
                        }
                    }
                }
            }
            i++;
        }
        in.close();
    }
    else{
        cout << "An error occurred with regina_file. Is it present?" << endl;
        return 1;
    }
    cout << "Data available for sequences 2 through " << i-1 << endl;

    do{
        cout << "Enter sequence (q to quit): ";
        // Also quit if standard input is closed, to avoid looping forever.
        if(!getline(cin, seqq) || seqq=="q")
            return 0;
        seqn=atoi(seqq.c_str());
        if(seqn>1 && seqn<i){
            if(stype[seqn]=="p"){
                cout << seqn << " terminates with prime " << extra1[seqn] << "." << endl;
                cout << "List all sequences that terminate with " << extra1[seqn] << "? (y/n/f): ";
                getline(cin, yn);
                if(yn.substr(0,1)=="y" || yn.substr(0,1)=="f"){
                    if(yn.substr(0,1)=="f"){
                        out.open("results.txt", std::ios_base::app);
                        out << "All sequences that terminate with " << extra1[seqn] << ":" << endl;
                    }
                    rcount=0;
                    for(j=2;j<i;j++){
                        if(extra1[j]==extra1[seqn]){
                            if(out.is_open())
                                out << j << endl;
                            // Pad each entry to a 10-character column on screen.
                            temp.assign(to_string(j));
                            temp.append("          ");
                            temp=temp.substr(0,10);
                            cout << temp;
                            rcount++;
                        }
                    }
                    if(out.is_open())
                        out.close();
                    if(rcount>1)
                        cout << "\n" << rcount << " sequences found." << endl;
                    else
                        cout << "\n" << rcount << " sequence found." << endl;
                }
            }
            else if(stype[seqn]=="o"){
                cout << seqn << " is open ended. ";
                // When extra1 names the sequence itself, look for others that merge into it.
                if(to_string(seqn)==extra1[seqn]){
                    cout << "List any sequences that merge with " << seqn << "? (y/n/f): ";
                    getline(cin, yn);
                    if(yn.substr(0,1)=="y" || yn.substr(0,1)=="f"){
                        if(yn.substr(0,1)=="f"){
                            out.open("results.txt", std::ios_base::app);
                            out << "All sequences that merge with " << extra1[seqn] << ":" << endl;
                        }
                        rcount=0;
                        for(j=seqn+1;j<i;j++){
                            if(extra1[j]==extra1[seqn]){
                                if(out.is_open())
                                    out << j << endl;
                                // Pad each entry to a 10-character column on screen.
                                temp.assign(to_string(j));
                                temp.append("          ");
                                temp=temp.substr(0,10);
                                cout << temp;
                                rcount++;
                            }
                        }
                        if(out.is_open())
                            out.close();
                        if(rcount>1)
                            cout << "\n" << rcount << " sequences found." << endl;
                        else if(rcount==1)
                            cout << "\n" << rcount << " sequence found." << endl;
                        else
                            cout << "No merges found." << endl;
                    }
                }
                else
                    cout << "It merges with " << extra1[seqn] << "." << endl;
            }
            else if(stype[seqn]=="c"){
                c=0;
                cout << seqn << " ends in a cycle. Display cycle? (starts at entry point) (y/n/f): ";
                getline(cin, yn);
                if(yn.substr(0,1)=="y" || yn.substr(0,1)=="f"){
                    if(yn.substr(0,1)=="f"){
                        out.open("results.txt", std::ios_base::app);
                        out << seqn << " ends with the following cycle:" << endl;
                    }
                    // Fetch the listing for the cycle entry point from factordb.com.
                    temp.assign("wget \"http://www.factordb.com/elf.php?seq=");
                    temp.append(extra1[seqn]);
                    temp.append("&type=1\" -q -O cycle.tmp");
                    system(temp.c_str());
                    cyc.open("cycle.tmp");
                    if(cyc.is_open()){
                        // Extract the term between the "." and the "=" on each matching line.
                        // Stop before the cycles table (100 entries) overflows.
                        while(c<100 && getline(cyc, buff)){
                            foundp=buff.find(".");
                            founde=buff.find("=");
                            if(founde!=string::npos && foundp!=string::npos){
                                cycles[c].assign(buff.substr(foundp+4,founde-foundp-5));
                                c++;
                            }
                        }
                        cyc.close();
                        c-=2; // drop the last two extracted entries
                        for(j=0;j<c;j++){
                            if(out.is_open())
                                out << cycles[j] << endl;
                            cout << cycles[j] << endl;
                        }
                    }
                    else
                        cout << "An error was encountered trying to read cycle.tmp!" << endl;
                    // Close results.txt even if cycle.tmp could not be read.
                    if(out.is_open())
                        out.close();
                }
                cout << "Display all sequences that end in this cycle? (y/n/f): ";
                getline(cin, yn);
                if(yn.substr(0,1)=="y" || yn.substr(0,1)=="f"){
                    if(yn.substr(0,1)=="f"){
                        out.open("results.txt", std::ios_base::app);
                        out << "All sequences that end in the same cycle as " << extra1[seqn] << ":" << endl;
                    }
                    rcount=0;
                    // Fetch the cycle now if it was not retrieved for display above.
                    if(c==0){
                        temp.assign("wget \"http://www.factordb.com/elf.php?seq=");
                        temp.append(extra1[seqn]);
                        temp.append("&type=1\" -q -O cycle.tmp");
                        system(temp.c_str());
                        cyc.open("cycle.tmp");
                        if(cyc.is_open()){
                            // Stop before the cycles table (100 entries) overflows.
                            while(c<100 && getline(cyc, buff)){
                                foundp=buff.find(".");
                                founde=buff.find("=");
                                if(founde!=string::npos && foundp!=string::npos){
                                    cycles[c].assign(buff.substr(foundp+4,founde-foundp-5));
                                    c++;
                                }
                            }
                            cyc.close();
                            c-=2; // drop the last two extracted entries
                        }
                        else
                            cout << "An error was encountered trying to read cycle.tmp!" << endl;
                    }
                    // A sequence belongs to the cycle if its extra1 field matches any cycle member.
                    for(j=2;j<i;j++)
                        for(k=0;k<c;k++)
                            if(cycles[k]==extra1[j]){
                                if(out.is_open())
                                    out << j << endl;
                                // Pad each entry to a 10-character column on screen.
                                temp.assign(to_string(j));
                                temp.append("          ");
                                temp=temp.substr(0,10);
                                cout << temp;
                                rcount++;
                            }
                    if(out.is_open())
                        out.close();
                    if(rcount!=1)
                        cout << "\n" << rcount << " sequences found." << endl;
                    else
                        cout << "\n" << rcount << " sequence found." << endl;
                }
            }
            else
                cout << "Sequence appears to be incomplete!" << endl;
        }
        else
            cout << "Sequence is outside current bounds of 2 through " << i-1 << endl;
    }while(seqq.substr(0,1)!="q");

    return 0;
}
[/code]

 henryzz 2021-03-27 10:08

The regina_file mentioned above is at [url]http://www.aliquotes.com/aliquote_base.htm[/url]

 EdH 2021-03-27 12:20

[QUOTE=henryzz;574607]The regina_file mentioned above is at [URL]http://www.aliquotes.com/aliquote_base.htm[/URL][/QUOTE]
Thanks! I forgot to add the link to the page. The link I added to the first post will go to that section of the page.

 garambois 2021-03-27 17:58

Thank you very much Edwin !
Superb work !
It is for this purpose that I created the "regina_file" file a few years ago.
I wanted to be able to do that kind of research and statistics.

[U]I allow myself to provide some details here on this "regina_file" file.[/U]

The calculations are still going on and in a few months I will release the "regina_file" version up to 15,000,000.
I want to clarify that this file contains a lot more information about the aliquot sequences, but I dare not try here to do the English translation of the page which explains the meaning of all the variables contained in the file for each aliquot sequence.
For example, we can know for each sequence the number of parity changes, the number of peaks in the graph and other things that are useful to me.
And if your automatic translator tells you nonsense, ask me here for further explanation, I will try to explain things to you differently to make it easier to understand.
When using this file, always keep in mind that the computation of each sequence was interrupted as soon as the size of the terms first reached 10^50.
Some sequences considered as Open End in this file are therefore finished or merged in FactorDB.
Three complementary files accompany regina_file:
"regina_cycles"
"regina_prems"
"regina_opens"
Finally, on [URL="http://www.aliquotes.com/aliquote_base_fond_applic.htm"]this other page[/URL], but sorry also written in French, I show what type of research we can do thanks to these 4 files (regina_file, regina_prems, regina_opens, regina_cycles).
Some records presented are quite funny.
As another example, we can look for sequences whose graph is in the shape of a "bell"; these are sequences with a single maximum.
And other funny things ...

 EdH 2021-03-27 21:20

I'm glad you approve, Jean-Luc! I have already been studying the text you provide for elements of regina_file and am also already adding more features to the program. I'll be adding posts here with newer features as I add them. As can be seen in the source, I'm already set to accept a new file up to (almost) 15M. I will adjust that greater as the file is enlarged, as well.

I'm also looking over other portions of your pages. I forgot how much you had documented things within my interests. One of the things this program will soon be able to do is to provide a count of many of the things you are tracking.

I too find interesting things within this research, such as 28 being untouchable from any number but itself, while the other perfect numbers within the current set have many sequences that reach them:
[code]
Enter sequence (q to quit): [B]6[/B]
6 ends in a cycle. Display cycle? (starts at entry point) (y/n/f): [B]y[/B]
6
Display all sequences that end in this cycle? (y/n/f): [B]y[/B]
. . .
63275 sequences found.
Enter sequence (q to quit): [B]28[/B]
28 ends in a cycle. Display cycle? (starts at entry point) (y/n/f): [B]y[/B]
28
Display all sequences that end in this cycle? (y/n/f): [B]y[/B]
28
1 sequence found.
Enter sequence (q to quit): [B]496[/B]
496 ends in a cycle. Display cycle? (starts at entry point) (y/n/f): [B]y[/B]
496
Display all sequences that end in this cycle? (y/n/f): [B]y[/B]
496 608 650 652 790 1294 1574 1778
2162 2582 3142 5158 368449 1492799 1535075 1767455
1842215 2256401 2974751 3157729 3837505 3873551 4018945 4170127
4605213 4669921 5076873 5251285 5616985 6977649 7349365 7463965
7505901 7589845 7601365 7675345 8109041 8697385 8837245 8924241
11035163 12856335 13157075 13384167 13631207
45 sequences found.
Enter sequence (q to quit): [B]8128 [/B]
8128 ends in a cycle. Display cycle? (starts at entry point) (y/n/f): [B]y[/B]
8128
Display all sequences that end in this cycle? (y/n/f): [B]y[/B]
. . .
15492 sequences found.
[/code]

 henryzz 2021-03-27 21:33

 garambois 2021-03-28 08:34

Yes, Andrew R. Booker has explored the issue of "finite connected components of the aliquot graph".
His work is wonderful, the concept is very beautiful !

The study of the infinite graph of aliquot sequences has given rise to many questions and conjectures, in particular on the presence of all possible graphs in this graph.
You can get an idea of it by looking [URL="http://www.aliquotes.com/graphinfinisuali.htm"]here[/URL] and especially [URL="http://www.aliquotes.com/existence_graphes.pdf"]here[/URL] (We understand better when we see diagrams).
But once again, sorry, it's in French !

Everything becomes even more complicated if we decide to take into account the length of the branches of the graph.
It is this new research which made me, for example, ask you [URL="https://www.mersenneforum.org/showthread.php?t=25719"]this annoying question[/URL] on this forum, which still remains unanswered today.
But my computer is doing calculations right now ...

 garambois 2021-03-28 08:59

[QUOTE=EdH;574633]I'm glad you approve, Jean-Luc! I have already been studying the text you provide for elements of regina_file and am also already adding more features to the program. I'll be adding posts here with newer features as I add them. As can be seen in the source, I'm already set to accept a new file up to (almost) 15M. I will adjust that greater as the file is enlarged, as well.

I'm also looking over other portions of your pages. I forgot how much you had documented things within my interests. One of the things this program will soon be able to do is to provide a count of many of the things you are tracking.

I too find interesting things within this research, such as 28 being untouchable from any number but itself, while the other perfect numbers within the current set have many sequences that reach them:
[/QUOTE]

Thank you very much Edwin, it's great that you have decided to add in your program the possibility of exploiting more data from the regina_file file.
Personally, I will use your program quite often, that's for sure ...

Ideally, my program should run not by calculating the sequences on the machine, but by scanning the sequences on FactorDB.
Thus, the regina_file file and the three other complementary files would be much more complete.
But I think that we should only scan the sequences up to 10 M.
Because I am not sure that beyond that, someone has already done systematic calculations for all the sequences.
Indeed, if I enter 15,000,018 in FactorDB, the maximum term size is 50 digits.

 garambois 2021-03-28 09:21

I also want to take the opportunity here to ask a substantive question about the regina_file.
Each line in this file looks like this :

[n, a, b, c, d, e, f, g, h, i, j, k, l, m]

n is the starting number of the aliquot sequence.
k is the "arithmetic mean" (*) over all the terms of the quotients s[SUB]u+1[/SUB](n) / s[SUB]u[/SUB](n) (next term divided by the previous one).
However, it would seem that for mathematicians, it would be much easier to work with the "geometric mean" (*) of these quotients.
Now, the "geometric mean" of the quotients of the successive terms does not appear in the regina_file file.

[B]Would it be interesting to add a variable "o" which is the "geometric mean" of the quotients ?[/B]

If anyone has an opinion on this matter, it interests me the most !
In a few days or weeks of calculation, I could maybe add this geometric mean, but I'm not so sure, I still have to check.
And I will only do this job if I am told here that it would be useful.

(*) I hope the machine translation correctly translated "arithmetic mean" and "geometric mean"!

 Happy5214 2021-03-28 11:40

[QUOTE=garambois;574664]k is the "arithmetic mean" (*) over all the terms of the quotients s[SUB]u+1[/SUB](n) / s[SUB]u[/SUB](n) (next term divided by the previous one).
However, it would seem that for mathematicians, it would be much easier to work with the "geometric mean" (*) of these quotients.
Now, the "geometric mean" of the quotients of the successive terms does not appear in the regina_file file.

[B]Would it be interesting to add a variable "o" which is the "geometric mean" of the quotients ?[/B]

If anyone has an opinion on this matter, it interests me the most !
In a few days or weeks of calculation, I could maybe add this geometric mean, but I'm not so sure, I still have to check.
And I will only do this job if I am told here that it would be useful.

(*) I hope the machine translation correctly translated "arithmetic mean" and "geometric mean"![/QUOTE]

First of all, the translation came through correctly. Geometric means work better than arithmetic means for growth rates, which seem to be what the quotients are really serving as, so I think they're worth adding.

 garambois 2021-04-01 17:12

[QUOTE=Happy5214;574666]First of all, the translation came through correctly. Geometric means work better than arithmetic means for growth rates, which seem to be what the quotients are really serving as, so I think they're worth adding.[/QUOTE]

OK, I'll try to add that over the next few weeks.
I just realized that the geometric mean of the quotients of the successive terms can be written in a fairly simple way.
Indeed, if n is the starting number of the sequence, and if the calculation of the sequence was interrupted at index i, with the term s[SUB]i[/SUB](n), then the geometric mean x of the quotients of the successive terms is calculated as follows :
x = [s[SUB]1[/SUB](n)/n * s[SUB]2[/SUB](n)/s[SUB]1[/SUB](n) * s[SUB]3[/SUB](n)/s[SUB]2[/SUB](n) * ........ * s[SUB]i-1[/SUB](n)/s[SUB]i-2[/SUB](n) * s[SUB]i[/SUB](n)/s[SUB]i-1[/SUB](n)][SUP](1/i)[/SUP] = [s[SUB]i[/SUB](n)/n][SUP](1/i)[/SUP]
Only the starting number of the aliquot sequence, the last term and the number of terms are therefore involved in the calculation of this mean.

- For Open End sequences, no problem.

- For sequences that end with 1, it seems more interesting to me to consider the prime number as the last term of the sequence, rather than 1.

- For the sequences which end with a cycle, or which start with a number which belongs to a cycle, I don't know how to do it.
It seems reasonable to me to have a geometric mean of 1 for perfect numbers.
For amicable numbers, it's much more complicated.
If n = 220, the sequence is: 220 -> 284 -> 220 -> 284 ...
So I can take x = 284/220.
But I can also take x = (220/220) ^ (1/2) = 1.
For cycles longer than 2, it seems reasonable to go around the entire cycle before calculating the mean.
But if we proceed like this, if we take a number n which belongs to the cycle, the average will always be equal to 1.

I have to think about this cycle question, unless somebody here has a suggestion for me ?
