mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   PARI/GP (https://www.mersenneforum.org/forumdisplay.php?f=155)
-   -   IntegerExponent Equivalent in Pari-GP (https://www.mersenneforum.org/showthread.php?t=21198)

science_man_88 2016-05-26 02:06

[QUOTE=CRGreathouse;434852]Well, the whole purpose of the thread was to share the kinds of tricks that let people like me do things like that! :smile:

Let me give a silly example. Suppose you wanted to build a list of all the numbers up to a billion that (1) were one more than a square and (2) were prime. You could write tests for each of these conditions, and then test them over each of those numbers:
[code]isOneMoreThanASquare(n)=issquare(n-1)
addhelp(isOneMoreThanASquare, "isOneMoreThanASquare(n): Returns 1 if n-1 is a square and 0 otherwise.")

\\isprime already exists

v=List();
for(n=1, 10^9, if(isOneMoreThanASquare(n) && isprime(n), listput(v, n)));
v=Vec(v); \\ converts from a list to a vector
#v \\ the number of terms found[/code]

But since you can iterate over both types of number, you might consider
[code]forprime(n=2, 10^9, if(isOneMoreThanASquare(n), listput(v, n)))[/code]
or
[code]for(k=1,sqrtint(10^9-1), n=k^2+1; if(isprime(n), listput(v, n)));[/code]

Stop to think about this for a moment -- don't try these yet. Which would work better, and why? Can you find a general rule about finding sequences with two particular properties (when you can loop over and test for membership of both)?

If you can, compare the expected time for these two approaches. Once you do, as a follow-up question, what can you do to make this code more efficient, by using the parity of primes greater than 2? How much of a difference does this make compared to choosing the better of the two methods?[/QUOTE][CODE]y=2;for(x=2,1000000000000,if(x%6==1 || x%6==5,print1(x","));x+=y;y+=2)[/CODE]

is my method to print all candidates under 1 trillion that are greater than 3 within 26 seconds for me. edit: of course adding in the isprime check takes the printing down below 4 seconds if I'm correct. okay now i tested the codes and see your versions blow mine out of the water. mine just equates to a forstep with a variable step. the rule would likely be to check the one that eliminates the most checks of the second.

CRGreathouse 2016-05-26 13:27

[QUOTE=science_man_88;434856][CODE]y=2;for(x=2,1000000000000,if(x%6==1 || x%6==5,print1(x","));x+=y;y+=2)[/CODE]

is my method to print all candidates under 1 trillion that are greater than 3 within 26 seconds for me. edit: of course adding in the isprime check takes the printing down below 4 seconds if I'm correct. okay now i tested the codes and see your versions blow mine out of the water. mine just equates to a forstep with a variable step. the rule would likely be to check the one that eliminates the most checks of the second.[/QUOTE]

The first problem is that you're modifying a loop variable -- this is inefficient with for loops and causes an error with most other loops. If you want to modify them directly (this is a good time for that) then while is a better choice.

The second problem is that you're printing. I/O takes a really long time and is almost never a good idea. It's much better to store the results you find in a variable, and then output them in whatever form is convenient (writing to a file, for example, if there are many).

The third problem is that you're doing a division at each step, even though the result is predictable. Divisions take a lot longer than additions and even multiplications.

The fourth problem... well, maybe not a problem, but a suggestion: rather than giving a huge number like 10^12, write a function with the huge number as a parameter (I usually call it "lim"). Then you can call the function with 10^12 or whatever you like, rather than modifying a huge single-line command.

[code]makeNumbers(lim)=
{
my(v=List(), y=3, x=2); \\ Always use "my" for variables for readability and speed
while(x <= lim,
if (gcd(x,6)==1, listput(v, x)); \\ test if x is 1 or 5 mod 6
x += y;
y += 2;
);
Vec(v);
}
addhelp(makeNumbers, "makeNumbers(lim): Return a vector with all the numbers from 1 to lim with a certain property known only to sm88.");[/code]

or better yet

[code]makeNumbers(lim)=
{
my(v=List(), y=4, x=5); \\ Always use "my" for variables for readability and speed
while(x <= lim,
listput(v, x);
x += y += 8; \\ rather than testing, find out what you will need to add and use only that
);
Vec(v);
}
addhelp(makeNumbers, "makeNumbers(lim): Return a vector with all the numbers from 1 to lim with a certain property known only to sm88.");[/code]

or even

[code]makeNumbers(lim)=vector(sqrtint((lim-1)\4), x, 4*x^2+1);
addhelp(makeNumbers, "makeNumbers(lim): Return a vector with all the numbers from 1 to lim with a certain property known only to sm88.");[/code]

or its relative

[code]makeNumbers(lim)=
{
my(s=1, t=-4);
vector(sqrtint((lim-1)\4),n, s += t += 8);
}
addhelp(makeNumbers, "makeNumbers(lim): Return a vector with all the numbers from 1 to lim with a certain property known only to sm88.");[/code]

science_man_88 2016-05-26 14:28

[QUOTE=CRGreathouse;434870]The first problem is that you're modifying a loop variable -- this is inefficient with for loops and causes an error with most other loops. If you want to modify them directly (this is a good time for that) then while is a better choice.

The second problem is that you're printing. I/O takes a really long time and is almost never a good idea. It's much better to store the results you find in a variable, and then output them in whatever form is convenient (writing to a file, for example, if there are many).

The third problem is that you're doing a division at each step, even though the result is predictable. Divisions take a lot longer than additions and even multiplications.

The fourth problem... well, maybe not a problem, but a suggestion: rather than giving a huge number like 10^12, write a function with the huge number as a parameter (I usually call it "lim"). Then you can call the function with 10^12 or whatever you like, rather than modifying a huge single-line command.

[code]makeNumbers(lim)=
{
my(v=List(), y=3, x=2); \\ Always use "my" for variables for readability and speed
while(x <= lim,
if (gcd(x,6)==1, listput(v, x)); \\ test if x is 1 or 5 mod 6
x += y;
y += 2;
);
Vec(v);
}
addhelp(makeNumbers, "makeNumbers(lim): Return a vector with all the numbers from 1 to lim with a certain property known only to sm88.");[/code]

or better yet

[code]makeNumbers(lim)=
{
my(v=List(), y=4, x=5); \\ Always use "my" for variables for readability and speed
while(x <= lim,
listput(v, x);
x += y += 8; \\ rather than testing, find out what you will need to add and use only that
);
Vec(v);
}
addhelp(makeNumbers, "makeNumbers(lim): Return a vector with all the numbers from 1 to lim with a certain property known only to sm88.");[/code]

or even

[code]makeNumbers(lim)=vector(sqrtint((lim-1)\4), x, 4*x^2+1);
addhelp(makeNumbers, "makeNumbers(lim): Return a vector with all the numbers from 1 to lim with a certain property known only to sm88.");[/code]

or its relative

[code]makeNumbers(lim)=
{
my(s=1, t=-4);
vector(sqrtint((lim-1)\4),n, s += t += 8);
}
addhelp(makeNumbers, "makeNumbers(lim): Return a vector with all the numbers from 1 to lim with a certain property known only to sm88.");[/code][/QUOTE]

I think the only reason yours are faster might be that lists are faster if I turn my original code with the prime check while loop I get at best equal with what I had if I do concat to a vector I get the time exploding from under 4 seconds to over 49. all I was using in the code I posted was literally that the difference to the next square is the next odd number. other than other people's codings of things I've almost never found my to be anything but a prettification waste of time ( I don't think I've made one code I can remember with it where it did anything except slow down the function) unless using parallel commands maybe. the division argument goes almost to null if you consider fast algorithms along with break even points versus gcd I bet.

CRGreathouse 2016-05-28 16:43

[QUOTE=danaj;434707]Now the question is what to do about it.[/QUOTE]

[QUOTE=CRGreathouse;434708]I submitted a wishlist bug report to the PARI developers. If we're lucky the GMP versions, at least, should get better conversion from (and to?) strings.[/QUOTE]

They fixed it!

[url]http://pari.math.u-bordeaux.fr/cgi-bin/gitweb.cgi?p=pari.git;a=commitdiff;h=2a4752f980620ab7633dc2a3ce15c65f2647d765[/url]

[url]http://pari.math.u-bordeaux.fr/cgi-bin/gitweb.cgi?p=pari.git;a=commitdiff;h=7be4fb329ed43ca172cd3b9d1828eb42dde54c71[/url]

On my machine this is a > 500x speedup at 10M digits.

CRGreathouse 2016-05-28 16:48

[QUOTE=science_man_88;434873]I think the only reason yours are faster might be that lists are faster if I turn my original code with the prime check while loop I get at best equal with what I had if I do concat to a vector I get the time exploding from under 4 seconds to over 49.[/QUOTE]

Yes, building a vector by concatenation takes quadratic time. In practice it's even worse because of cache effects (I think). If you're building up a collection of numbers one at a time this is exactly what List was designed to do.

[QUOTE=science_man_88;434873]the division argument goes almost to null if you consider fast algorithms along with break even points versus gcd I bet.[/QUOTE]

GCD is slower than division, but if you can get away with neither (as my later versions did) it's even better.

[QUOTE=science_man_88;434873]other than other people's codings of things I've almost never found my to be anything but a prettification waste of time ( I don't think I've made one code I can remember with it where it did anything except slow down the function)[/QUOTE]

my() can make code faster but the main reason to use it is correctness, not speed. Clobbering variables is not a good idea.

a1call 2016-05-28 17:37

[QUOTE=CRGreathouse;435018]

my() can make code faster but the main reason to use it is correctness, not speed. Clobbering variables is not a good idea.[/QUOTE]

It is risky to write a function using an un-localized variable. Say you reference the function in 6 months and are not sure if you have used a variable name in the function or not. A similar issue exists in my day-job as a Mechanical Designer. In parametric CAD design in a large assembly with many parts and sometimes in external references which are suppressed or not loaded using a name more than once can ruin months of complicated work. I have a personal system that I generate a unique ID for any part, file or drawing that is never repeated. They have the format [B]ABC-100-A[/B] that I keep a list of and increment with every use.
The system also has the added benefit that I can instantly locate any file or drawing by searching for its unique ID, which is virtually never repeated anywhere even in WWW.
It would be a useful tool if this could be automated. In Programming a function could parse a script and suffix a unique or large random text to all the variable names for example.

science_man_88 2016-05-28 17:57

[QUOTE=a1call;435019]It is risky to write a function using an un-localized variable. Say you reference the function in 6 months and are not sure if you have used a variable name in the function or not. A similar issue exists in my day-job as a Mechanical Designer. In parametric CAD design in a large assembly with many parts and sometimes in external references which are suppressed or not loaded using a name more than once can ruin months of complicated work. I have a personal system that I generate a unique ID for any part, file or drawing that is never repeated. They have the format [B]ABC-100-A[/B] that I keep a list of and increment with every use.
The system also has the added benefit that I can instantly locate any file or drawing by searching for its unique ID, which is virtually never repeated anywhere even in WWW.
It would be a useful tool if this could be automated. In Programming a function could parse a script and suffix a unique or large random text to all the variable names for example.[/QUOTE]

my() is lexically scoped local() is dynamically scoped (not that I completely remember what those mean).I'm just saying that there are different ones for different uses.

CRGreathouse 2016-05-29 02:15

There are three types of scope for variables in GP. Let me explain them like this: you are in a function outside() which will call a function inside().

First is global scoping. A variable which is globally scoped -- this happens by default, though there is a global() function -- allows both outside() and inside() to modify it. This way if you do
[code]x = aFunctionWhichRunsForWeeks();
outside()[/code]
you might find that x has been overwritten by either outside() or inside(), causing you to lose weeks worth of effort.

Second is lexical scoping. This is the one that the PARI developers recommend, and the one which causes gp to do the least work. A variable which is lexically scoped to outside(), meaning that it appears in a my() list in outside(), is private to that function. Neither inside() nor anything outside of outside() can modify it, so it's private.

Finally there is dynamic scoping. This is what you get from the local() function. A variable which is dynamically scoped to outside() doesn't affect the global variable of the same name, but it is visible to functions called by outside() like inside(). This is useful if you want to allow other functions access to some of your own variables, but it needs to be used carefully to make sure that those functions are only modifying it intentionally.

An example is in order:
[code]outside()=my(x); local(y); x = y = 2; t = inside(); concat(t, [x, y, z]);
inside()=my(x); x = y = z = 3; [x, y, z];
x = y = z = 1; t = outside(); concat(t, [x, y, z])[/code]
returns [3, 3, 3, 2, 3, 3, 1, 1, 3]: the inner function works as expected, the outer function keeps its own x value because of my(), while the others are changed, and in the global scope only the unprotected z is changed.

CRGreathouse 2016-05-29 02:16

Or, to summarize further: always use my, don't use local, and don't forget to use my.

science_man_88 2016-06-08 18:43

1 Attachment(s)
[QUOTE=CRGreathouse;435042]Or, to summarize further: always use my, don't use local, and don't forget to use my.[/QUOTE]

I'm back, did you miss me ? ... ( maybe don't answer that lol) I was bored and found a way to use local ( inline and my both didn't work for what I was doing). I made my own adjustable on the fly loops like a step loop with adjustable steps and even number of steps on the fly ( I think I've done that before, but it did have a limitation that it can only step loop roughly 2156 times ( so for 4 steps it can only go through them 539 times) before it crashes gp through the OS).

CRGreathouse 2016-06-08 19:55

[QUOTE=science_man_88;435834]I'm back, did you miss me ?[/QUOTE]

Welcome back! I wondered where you were.

[QUOTE=science_man_88;435834]I was bored and found a way to use local ( inline and my both didn't work for what I was doing). I made my own adjustable on the fly loops like a step loop with adjustable steps and even number of steps on the fly ( I think I've done that before, but it did have a limitation that it can only step loop roughly 2156 times ( so for 4 steps it can only go through them 539 times) before it crashes gp through the OS).[/QUOTE]

I recommend closures rather than strings for arbitrary code. Much better to do

[code]runThis(f, n)=f(n);
runThis(x->print(x), 7)[/code]
than
[code]runThis(s,n)=local(x=n); eval(s);
runThis("print(x)", 7)[/code]

This incidentally removes the need for local().

As for your particular case, isn't this what forstep is all about?
[code]forstep_sm88(code, startAt, endAt, step)=
{
forstep(n=startAt,endAt,step, code(n));
}
forstep_sm88(n->if(isprime(n), print(n)), 1, 100, 4) \\ prints primes which are 1 mod 4[/code]


All times are UTC. The time now is 06:54.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.