mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > PrimeNet

Reply
 
Thread Tools
Old 2011-11-14, 18:53   #265
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts
Default

Quote:
Originally Posted by Mr. P-1 View Post
I disagree. This task lies right in AWK's sweet spot. Dublsow, however (no disrespect) didn't do it the AWK way.
As we quickly learn in computer science and the maths (and probably other domains as well)...

There is often more than one way to solve a problem....
chalsall is offline   Reply With Quote
Old 2011-11-14, 19:01   #266
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts
Default

To all of the above: I put that script together after about 30-60 minutes of going through here. I don't know an ounce of programming besides. I literally know no perl, and the only awk I know is what I posted. My only experience in anything related is a novice level of bash, if that.

(Also I did think of a way to do what I wanted within what I know, but your method does look cleaner Mr. P-1.) But this will have to wait anyways, I need to go eat lunch. Also, see http://www.mersenneforum.org/showthread.php?t=16228 for why I'm doing this; there's around 2000 TF that needs to be done, and I'd like some help.
Dubslow is offline   Reply With Quote
Old 2011-11-15, 03:00   #267
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1C3516 Posts
Default

Quote:
Originally Posted by Mr. P-1 View Post
Code:
#!/bin/awk -f

BEGIN           {
                FS="[ M^]"
                ORS="\r\n"
                }

$5 == expo      {
                bitb = $11
                next
                }

$5 ~ /[0-9]./   {
                if ( expo ) print "Factor=" expo "," bita "," bitb
                bita = $8
                bitb = $11
                expo=$5
                }

END             {
                if ( expo ) print "Factor=" expo "," bita "," bitb
                }
In support of learning something: $5 == expo and $5 ~ regex are conditions attached to running the following functions? ( expo ) only tests if that variable is defined? What does it test? And how exactly does 'next' work? (I can guess it's purpose but want to be sure of details.)

Also, what's a hash?
Also also, how could this be modified to output to exponent order? It gets harder and harder to compare lines further and further away, and to order it you'd need to check the whole file...

Last fiddled with by Dubslow on 2011-11-15 at 03:35
Dubslow is offline   Reply With Quote
Old 2011-11-15, 03:43   #268
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
Jun 2011
Thailand

32·29·37 Posts
Default

If the end of the week is not too late, then send (PM) me a bunch of, say, 100 to 500 of them, in mfaktc input format. I am going to try to play with mfaktc for the first time in my life during the coming weekend and I think that is better to have some input to test, i.e. to do some useful work instead of playing around, but I don't want to be forced to figure out how to transform a list of numbers into a format that mfaktc would understand (I know nothing about mfaktc yet). You save me of this pain and I save you of a part of exponents.
LaurV is online now   Reply With Quote
Old 2011-11-15, 04:09   #269
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
Jun 2011
Thailand

32×29×37 Posts
Default

Quote:
Originally Posted by Dubslow View Post
Also, what's a hash?
Imagine you have a million nice colored little balls in a bag and you see a new one with a nice red and blue pattern in the shop, but you only want to buy it if you do not have one like that already. Then you have to go home and see if you have it already, but you can't search your bag, ball by ball, this will take ages, but you also can't take the ball with you to compare. And remembering the color's pattern is impossible for a normal man. Then you invent a method (function, or hash) to codify the color patterns, in a way that is easy for you to remember, and in a way that can also be "sorted" at a glance. For example you associate a digit to each color, red=1, orange=2, etc, a digit to each pattern, triangles=1, diamonds=2, leaves=3, etc. Then use 3 digits for colors, from the one most dominant to the one less dominant, two digits for patterns. Then the ball 32147 is easy to remember. Then you go home and look in your big bag and sort your balls in 10 small bags, all balls starting with 1 in the first bag, all starting with 2 in the second, etc. Then you repeat for each small bag: put 10 boxes inside of each bag, and sort in the first box all the balls which have the second digit 1, put in the second bag all the balls having the second digit 2, etc. Do this for each small bag, eventually go one step deeper adding 10 little boxes inside of each box (that is inside of each bag) and sort for the third digit.

Congratulation, you just made a hash table. Now it is quite easy to search for a ball: from the first three digits you know exactly which small box to look into, and in that box there are not so many balls, you can look to them one by one.

The hash variables are not different. They make "hashes" of the lines automatically, and for each line they read (this is the new ball you see on the shop), the value is hashed and compared to what you already read. If the hash match, then the value is checked, and if it matches, then the function (pattern) is applied. This makes sense because checking the hash is much faster then checking the value/pattern/function/etc effectively. The method is used for complicate data that you can not sort (if you can sort it, then a quick search or binary search will be much faster) or for which the comparison is complicate (as comparing the patterns, you better make hashes -functions- of patterns and compare the hashes -values of the fucntions). In fact, it is exactly same amount of work as you said, "checking the whole file", but this is done faster, optimal, and/or with only one read, etc, because somebody else did the work already (implemented the hash variables), hash tables are short and can be kept in memory, etc.

For details see here.

Last fiddled with by LaurV on 2011-11-15 at 04:17
LaurV is online now   Reply With Quote
Old 2011-11-15, 04:21   #270
Christenson
 
Christenson's Avatar
 
Dec 2010
Monticello

111000000112 Posts
Default

Laurv:
The mfaktc parser isn't difficult. It's in C, and straightforward. But don't put too much work into it, as I have a re-write waiting for me to test it that is a little smarter...I set it up so there is exactly one interpreter of the lines in worktodo.txt, and so assignments are beginning to be handled as structs. The subtlety here is that worktodo has to be edited when an assignment is done, and anything skipped over in finding the assignment has to be ignored silently.

Definitions, from me (and Chalsall can correct them):
Hash := associative hash := store (key, value) pairs retrievable by key := content-addressable storage
Regular Expression (Regex): Powerful syntax for dissecting and re-assembling strings as needed.

Chalsall's perl script would have been shorter than Mr P-1's awk script. I don't do perl often enough to do it without the book, being a classical C programmer during the day. Also, a "real" text editor would be able to be programmed quickly to do the editing you need on a one-time basis. I personally use Codewright, (now from Borland, has a small following) but this is not open source. It's worth learning enough emacs or other editor to learn how to do this, because you need this kind of editing from time to time.
Christenson is offline   Reply With Quote
Old 2011-11-15, 04:46   #271
davieddy
 
davieddy's Avatar
 
"Lucan"
Dec 2006
England

145128 Posts
Default Skinning a cat

Quote:
Originally Posted by chalsall View Post
As we quickly learn in computer science and the maths (and probably other domains as well)...

There is often more than one way to solve a problem....
I related this anecdote to Ernst the other week, and can't
resist retelling it now.

About 15 years ago I had to decompress JPEGs on a 386.
I devized a fast inverse discrete cosine transform which used
only 11 multiplies. I had written it out in BASIC.
Knowing that a guy called Chen had done similar, we
checked it out. It was displayed in C, and it was identical
(verbatim) to mine.

In the way smart colleagues do, my mate Steve casually
remarked:
"Just goes to show there's only one way to skin a cat."

Happy Days

David
davieddy is offline   Reply With Quote
Old 2011-11-15, 07:06   #272
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts
Default

Oh also Mr. P-1 my work that you assigned me a while ago was completed earlier today.
Dubslow is offline   Reply With Quote
Old 2011-11-15, 16:33   #273
Mr. P-1
 
Mr. P-1's Avatar
 
Jun 2003

7·167 Posts
Default

Quote:
Originally Posted by Dubslow View Post
In support of learning something: $5 == expo and $5 ~ regex are conditions attached to running the following functions?
Basically yes. In the vernacular of AWK, these are "patterns", while the associated function is the corresponding "action".

Quote:
( expo ) only tests if that variable is defined? What does it test?
It actually tests if it evaluates to true. Undefined evaluates to false. 0 is false. "" is false. any other number or non-blank string is true.

Quote:
And how exactly does 'next' work? (I can guess it's purpose but want to be sure of details.)
Proceed to the next line of input (or END if there is no more input) do not parse any more patterns or actions for this line.

Quote:
Also, what's a hash?
Most programming languages (and all useful ones!) allow you define arrays of values, usually indexed by small integers. An associative array or hash is similar, except that it can be indexed by arbitrary values defined by the programmer.

Quote:
Also also, how could this be modified to output to exponent order? It gets harder and harder to compare lines further and further away, and to order it you'd need to check the whole file...
No, you use a hash indexed by the exponent. You don't compare lines. You check to see if current exponent is already an index in the hash.

Last fiddled with by Mr. P-1 on 2011-11-15 at 16:33
Mr. P-1 is offline   Reply With Quote
Old 2011-11-15, 17:08   #274
Mr. P-1
 
Mr. P-1's Avatar
 
Jun 2003

7·167 Posts
Default

And here's the script so modifed. Admittedly asorta is a GAWK extension, so this would perhaps not be so easy with other versions.

Code:
#! /bin/awk -f

BEGIN           {
                FS="[ M^]"
                ORS="\r\n"
                }

$5 in bita      {
                if (bita[$5] > $8) bita[$5] = $8
                if (bitb[$5] < $11) bitb[$5] = $11
                next
                }

$5 ~ /[0-9]./   {
                bita[$5] = $8
                bitb[$5] = $11
                }

END             {
                n = asorti(bita, expo)
                for (i = 1; i <= n; i++) print "factor=" expo[i] "," bita[expo[i]] "," bitb[expo[i]]
                }
I would be interested to see perl versions of both scripts.
Mr. P-1 is offline   Reply With Quote
Old 2011-11-15, 17:39   #275
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts
Default

Quote:
Originally Posted by Mr. P-1 View Post
Most programming languages (and all useful ones!) allow you define arrays of values, usually indexed by small integers. An associative array or hash is similar, except that it can be indexed by arbitrary values defined by the programmer
Indeed. Hash arrays (AKA Associative arrays) are an incredibly powerful tool in the programmer's tool chest.

If I can give a real world example (at least to we GIMPSers... )

Say you had a worktodo.txt file like (note the exponents are out of order, and the duplicate exponent "2"):

Code:
Factor=N/A,11,7,11
Factor=N/A,2,1,2
Factor=N/A,7,5,7
Factor=N/A,5,4,5
Factor=N/A,2,3,4
But you preferred to do each "bit" level individually, and from the lowest exponent up. The following code could do the transform for you:

Code:
#!/usr/bin/perl

while (<>) {
   $_ =~ /Factor=([^,]*),(\d*),(\d*),(\d*)/;

   $AID{$2} = $1;

   if ($From{$2} > $3 || !defined($From{$2})) { $From{$2} = $3; }
   if ($To{$2}   < $4 || !defined($To{$2}))   { $To{$2}   = $4; }
}

sub sort_by_number { $a <=> $b; }

foreach $key (sort sort_by_number keys %AID) {
   for ($cnt = $From{$key}; $cnt < $To{$key}; $cnt++) {
      $NewTo = $cnt+1;
      print "Factor=${AID{$key}},${key},${cnt},${NewTo}\n";
   }
}
If you put this Perl code into a file called stepbystep.pl and made it "executable", you could then at the command line execute:

Code:
./stepbystep.pl < worktodo.txt >new_worktodo.txt
This would result in new_worktodo.txt containing:

Code:
Factor=N/A,2,1,2
Factor=N/A,2,2,3
Factor=N/A,2,3,4
Factor=N/A,5,4,5
Factor=N/A,7,5,6
Factor=N/A,7,6,7
Factor=N/A,11,7,8
Factor=N/A,11,8,9
Factor=N/A,11,9,10
Factor=N/A,11,10,11
Note that an Associated array (in Perl) is referenced by a % symbol (e.g. "%AID" in the above), but the value of the array based on the key is referenced by, in the above example, "$AID{$key}" (replace "$key" with whatever variable you'd like).

One of the reasons Associative arrays are so powerful, beyond the speed, is they are effectively "sparse arrays". As in, the memory consumed is a function of the number of elements, not the minimum and maximum values of the keys. In Perl, executing the two statements "$Array[1] = 1; $Array[1000000000]=1" will result in an "Out of memory error" on most computers because the interpreter will try to allocate an array with one billion rows only to store two values.

Does that help at all, or only confuse you more?
chalsall is offline   Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
A quick question Pegos Information & Answers 6 2016-08-11 14:39
Quick TF Question Dubslow GPU Computing 2 2011-10-27 04:49
Quick msieve question alkirah Msieve 2 2009-12-30 14:00
Quick question about P90 CPU metric stars10250 PrimeNet 9 2008-08-31 23:58
Quick p-1 question Unregistered Software 8 2006-10-13 23:35

All times are UTC. The time now is 11:17.


Mon Aug 2 11:17:35 UTC 2021 up 10 days, 5:46, 0 users, load averages: 1.05, 1.06, 1.24

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.