mersenneforum.org (https://www.mersenneforum.org/index.php)
-   PrimeNet (https://www.mersenneforum.org/forumdisplay.php?f=11)
-   -   Quick Question about assignments (https://www.mersenneforum.org/showthread.php?t=16104)

chalsall 2011-11-15 17:42

[QUOTE=Mr. P-1;278509]I would be interested to see perl versions of both scripts.[/QUOTE]

Please provide a sample input file, and I'd be happy to produce the Perl...

(I always enjoy language wars... :smile::smile::smile:)

Mr. P-1 2011-11-15 20:38

[QUOTE=chalsall;278513]Please provide a sample input file, and I'd be happy to produce the Perl...

(I always enjoy language wars... :smile::smile::smile:)[/QUOTE]

[code]no factor for M51795493 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51796097 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51796097 from 2^70 to 2^71 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51813887 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51817247 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51827527 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51827537 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32][/code]

There could also be "factor found" lines that don't match this format, but there aren't any in the file I have to hand. Just assume that if the fourth word is M followed by a number, then the line is in this format.
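As a rough illustration of the rule just stated (fourth word is M followed by a number), here is a minimal Python sketch. The names `RESULT_LINE` and `parse_result` are hypothetical, and the pattern assumes only the "no factor" layout shown in the sample; other line types are simply rejected.

```python
import re

# Hypothetical pattern: encodes the assumption that the fourth word is
# "M" followed by the exponent, as in the sample lines above.
RESULT_LINE = re.compile(r"^no factor for M(\d+) from 2\^(\d+) to 2\^(\d+)\b")

def parse_result(line):
    """Return (exponent, from_bits, to_bits), or None if the line does not match."""
    m = RESULT_LINE.match(line)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())

sample = "no factor for M51796097 from 2^70 to 2^71 [mfaktc 0.17-Win barrett79_mul32]"
print(parse_result(sample))                  # (51796097, 70, 71)
print(parse_result("some unrelated line"))   # None
```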

Mr. P-1 2011-11-15 22:19

[QUOTE=chalsall;278512]Indeed. Hash arrays (AKA Associative arrays) are an incredibly powerful tool in the programmer's tool chest.

If I can give a real world example (at least to us GIMPSers... :smile:)[/QUOTE]

Here's my AWK version of the same program:

[code]#! /usr/bin/awk -f

BEGIN {
FS=","
}

/^Factor=[^,]*,[0-9]*,[0-9]*,[0-9]*$/ {
if ( !($2 in from) || from[$2] > $3) from[$2] = $3
if ( !($2 in to) || to[$2] < $4) to[$2] = $4
}

END {
n = asorti(from, expo)
for (i = 1; i <= n; i++) for (j = from[expo[i]]; j < to[expo[i]]; j++) print "factor=" expo[i] "," j "," j+1
}[/code]

With the greatest of respect, I think it's more elegant than yours. There's no explicit "while" loop. AWK just "knows" that it has to loop on the input lines. Similarly there's no explicit reference to the input line. Again AWK just "knows" what the regexp has to match. Finally the "pattern" logic (the regexp) is kept separate from the "action" logic, which you can't really do with perl.

Unfortunately the prog doesn't quite work. Here's the output:

[code]factor=11,7,8
factor=11,8,9
factor=11,9,10
factor=11,10,11
factor=2,1,2
factor=2,2,3
factor=2,3,4
factor=5,4,5
factor=7,5,6
factor=7,6,7[/code]

The problem is that AWK treats array indexes as strings, hence sorts "11" before "2". This is easy to fix:

[code]#! /usr/bin/awk -f

BEGIN {
FS=","
}

/^Factor=[^,]*,[0-9]*,[0-9]*,[0-9]*$/ {
expo[$2]=$2
if ( !($2 in from) || from[$2] > $3) from[$2] = $3
if ( !($2 in to) || to[$2] < $4) to[$2] = $4
}

END {
n = asort(expo)
for (i = 1; i <= n; i++) for (j = from[expo[i]]; j < to[expo[i]]; j++) print "factor=" expo[i] "," j "," j+1
}[/code]

However, the mere fact that I've had to work around a language wart detracts from the sublime simplicity and elegance of the original.
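The ordering problem described above is not specific to AWK. Any sort that compares keys as strings puts "11" before "2". A small Python sketch, using the four exponents from the test output as sample keys:

```python
keys = ["2", "5", "7", "11"]

# Sorting as strings compares character by character, so "11" < "2".
print(sorted(keys))            # ['11', '2', '5', '7']

# Comparing by integer value gives numeric order.
print(sorted(keys, key=int))   # ['2', '5', '7', '11']
```

The first result matches the order in the faulty output, and the second is the order the fixed script is after.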

chalsall 2011-11-16 00:06

[QUOTE=Mr. P-1;278548]no factor for M51795493 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51796097 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51796097 from 2^70 to 2^71 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51813887 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51817247 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51827527 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]
no factor for M51827537 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32][/QUOTE]

Given the above input, the following Perl code would produce the following output:

[CODE]#!/usr/bin/perl

while (<>) {
$_ =~ /.*M(\d*) from 2\^(\d.)* to 2\^(\d.)/;
$Exp{$1} = $1;
if ($From{$1} > $2 || !defined($From{$1})) { $From{$1} = $2; }
if ($To{$1} < $3 || !defined($To{$1})) { $To{$1} = $3; }
}

foreach $key (keys %Exp) {
print "Factor=${key},${From{$key}},${To{$key}}\n";
}[/CODE]

[CODE]Factor=51827527,69,70
Factor=51813887,69,70
Factor=51827537,69,70
Factor=51795493,69,70
Factor=51817247,69,70
Factor=51796097,69,71[/CODE]
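For comparison, a minimal Python sketch of the same merge (lowest "from" and highest "to" per exponent). The names `LINE` and `merge_ranges` are hypothetical. Two deliberate differences from the Perl above are assumptions about intent rather than anything stated in the thread: lines that do not match are skipped explicitly, and the bit levels are captured with `\d+`. The output above is unsorted because Perl's `keys` returns hash keys in no particular order, so this sketch sorts before printing.

```python
import re

LINE = re.compile(r"M(\d+) from 2\^(\d+) to 2\^(\d+)")

def merge_ranges(lines):
    """Map each exponent to (lowest 'from', highest 'to') seen in the input."""
    ranges = {}
    for line in lines:
        m = LINE.search(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        exp, lo, hi = (int(g) for g in m.groups())
        if exp in ranges:
            old_lo, old_hi = ranges[exp]
            lo, hi = min(lo, old_lo), max(hi, old_hi)
        ranges[exp] = (lo, hi)
    return ranges

lines = [
    "no factor for M51796097 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]",
    "no factor for M51796097 from 2^70 to 2^71 [mfaktc 0.17-Win barrett79_mul32]",
    "no factor for M51795493 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]",
]

# Sort by exponent so the output order is deterministic.
for exp, (lo, hi) in sorted(merge_ranges(lines).items()):
    print(f"Factor={exp},{lo},{hi}")
```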

chalsall 2011-11-16 00:12

[QUOTE=Mr. P-1;278568]With the greatest of respect, I think it's more elegant than yours. There's no explicit "while" loop. AWK just "knows" that it has to loop on the input lines. Similarly there's no explicit reference to the input line. Again AWK just "knows" what the regexp has to match. Finally the "pattern" logic (the regexp) is kept separate from the "action" logic, which you can't really do with perl.[/QUOTE]

With the greatest of respect back, there is sometimes something to be said for readability vs. the fewest characters written. AWK is a declarative language. Much like Forth.

Perl is a procedural language. Much like C or C++ (or Java et al).

And, further, can you tell us all how you get a MySQL database handle in AWK?

Mr. P-1 2011-11-16 13:40

[QUOTE=chalsall;278579]With the greatest of respect back, there is sometimes something to be said for readability vs. the fewest characters written.[/QUOTE]

Readability is in the eye of the reader. I find the AWK versions more readable than the perl ones, because I'm more familiar with the language, and probably also because I'm more familiar with my own coding style.

[QUOTE]AWK is a declarative language. Much like Forth.

Perl is a procedural language. Much like C or C++ (or Java et al).[/QUOTE]

I don't see a lot of difference between them. In your script:

[code]$_ =~ /.*M(\d*) from 2\^(\d.)* to 2\^(\d.)/;[/code]

if I understand it correctly, tells perl how to recognise and parse the input lines it wants to work on. That's declarative. The rest is procedural. The difference with AWK is that it provides separate pattern (declarative) and action (procedural) spaces.

[QUOTE]And, further, can you tell us all how you get a MySQL database handle in AWK?[/QUOTE]

No idea. Never had to do that. Not sure why I'd want to with AWK. It sounds more like a job for perl.

I'm not arguing here that AWK is a better language than perl. I took issue with the assertion by Christenson that "the right tool for this isn't sed, or awk". I think I've shown that it is. Of course, perl is a fine tool for this task too.

Christenson also said "Chalsall's perl script would have been shorter than Mr P-1's awk script." I don't think that's been demonstrated either. It looks to me as though our scripts both have pretty much the same components, except that you had to tell perl to do some things (such as loop on the input lines) that AWK does without being told.

But that's only useful if what AWK does without being told is what you want it to do. AWK was designed to be very good at performing a particular task, namely reading records (by default, lines), parsing them into fields (by default, white-space delimited), and then manipulating these in various not-too-complicated ways. The closer the actual problem fits this paradigm, the better a language choice AWK will be.

So I would say that the real difference between AWK and perl is not that one is declarative and one procedural, it's that one is special purpose, while the other is general purpose.
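To make the "does it without being told" point concrete, here is a loose Python sketch of the record/field/pattern/action model described above. `run_rules` is a hypothetical helper, not anything AWK or Python provides, and it leaves out `BEGIN`, `END`, `next` and most else. It only shows the implicit loop over records, the default whitespace field split, and the separation of pattern from action.

```python
import re

def run_rules(lines, rules, sep=None):
    """Apply (pattern, action) rules to each record, loosely in the manner of AWK.

    sep=None splits on runs of whitespace, similar to AWK's default FS.
    """
    for line in lines:
        record = line.rstrip("\n")
        fields = record.split(sep)
        for pattern, action in rules:
            if re.search(pattern, record):
                action(record, fields)

seen = []
rules = [
    # pattern (which records to select), action (what to do with them)
    (r"^no factor for M[0-9]+ ", lambda rec, f: seen.append(f[3])),
]

run_rules(
    [
        "no factor for M51795493 from 2^69 to 2^70 [mfaktc 0.17-Win barrett79_mul32]\n",
        "some other line\n",
    ],
    rules,
)
print(seen)   # ['M51795493']
```

Note that Python lists are 0-based, so `f[3]` is the fourth word, which AWK would call `$4`.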

chalsall 2011-11-16 15:33

[QUOTE=Mr. P-1;278663]Christenson also said "Chalsall's perl script would have been shorter than Mr P-1's awk script." I don't think that's been demonstrated either. It looks to me as though our scripts both have pretty much the same components, except that you had to tell perl to do some things (such as loop on the input lines) that AWK does without being told.[/QUOTE]

You are correct. Although my script was only 12 lines vs. your 22, mine contained 299 characters while yours contained 302 (with indentation removed from both scripts). Basically, identical.

[QUOTE=Mr. P-1;278663]So I would say that the real difference between AWK and perl is not that one is declarative and one procedural, it's that one is special purpose, while the other is general purpose.[/QUOTE]

Agreed. :smile:

garo 2011-11-16 22:12

[QUOTE=chalsall;278687]You are correct. Although my script was only 12 lines vs. your 22, mine contained 299 characters while yours contained 302 (with indentation removed from both scripts). Basically, identical.

<snip>
Agreed. :smile:[/QUOTE]

Boo! You call this a language war. You sissies :) We want a real fight!!

chalsall 2011-11-16 22:33

[QUOTE=garo;278796]Boo! You call this a language war. You sissies :) We want a real fight!![/QUOTE]

The shortest possible code in our modern age is management speak:

"Turn this into that." :smile:

Mr. P-1 2011-11-17 03:36

[QUOTE=Mr. P-1;278509]And here's the script so modified. Admittedly asorti is a GAWK extension, so this would perhaps not be so easy with other versions.

[code]#! /bin/awk -f

BEGIN {
FS="[ M^]"
ORS="\r\n"
}

$5 in bita {
if (bita[$5] > $8) bita[$5] = $8
if (bitb[$5] < $11) bitb[$5] = $11
next
}

$5 ~ /[0-9]./ {
bita[$5] = $8
bitb[$5] = $11
}

END {
n = asorti(bita, expo)
for (i = 1; i <= n; i++) print "factor=" expo[i] "," bita[expo[i]] "," bitb[expo[i]]
}[/code][/QUOTE]

I've just spotted an error in the above. The pattern

[code]$5 ~ /[0-9]./[/code]

should have been

[code]$5 ~ /[0-9]+/[/code]

The same error is present in the earlier non-hashed script. dubslow, if you're actually using either of them, I suggest you correct it.
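A small Python sketch of what the two patterns accept. `classify` is a hypothetical helper, and `re.search` stands in for AWK's `~`, since both look for a match anywhere in the string. The third, anchored check is not from the posts above. It is included only as an assumption about the intent ("the field is a number") and would correspond to `/^[0-9]+$/` in AWK.

```python
import re

def classify(text):
    """Return (old_pattern, new_pattern, anchored) match results for text."""
    return (
        bool(re.search(r"[0-9].", text)),     # a digit followed by any one character
        bool(re.search(r"[0-9]+", text)),     # at least one digit somewhere
        bool(re.fullmatch(r"[0-9]+", text)),  # digits only (assumed intent)
    )

for text in ["51795493", "7", "7x", "x7"]:
    print(text, classify(text))
# 51795493 (True, True, True)
# 7 (False, True, True)
# 7x (True, True, False)
# x7 (False, True, False)
```

The original pattern rejects a single-digit field, which the corrected one accepts. Both unanchored patterns also accept fields that merely contain a digit.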

Mr. P-1 2011-11-17 05:00

[QUOTE=chalsall;278802]The shortest possible code in our modern age is management speak:

"Turn this into that." :smile:[/QUOTE]

The ideal computer language has just one statement:

"Do what I want".

