mersenneforum.org PrimeNet reports: consistent column widths please

2011-09-25, 22:14   #1
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

6636₈ Posts

I've noticed on a couple of PrimeNet reports, such as:
http://v5www.mersenne.org/report_recent_results/
http://v5www.mersenne.org/report_top_500_P-1/
etc. that 99% of the report is nicely formatted in columns, but the occasional user/computer name will exceed the report format limits and throw everything out of alignment. The reports are formatted for 20-char usernames and 16-char computer names (where applicable). Sometimes the usernames are just particularly long (e.g. "Lasers and plasmas in Bordeaux" = 30 chars), but in other cases the name is not long at all (e.g. "Smok_bmv") but spacing is still messed up.

Can I please request that (for report display purposes) all usernames are forcibly truncated and/or padded to 20 chars, and all computer IDs forcibly truncated and/or padded to 16 chars?

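For illustration, the truncate-and-pad request could be sketched in Python roughly as follows. This is only a sketch, not PrimeNet's actual report code: the 20 and 16 widths come from the post, while the function names, the single-space column separator, and the computer name "home-pc" are assumptions made up for the example.

```python
USER_WIDTH = 20      # username column width stated in the post
COMPUTER_WIDTH = 16  # computer-name column width stated in the post


def fit(text: str, width: int) -> str:
    """Truncate text to at most `width` characters, then pad with spaces to exactly `width`."""
    return text[:width].ljust(width)


def format_row(user: str, computer: str, rest: str) -> str:
    """Build one report line with fixed-width user and computer columns.

    The single space between columns is an assumption; it keeps a
    full-width name from touching the next column.
    """
    return f"{fit(user, USER_WIDTH)} {fit(computer, COMPUTER_WIDTH)} {rest}"


if __name__ == "__main__":
    # "home-pc" is a hypothetical computer name used only for this demo.
    print(format_row("Lasers and plasmas in Bordeaux", "skynet-node052", "917192671"))
    print(format_row("Smok_bmv", "home-pc", "917192671"))
```

With this layout every row puts the exponent at the same character offset, whatever the length of the names.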
2011-09-26, 00:41   #2
cheesehead

"Richard B. Woods"
Aug 2002
Wisconsin USA

7692₁₀ Posts

What, exactly, is the real harm done by displaying the full proper names of users and their computers, that is greater than the offense your proposal would give to long-named folks by omitting part of their perfectly normal names?

You admit that 99% of the report is nicely formatted. What real harm does the nonconforming 1% do? Is it the extra work your eye muscles perform when reading Lawrence V. Castiglione III's entry in the Top 500 P-1 report?

(In general, I don't like Procrustean suggestions that people should be made to neatly fit computers in some regard. People aren't numbers!)

2011-09-26, 01:30   #3
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

110110011110₂ Posts

I can read the report just fine; it's when it comes time to try to automatically parse the report that problems arise. If the report is presented in fixed-width format, it should stick to that, but I would be delighted to see the reports presented in fuller detail in a delimited format (tab characters between fields, for example), as that would make parsing even easier than it is now.

Even to the human eye, there is some data that requires effort to interpret. For example, if a computer name is just the right length, it can abut the exponent in the report, which is fine until you have trailing digits in the computer name. This sample line is perfectly readable:

Code:
linded skynet-node052 917192671 F Sep 26 2011 12:25AM 1.3 0.0025 29930182260713503753

This (contrived example since I don't have a genuine one handy) is not:

Code:
linded skynet-newnode052917192671 F Sep 26 2011 12:25AM 1.3 0.0025 29930182260713503753

I would in fact be very happy if the report was presented with full user and computer names, but the trick is presenting it in a format that is easy for both humans and computers to read. Tab-delimited is easy to computer-parse, but hard to lay out for human-reading ease. An HTML table is a reasonable choice, but adds a fair amount of browser-rendering overhead, as well as additional data overhead for each page. A fair compromise would be to preserve the space-delimited layout as the preferred format, but extend the field widths to accommodate longer names.

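To make the parsing problem concrete, here is a small Python sketch that runs a naive whitespace split over the two sample lines from the post. The lines are copied as they appear in the thread text (the report's original column spacing is not preserved here), and the helper names are invented for the example.

```python
from __future__ import annotations

# The two sample lines from the post, single-spaced as they appear in the thread.
GOOD = "linded skynet-node052 917192671 F Sep 26 2011 12:25AM 1.3 0.0025 29930182260713503753"
BAD = "linded skynet-newnode052917192671 F Sep 26 2011 12:25AM 1.3 0.0025 29930182260713503753"


def split_fields(line: str) -> list[str]:
    """Naive parse: split the line on runs of whitespace."""
    return line.split()


# Field count of a well-formed line, used as the reference.
EXPECTED_FIELDS = len(split_fields(GOOD))


def looks_misaligned(line: str) -> bool:
    """Flag lines whose field count differs from the well-formed sample."""
    return len(split_fields(line)) != EXPECTED_FIELDS


if __name__ == "__main__":
    for line in (GOOD, BAD):
        fields = split_fields(line)
        print(len(fields), fields[1], "MISALIGNED" if looks_misaligned(line) else "ok")
```

In the second line the computer name and the exponent arrive as one token, so a parser cannot tell where the name ends. A username containing spaces (such as "Lasers and plasmas in Bordeaux") would throw off the same naive split in the other direction, by producing extra fields.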
2011-09-26, 01:32   #4
davieddy

"Lucan"
Dec 2006
England

2·3·13·83 Posts

Quote:
 Originally Posted by cheesehead (In general, I don't like Procrustean suggestions that people should be made to neatly fit computers in some regard. People aren't numbers!)
I have no objection to folk giving their pets long names, but I do
sympathise with James:

100M+ digit numbers until one finds a protrusion.
Aha you think, looks like a long P-1 factor.
WRONG.

David

2011-09-26, 02:21   #5
Mini-Geek
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

Quote:
 Originally Posted by James Heinrich ...but the trick is presenting it in a format that is easy for both humans and computers to read...
Simple solution to that: provide it in both formats. Make one optimized for a human to read on a web page, and put a link there to a CSV version of it so computers can easily read it. I know the update is done hourly because there's a significant amount of computation, but it shouldn't be much harder to save a human-display-friendly and a CSV version once the data is processed.

Last fiddled with by Mini-Geek on 2011-09-26 at 02:22
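As a rough illustration of the "both formats" idea, the following Python sketch renders the same processed rows twice: once as padded text for people and once as CSV for programs. It is an assumption-laden sketch, not the server's code: the row layout (user, computer, exponent) and the function names are invented for the example, and the text version pads names without truncating them.

```python
import csv
import io

# Hypothetical processed rows: (user, computer, exponent).
ROWS = [
    ("linded", "skynet-node052", 917192671),
    ("Lasers and plasmas in Bordeaux", "skynet-node052", 917192671),
]


def render_text(rows):
    """Human-friendly version: pad names to 20/16 characters, never truncate."""
    return "\n".join(
        f"{user:<20} {computer:<16} {exponent}" for user, computer, exponent in rows
    )


def render_csv(rows):
    """Machine-friendly version of the same rows, one CSV record per row."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Both outputs come from the same data, so the expensive processing runs once.
    print(render_text(ROWS))
    print(render_csv(ROWS), end="")
```

A long name still pushes its own row out of alignment in the text version, but that no longer matters to parsers, which would read the CSV instead.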

2011-09-26, 02:34   #6
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

2×3×7×83 Posts

Quote:
 Originally Posted by Mini-Geek Simple solution to that: provide it in both formats.
Sometimes the solution is so obvious I can't think of it.
That gets my vote!

(maybe we could get the longer versions of the reports at the same time?)

2011-09-26, 16:55   #7
chalsall
If I May

"Chris Halsall"
Sep 2002

9,887 Posts

Quote:
 Originally Posted by James Heinrich Sometimes the solution is so obvious I can't think of it. That gets my vote! (maybe we could get the longer versions of the reports at the same time?)
Indeed -- that is an excellent idea. A vote from me too. I could use it for my GVT work as well.

And, such CSV reports would actually result in less load on the server since much less presentation work is required. Basically it reduces to an inner loop of 'print "$1,$2,$3,$4[...]\n";'.

Thus, longer time periods could be made available (like 1.1 hours of results, instead of only 1000 results).

Of course, this would require about an hour of a human's time to implement....

Last fiddled with by chalsall on 2011-09-26 at 17:09 Reason: ..."to implement"
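The inner loop described in this post could look something like the Python sketch below. It uses the standard `csv` module instead of a bare string join, because a plain `"$1,$2,..."` style join breaks as soon as a name contains a comma. The row layout and the name "Doe, John" are hypothetical and exist only to show the quoting.

```python
import csv
import io


def naive_csv_line(row):
    """Direct analogue of the one-line print loop: join fields with commas."""
    return ",".join(str(field) for field in row) + "\n"


def rows_to_csv(rows):
    """Write rows with csv.writer, which quotes fields only when needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")  # default quoting is QUOTE_MINIMAL
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


if __name__ == "__main__":
    row = ("Doe, John", "pc1", 917192671)  # hypothetical user name containing a comma
    print(naive_csv_line(row), end="")     # looks like four fields when split on commas
    print(rows_to_csv([row]), end="")      # the name is quoted, so three fields survive
```

The extra cost of the quoting check is small next to the formatting work that a padded or HTML report needs.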

2011-09-26, 18:09   #8
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

D9E₁₆ Posts

Quote:
 Originally Posted by chalsall Of course, this would require about an hour of a human's time to implement....
Of course, this doesn't have to be George's time... I'm sure you or I or any number of others would be happy to spend that hour (for our own benefit ).

2011-09-27, 02:16   #9
Christenson

Dec 2010
Monticello

1795₁₀ Posts

I'd argue that a little effort on the browser side to render a table is effectively zero. Tabs make excellent delimiters, too, since they don't show up in user or computer names.

2011-09-27, 11:30   #10
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

2×3×7×83 Posts

HTML tables take close-to-zero effort for the browser to render for a 10-entry table, but the effort is decidedly non-zero for a 10,000-entry table. As I said in post #3, tab-delimited is both very compact and very easy to parse, and is my preference for the "computer-readable version" of the report. Pure tab-delimited is best, no need to quote around fields, since there is no chance of a tab character being part of the data, and is one less thing to strip out when parsing.

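A minimal Python sketch of how simple the parsing side becomes with pure tab-delimited lines is given below. It assumes, as the post does, that no field ever contains a tab; the three-field layout (user, computer, exponent) and the function name are made up for the example.

```python
from __future__ import annotations


def parse_tsv_line(line: str) -> list[str]:
    """Split one tab-delimited line into fields; no quote handling is needed."""
    return line.rstrip("\n").split("\t")


if __name__ == "__main__":
    # Names with spaces or trailing digits stay intact, because only tabs separate fields.
    line = "Lasers and plasmas in Bordeaux\tskynet-newnode052\t917192671\n"
    user, computer, exponent = parse_tsv_line(line)
    print(repr(user), repr(computer), int(exponent))
```

The long username and the digit-ending computer name from earlier in the thread both come through unchanged, with no column widths to guess.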
2011-09-27, 20:13   #11
chalsall
If I May

"Chris Halsall"
Sep 2002

9887₁₀ Posts

Quote:
 Originally Posted by James Heinrich HTML tables take close-to-zero effort for the browser to render for a 10-entry table, but the effort is decidedly non-zero for a 10,000-entry table. As I said in post #3, tab-delimited is both very compact and very easy to parse, and is my preference for the "computer-readable version" of the report. Pure tab-delimited is best, no need to quote around fields, since there is no chance of a tab character being part of the data, and is one less thing to strip out when parsing.
Indeed.

Present a browser (particularly a modern browser which renders "on the fly" as data is received from the server) with a 10,000 entry table, and watch your CPU load go to 100% for quite some time.

Again, I second James' suggestion. A TSV report of recent reported results covering at least an hour (plus at least one minute) would be very useful.

Although CSV is the standard at Mersenne.org for raw reports, I see no reason why TSV cannot be made available in this case. From a code perspective, it is simply "$1\t$2..." rather than "$1,$2...". And as James and Christenson both said, tabs are unlikely to be allowed in user-names or computer-names while commas may be, thus making quoting each variable unnecessary.

So how about it George and/or Scott?

Last fiddled with by chalsall on 2011-09-27 at 20:17 Reason: Christenson also noted that tabs don't appear in names.
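On the writing side, the TSV variant could be sketched in Python as below. Since the "no tabs in names" property is an expectation in this thread rather than a documented guarantee, the sketch checks for it instead of relying on it; the function name and the sample row are hypothetical.

```python
def to_tsv_line(fields):
    """Join fields with tabs, refusing any field that would break the format."""
    cells = [str(field) for field in fields]
    for cell in cells:
        # Assumption being checked: names never contain a tab or a newline.
        if "\t" in cell or "\n" in cell:
            raise ValueError(f"field contains a tab or newline: {cell!r}")
    return "\t".join(cells) + "\n"


if __name__ == "__main__":
    # A comma in a name needs no quoting here, unlike in CSV.
    print(to_tsv_line(["Doe, John", "pc1", 917192671]), end="")
```

Failing loudly on a stray tab keeps the output unambiguous for every consumer, at the cost of one scan per field.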


All times are UTC.
