mersenneforum.org > Data Mersenne.ca Status Report

 2016-04-21, 17:32 #78 James Heinrich     "James Heinrich" May 2004 ex-Northern Ontario 13·277 Posts From what I can see, the database is now 6.2GB of an expected 7.6GB after about 9 hours.
2016-04-21, 18:36   #79
Mark Rose

"/X\(‘-‘)/X\"
Jan 2013

2937₁₀ Posts

Quote:
 Originally Posted by James Heinrich 8 hours later it's still chugging along. Not surprising, since it involves importing a 116-million-record table (7.6GB data, 13.2GB index, packaged as a 10.5GB SQL file). Unfortunately there's no easy way (that I know of?) to check on its progress.
Use pt-online-schema-change (and run it in screen or tmux) for table alters. It's part of the Percona Toolkit. You get the added bonus of no downtime. In fact, I'm using it to alter some indexes as I type this, on a table that's updated several times a second. The table is about 90 GB with about 323M rows. It reports regular progress and says I have about 3 hours to go. It works with all variants of MySQL, not just the Percona flavour.
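A minimal invocation might look like the following; the database, table, and index names are made up for illustration, and you'd run this inside screen or tmux so it survives a disconnect:

```
# Dry-run first: pt-osc reports what it would do without touching the table.
pt-online-schema-change \
    --alter "ADD INDEX idx_date_received (date_received)" \
    --dry-run D=mydb,t=results

# When satisfied, execute for real: pt-osc copies rows into a shadow table,
# keeps it in sync with triggers, then atomically swaps the two tables.
pt-online-schema-change \
    --alter "ADD INDEX idx_date_received (date_received)" \
    --execute D=mydb,t=results
```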

For imports, you could always pipe it through pv:

pv import.sql | mysql

Not entirely accurate, but easy. If you have file-per-table turned on, you can also watch the size of the table grow on disk. Lastly, you can always watch SHOW TABLE STATUS LIKE 'table_name'; and compare the row count to wc -l import.sql.
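Side by side, the three monitoring options look something like this (database and table names are hypothetical; note that for InnoDB the Rows value in SHOW TABLE STATUS is only an estimate, while for MyISAM it is exact):

```
pv import.sql | mysql mydb                           # progress bar over the SQL stream
ls -l /var/lib/mysql/mydb/results.MYD                # file-per-table: data file grows on disk
mysql mydb -e "SHOW TABLE STATUS LIKE 'results'\G"   # compare Rows: against wc -l import.sql
```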

2016-04-21, 19:00   #80
chalsall
If I May

"Chris Halsall"
Sep 2002

61×167 Posts

Quote:
 Originally Posted by Mark Rose Not entirely accurate, but easy. If you have file-per-table turned on, you can also watch the size of the table grow on disk. Lastly, you can always watch show table status like 'table_name'; and compare the row count to wc -l import.sql.
I recently encountered a similar, but entirely different, issue.

I wanted to have 8.5" by 14" (AKA Legal paper) perforated in a particular way, and then offset printed with a particular process ink. Pantone 804C, to be specific.

Kind of strange how difficult it is to have atoms processed.

I did finally find a shop that could do the job, but it was going to cost me USD $1,000 just to produce the stamp. And they couldn't print in the ink I needed.

 2016-04-21, 19:13 #81 James Heinrich     "James Heinrich" May 2004 ex-Northern Ontario 13·277 Posts I'm just peeking at the filesize of the .MYD file in /var/lib/mysql. Currently 6,803,300,352 bytes (of 8,163,918,264). Unfortunately the process is slow because (as I understand it) DISABLE KEYS only works for non-unique indexes, and there's a big unique index getting thrashed around in there. Perhaps pv would be useful next time, but I'm not likely to restart it now just for that.
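That file-size peek converts into a rough percentage with plain shell arithmetic. A small sketch, with the two byte counts hard-coded from this post; in practice the current size would come from something like `wc -c < /var/lib/mysql/mydb/results.MYD` (a hypothetical path):

```shell
# Progress estimate from the on-disk size of the growing .MYD data file.
# Sizes are the ones quoted in this thread; read the live value with wc -c.
current=6803300352
expected=8163918264
echo "$((current * 100 / expected))% imported"
```

Wrapped in `watch -n 60`, that gives a crude progress meter.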
2016-04-21, 19:25   #82
Mark Rose

"/X\(‘-‘)/X\"
Jan 2013

101101111001₂ Posts

Quote:
 Originally Posted by James Heinrich I'm just peeking at the filesize of the .MYD file in /var/lib/mysql Currently 6,803,300,352 (of 8,163,918,264).
MyISAM? It's been almost a decade since I used that :)

InnoDB has been better in nearly every way since MySQL 5.5, and is the default table engine. It does require a minute or two of server configuration before blindly using it, though.
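That "minute or two of configuration" presumably means sizing the buffer pool and redo logs; a minimal my.cnf sketch, with illustrative values rather than recommendations for any particular box:

```
[mysqld]
# Give InnoDB most of the RAM on a dedicated DB server (the default is tiny).
innodb_buffer_pool_size = 4G
# Bigger redo logs smooth out heavy write bursts such as bulk imports.
innodb_log_file_size    = 512M
# One file per table, so disk usage is visible (and reclaimable) per table.
innodb_file_per_table   = 1
```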

Quote:
 Unfortunately the process is slow because (as I understand it) DISABLE KEYS only works for non-unique indexes. And there's a big unique index getting thrashed around in there.
Yes. It has to throw an error before a duplicate row is added, and the fastest way to check for duplicates is to maintain the unique index during the import.

Quote:
 Perhaps pv would be useful next time, but I'm not likely to restart it now just for that
:)

 2016-04-21, 19:33 #83 James Heinrich     "James Heinrich" May 2004 ex-Northern Ontario 13·277 Posts I suppose next time I do something foolish like this I should import the data dump without indices, and add them once the data is imported, since I know there are no unique-index conflicts. Lessons learned :)
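That plan can be sketched in SQL. The table and column names here are hypothetical; the point is to load a bare table first, then build all the keys in a single ALTER so the table is rebuilt only once:

```
-- Create the table with no keys at all, then bulk-load it.
CREATE TABLE results (
    exponent      INT UNSIGNED NOT NULL,
    date_received DATETIME     NOT NULL,
    result_text   VARCHAR(255) NOT NULL
) ENGINE=MyISAM;

-- (load the data here, e.g. via: mysql mydb < data_only.sql)

-- One pass to build the unique key and any secondary indexes at the end;
-- combining them in one ALTER avoids rebuilding the table twice.
ALTER TABLE results
    ADD PRIMARY KEY (exponent, date_received),
    ADD INDEX idx_date_received (date_received);
```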
2016-04-21, 20:25   #84
Mark Rose

"/X\(‘-‘)/X\"
Jan 2013

B79₁₆ Posts

Quote:
 Originally Posted by James Heinrich I suppose next time I do something foolish like this I should import the datadump without indices, and add them once the data is imported, since I know there are no unique-index conflicts. Lessons learned :)
With MyISAM you shouldn't even need a primary key during the import, so add that to the list of indexes to create afterwards.

(InnoDB will create an internal one if you don't provide one.)

 2016-04-22, 02:57 #85 James Heinrich     "James Heinrich" May 2004 ex-Northern Ontario 7021₈ Posts 7,931,805,696 of 8,163,918,264 after ~19 hours. I'm almost tempted to re-try the import once this is done as just a data dump with no indexes at all, to see how fast that goes by comparison, and then just add a single primary key that covers my unique criteria. I don't need a separate autoincrement primary-key field, and I found I had a couple of indexes on there that don't really need to be there, so if that works the result would be considerably smaller: the data about 5% smaller from dropping the autoincrement field, but the index only about a third the size. Testing locally, it took only 53 minutes to add the primary key on my fields.
 2016-04-22, 06:41 #86 James Heinrich     "James Heinrich" May 2004 ex-Northern Ontario 13×277 Posts Well, lesson learned there. It took ~20 hours to import the database with indexes updating during the import. Importing the data (minus the autoincrement column) with no indexes took... 35 minutes. I'm adding the primary key now; based on local testing, that should take about an hour. So about 10x faster overall.
 2016-04-22, 16:27 #87 James Heinrich     "James Heinrich" May 2004 ex-Northern Ontario 13·277 Posts So, finally, up and running with the second imported version of the database. Indexing took a fair bit longer on my server (~5 hours) than it did locally (50 mins) -- a CPU difference (i7-3930K vs Atom C2750), I'm sure -- but still far, far quicker than importing while updating 2 unique indexes (~22 hours). However, looking at how some of my other queries reference this table, I'm also going to need to add a non-unique index on date_received. That'll take a couple hours I'm sure, but the site should be usable in the meantime. Last fiddled with by James Heinrich on 2016-04-22 at 16:29
2016-04-22, 17:04   #88
Serpentine Vermin Jar

Jul 2014

2²×3×277 Posts

Quote:
 Originally Posted by James Heinrich Well, lesson learned there. It took ~20 hours to import the database with indexes updating during import. To import the data (minus the autoincrement column) with no indexes it took... 35 minutes. I'm adding the primary key now, based on local testing that should take about an hour. So about 10x faster.
I feel bad now for not finding time to re-do the data dumps with random timestamps or random milliseconds (made up, if necessary) so the datetime itself is always unique.

