mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Data > mersenne.ca

Reply
 
Thread Tools
Old 2016-04-21, 17:32   #78
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

13×277 Posts
Default

From what I can see the database is now 6.2 of an expected 7.6GB after about 9 hours.
James Heinrich is offline   Reply With Quote
Old 2016-04-21, 18:36   #79
Mark Rose
 
Mark Rose's Avatar
 
"/X\(‘-‘)/X\"
Jan 2013

3·11·89 Posts
Default

Quote:
Originally Posted by James Heinrich View Post
8 hours later it's still chugging along. Not surprising since it involves importing a 10GB (7.6GB data, 13.2GB index, packaged in a 10.5GB SQL file), 116-million-record SQL file. Unfortunately there's no easy way (I know of ?) to check on its progress.
Use pt-online-schema-change (and run it in screen or tmux) for table alters. It's part of the Percona toolkit. You get the added bonus of no downtime. In fact, I'm using it to alter some indexes as I type this on a table that's updated several times a second. The table is about 90 GB with about 323M rows. It reports regular process and says I have about 3 hours to go. It works with all variants of MySQL, not just the Percona flavour.

For imports, you could always pipe it through pv:

pv import.sql | mysql

Not entirely accurate, but easy. If you have file-per-table turned on, you can also watch the size of the table grow on disk. Lastly, you can always watch `show table status like 'table_name';` and compare the row count to `wc -l import.sql`.
Mark Rose is offline   Reply With Quote
Old 2016-04-21, 19:00   #80
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

61×167 Posts
Default

Quote:
Originally Posted by Mark Rose View Post
Not entirely accurate, but easy. If you have file-per-table turned on, you can also watch the size of the table grow on disk. Lastly, you can always watch `show table status like 'table_name';` and compare the row count to `wc -l import.sql`.
I recently encountered a similar, but entirely different, issue.

I wanted to have 8.5" by 14" (AKA Legal paper) perforated in a particular way, and then offset printed with a particular process ink. Pantone 804C, to be specific.

Kind of strange how difficult it is to have atoms processed.

I did finally find a shop who could do the job, but it was going to cost me USD $1,000 just to produce the stamp. And they couldn't print in the ink I needed.
chalsall is offline   Reply With Quote
Old 2016-04-21, 19:13   #81
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

13×277 Posts
Default

I'm just peeking at the filesize of the .MYD file in /var/lib/mysql
Currently 6,803,300,352 (of 8,163,918,264).

Unfortunately the process is slow because (as I understand it) DISABLE KEYS only works for non-unique indexes. And there's a big unique index getting thrashed around in there.

Perhaps pv would be useful next time, but I'm not likely to restart it now just for that
James Heinrich is offline   Reply With Quote
Old 2016-04-21, 19:25   #82
Mark Rose
 
Mark Rose's Avatar
 
"/X\(‘-‘)/X\"
Jan 2013

3·11·89 Posts
Default

Quote:
Originally Posted by James Heinrich View Post
I'm just peeking at the filesize of the .MYD file in /var/lib/mysql
Currently 6,803,300,352 (of 8,163,918,264).
MyISAM? It's been almost a decade since I used that :)

InnoDB is better in nearly every way since MySQL 5.5, and is the default table engine. It does require a minute or two of server configuration before blindly using it though.

Quote:
Unfortunately the process is slow because (as I understand it) DISABLE KEYS only works for non-unique indexes. And there's a big unique index getting thrashed around in there.
Yes. It has to throw an error before a duplicate row is added, and the fastest way to check that is having the unique index.

Quote:
Perhaps pv would be useful next time, but I'm not likely to restart it now just for that
:)
Mark Rose is offline   Reply With Quote
Old 2016-04-21, 19:33   #83
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

13×277 Posts
Default

I suppose next time I do something foolish like this I should import the datadump without indices, and add them once the data is imported, since I know there are no unique-index conflicts. Lessons learned :)
James Heinrich is offline   Reply With Quote
Old 2016-04-21, 20:25   #84
Mark Rose
 
Mark Rose's Avatar
 
"/X\(‘-‘)/X\"
Jan 2013

3×11×89 Posts
Default

Quote:
Originally Posted by James Heinrich View Post
I suppose next time I do something foolish like this I should import the datadump without indices, and add them once the data is imported, since I know there are no unique-index conflicts. Lessons learned :)
With MyISAM you shouldn't even need a primary key for the import, so include that in the unique-index list.

(InnoDB will create an internal one if you don't provide one.)
Mark Rose is offline   Reply With Quote
Old 2016-04-22, 02:57   #85
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

360110 Posts
Default

7,931,805,696 of 8,163,918,264 after ~19 hours

I'm almost tempted to re-try the import once this is done as just a data dump with no index at all, see how fast that goes by comparison. And then just add a single primary key that covers my unique criteria. I don't need a separate autoincrement primary key field, and I found I had a couple indexes on there that don't really need to be there, so if that works it would be considerably smaller: data about 5% smaller dropping the autoincrement field, but index only about a third the size. Testing locally it only took 53 minutes to add the primary key on my fields.
James Heinrich is offline   Reply With Quote
Old 2016-04-22, 06:41   #86
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

13·277 Posts
Default

Well, lesson learned there. It took ~20 hours to import the database with indexes updating during import. To import the data (minus the autoincrement column) with no indexes it took... 35 minutes. I'm adding the primary key now, based on local testing that should take about an hour. So about 10x faster.
James Heinrich is offline   Reply With Quote
Old 2016-04-22, 16:27   #87
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

13×277 Posts
Default

So, finally, up and running with the second imported version of the database.
Indexing took a bunch longer on my server (~5 hours) than it did locally (50 mins) -- CPU difference (i7-3930K vs Atom C2750), I'm sure -- but still far far far quicker than importing while updating 2 unique indexes (~22 hours).

However, looking at how some of my other queries reference this table I'm also going to need to add a non-unique index on date_received. That'll take a couple hours I'm sure, but the site should be usable in the meantime.

Last fiddled with by James Heinrich on 2016-04-22 at 16:29
James Heinrich is offline   Reply With Quote
Old 2016-04-22, 17:04   #88
Madpoo
Serpentine Vermin Jar
 
Madpoo's Avatar
 
Jul 2014

332410 Posts
Default

Quote:
Originally Posted by James Heinrich View Post
Well, lesson learned there. It took ~20 hours to import the database with indexes updating during import. To import the data (minus the autoincrement column) with no indexes it took... 35 minutes. I'm adding the primary key now, based on local testing that should take about an hour. So about 10x faster.
I feel bad now for not finding time to re-do the data dumps with (made up, if necessary) random timestamps or random milliseconds to make the datetime itself unique, always.
Madpoo is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
The Factoring Report at mersenne.ca Gordon mersenne.ca 21 2019-01-21 02:38
Exponent status report Dubslow PrimeNet 2 2015-10-05 05:21
Mersenne.ca Status Report Gordon mersenne.ca 1 2015-09-22 10:53
Delayed status report (split from main reservation thread) rogue Sierpinski/Riesel Base 5 8 2006-03-04 13:59
V5 status report? PrimeCruncher PrimeNet 11 2005-10-09 18:53

All times are UTC. The time now is 05:19.


Sun Jan 23 05:19:21 UTC 2022 up 183 days, 23:48, 0 users, load averages: 1.24, 1.78, 1.65

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔