mersenneforum.org > Great Internet Mersenne Prime Search > PrimeNet > GPU to 72
Old 2018-01-21, 07:28   #4126
retina
Undefined
 
"The unspeakable one"
Jun 2006
My evil lair

3²·7²·13 Posts

Quote:
Originally Posted by James Heinrich View Post
Quote:
In order to keep our services ... reliable ... Your server may be unavailable for 6 to 8 hours while the maintenance is taking place.
Haha, like Alice in Wonderland. To keep it reliable we have to make it unreliable.

Last fiddled with by retina on 2018-01-21 at 07:28
Old 2018-01-21, 07:57   #4127
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

7221₁₀ Posts

Quote:
Originally Posted by retina View Post
Haha, like Alice in Wonderland. To keep it reliable we have to make it unreliable.
Reliable != available. Reliable is much closer in meaning to "predictable" in this case, and this is far more predictable than not performing the maintenance.
Old 2018-02-17, 20:16   #4128
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

41·107 Posts

Mersenne work distribution has about 5,000 assignments in the 4xM ranges.
GPU72 has about 9,500.
Old 2018-04-13, 18:23   #4129
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

23·379 Posts

Quote:
Originally Posted by chalsall View Post
So, one of the drives on the GPU72 server is failing.
It is scheduled to be replaced in the next hour or so. It's "hot-swappable", and one of the drives in a RAID1 set, so there should be no downtime.
And now mersenne.ca has the same problem.
Code:
A Fail event had been detected on md device /dev/md3.
It could be related to component device /dev/sdb3.

A Fail event had been detected on md device /dev/md1.
It could be related to component device /dev/sdb1.
Likewise, the drive is supposed to be hot-swapped within the next few hours. No surprise, the drives in my server are also Seagate. And yet a quick SMART check passed on both drives:
Code:
=== START OF INFORMATION SECTION ===
Model Family:     Seagate Constellation CS
Device Model:     ST1000NC000-1CX162
Serial Number:    Z1D7MCPW
LU WWN Device Id: 5 000c50 05d012139
Firmware Version: CE02
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Fri Apr 13 14:20:25 2018 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED




=== START OF INFORMATION SECTION ===
Model Family:     Seagate Constellation CS
Device Model:     ST1000NC000-1CX162
Serial Number:    Z1DAZ0ZY
LU WWN Device Id: 5 000c50 0669d8a73
Firmware Version: CE02
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Fri Apr 13 14:19:28 2018 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
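
For anyone curious, a rough sketch of the kind of commands behind the above (the device names are just this box's; adjust for your own layout):
Code:
# See which arrays are degraded and which member got kicked out:
cat /proc/mdstat
mdadm --detail /dev/md1
mdadm --detail /dev/md3

# Quick health verdict plus the full attribute dump for the suspect drive:
smartctl -H /dev/sdb
smartctl -a /dev/sdb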
Old 2018-04-13, 19:13   #4130
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2×4,657 Posts

Quote:
Originally Posted by James Heinrich View Post
And now mersenne.ca has the same problem.
Code:
A Fail event had been detected on md device /dev/md3.
It could be related to component device /dev/sdb3.

A Fail event had been detected on md device /dev/md1.
It could be related to component device /dev/sdb1.
VERY sorry to hear that. The good news is that both failures were reported against the same physical drive in the pair (/dev/sdb), so you hopefully won't need to do a full restore from backups.

Quote:
Originally Posted by James Heinrich View Post
Likewise, the drive is supposed to be hot-swapped within the next few hours. No surprise, the drives in my server are also Seagate. And yet a quick SMART check passed on both drives
You will still have to reboot your server to bring the hot-swapped drive back into the "correct" logical location. After the hot-swap the new drive will likely appear as /dev/sdc; you can then work with it (setting up the partition table, etc).

One last thing... Don't trust the SMART short test; always run the long test periodically.

Oh, also... My previously documented experience taught me that minimizing CPU usage and HD writes helped a great deal with the RAID rebuild time.
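
In case it helps, a minimal sketch of that sequence with mdadm and smartctl, assuming the surviving drive is /dev/sda and the replacement shows up as /dev/sdc (your device names and partition layout will differ):
Code:
# Copy the partition table from the surviving drive to the new one
# (MBR shown; for GPT use: sgdisk -R /dev/sdc /dev/sda && sgdisk -G /dev/sdc):
sfdisk -d /dev/sda | sfdisk /dev/sdc

# Add the new partitions back into the degraded arrays and watch the rebuild:
mdadm --manage /dev/md1 --add /dev/sdc1
mdadm --manage /dev/md3 --add /dev/sdc3
watch cat /proc/mdstat

# Once the rebuild finishes, kick off the long self-test and check the log later:
smartctl -t long /dev/sdc
smartctl -l selftest /dev/sdc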
Old 2018-04-24, 20:49   #4131
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

3²·11·101 Posts
Just received accessing GPU72

FYI only, Chris. It resolved within a minute.


Error 1001 Ray ID: 410b6b51e2235534 • 2018-04-24 20:46:25 UTC

DNS resolution error


What happened?

You've requested a page on a website (www.gpu72.com) that is on the Cloudflare network. Cloudflare is currently unable to resolve your requested domain (www.gpu72.com). There are two potential causes of this:
  • Most likely: if the owner just signed up for Cloudflare it can take a few minutes for the website's information to be distributed to our global network.
  • Less likely: something is wrong with this site's configuration. Usually this happens when accounts have been signed up with a partner organization (e.g., a hosting provider) and the provider's DNS fails.
Old 2018-04-24, 21:00   #4132
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2×4,657 Posts

Quote:
Originally Posted by kladner View Post
You've requested a page on a website (www.gpu72.com) that is on the Cloudflare network.
Hmmm... Something's odd there. GPU72.com is NOT on Cloudflare; DNS should resolve to 74.208.74.21, which is a 1&1 dedicated server.

Possibly someone is trying a DNS poisoning attack (although I can't imagine why).

Is anyone else seeing this? And please ensure your browser doesn't give any SSL Cert warnings when you try to log in.
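
If anyone wants to check from their end, something along these lines will show what your resolver returns versus a couple of public ones, and whose certificate the server is actually presenting (just a sketch; the resolver addresses are only examples):
Code:
# Compare your resolver's answer against Google and Cloudflare public DNS:
dig +short www.gpu72.com
dig +short www.gpu72.com @8.8.8.8
dig +short www.gpu72.com @1.1.1.1

# Inspect the certificate being served on port 443:
openssl s_client -connect www.gpu72.com:443 -servername www.gpu72.com </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates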
Old 2018-04-24, 21:40   #4133
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

7×409 Posts

There was a BGP attack against Route53 this morning.

https://www.theregister.co.uk/2018/0...et_dns_hijack/

It's possible related shenanigans are afoot.
Old 2018-05-18, 05:57   #4134
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

2⁸·3² Posts

Are we short on workers doing trial factoring at the LL wavefront?

I've noticed that a lot of exponents are being assigned for P-1 just days after being factored to 76 bits. I also haven't seen that many TF assignments at the LL wavefront turned in lately.
Old 2018-05-18, 12:48   #4135
VictordeHolland
 
"Victor de Hollander"
Aug 2011
the Netherlands

2³×3×7² Posts

Quote:
Originally Posted by ixfd64 View Post
Are we short on workers doing trial factoring at the LL wavefront?
There are 50,000 LL assignments available in the 80-90M range that have been TF'ed to 75 or 76 bits, so no need to worry.
Old 2018-05-20, 15:50   #4136
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2462₁₆ Posts

Quote:
Originally Posted by ixfd64 View Post
Are we short on workers doing trial factoring at the LL wavefront? I've noticed that a lot of exponents are being assigned for P-1 just days after being factored to 76 bits. I also haven't seen that many TF assignments at the LL wavefront turned in lately.
Sorry for the latency in the reply; been busy doing a CFD analysis.

Yes, we are a bit low on LLTF resources at the moment. One of our biggest contributors has dropped his throughput by about 70% because he's had to reallocate some of his resources.

We should still be OK in that we're far ahead of the LL'ing wavefront. But we might have to start giving the P-1'ers work "only" TF'ed to 75, and then take them up to 76 later, before letting the LL'ers at them.

But if anyone has any resources they could allocate to LLTF'ing (preferably to at least 75 bits) it would be appreciated.
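
For anyone new to this, an LLTF assignment is just a Factor= line in your worktodo file. A sketch of what one taken from 75 to 76 bits might look like (the exponent here is only a placeholder; real assignments and their keys come from GPU72 or PrimeNet):
Code:
# Hypothetical mfaktc worktodo entry: assignment key, exponent, starting bit level, target bit level.
echo "Factor=N/A,82589933,75,76" >> worktodo.txt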