mersenneforum.org  

Old 2020-11-18, 23:48   #67
ewmayer
2ω=0
 
Sep 2002
República de California

2×5,813 Posts

I prefer not having to register online for anything if it can be avoided. Created a boot disk using the CentOS iso file (CentOS-8.2.2004-x86_64-minimal.iso, downloaded from one of the official mirrors), plugged it into the KNL and booted from it using the "install" option, got through a bunch of steps with a green "[ OK ]" in the terminal output, then hit a bunch of repeated warnings of the form

[time-since-boot] dracut-initqueue[3265]: Warning: dracut-initqueue timeout - starting timeout scripts

After a few minutes of that repeating, got
Code:
       Starting Setup Virtual Console...
[ OK ] Started Setup Virtual Console.
       Starting Dracut Emergency Shell...
Warning: /dev/root does not exist

Generating "/run/initramfs/rdsosreport.txt"
Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot after mounting them and attach it to a bug report.

dracut:/#
The journalctl command emulates Linux "more", which is unavailable in this emergency-mode shell environment. Some highlights - note this appears in fact to be a 68-physical-core (4 logical CPUs per physical one, thus 272) system like kriesel's, not the 64-core one I thought I was buying (not that I'm complaining, mind you :) - with a few annotations by me in []:
Code:
smpboot: Allowing 272 CPUs, 0 hotplug CPUs
...
Booting paravirtualized kernel on bare hardware.
...
Kernel command line: BOOT_IMAGE=vmlinux [stuff about the CentOS iso] quiet
Specific versions of hardware are certified with Red Hat Enterprise Linux 8. Please see the list of hardware are certified with Red Hat Ent[line cuts off]
...
x86: Booting SMP configuration:
...
smp: brought up 1 node, 272 CPUs
...
ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
ACPI: bus type PCI registered
...
ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
[bunch of pci-init stuff]
SCSI subsystem initialized
...
can't derive routing for PCI INT D
PCI INT D: not connected
...
New USB device found [stuff re. Linux boot image]
...
can't derive routing for PCI INT D
PCI INT D: no GSI
...
[stuff about loading CentOS kernel signing key]
...
[this is in bright red font]usb 3-8: device descriptor read/64, error -110
...
igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
...
sd 3:0:0:0: [sda] Attached SCSI disk [preceding line confirms 1TB, i.e. the SSD I bought for this build]
Then we get the repeating "dracut" timeout warnings I mentioned above, ending in "Warning: Could not boot." and the Emergency-shell stuff.
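
For anyone wanting to double-check the 68-core / 272-logical-CPU reading once the box boots into a full system, here is a minimal sketch in POSIX shell. It assumes the usual Linux /proc/cpuinfo layout (the "processor", "physical id" and "core id" fields). The function name cpu_summary is made up for illustration, and awk is unlikely to be available inside the dracut emergency shell itself.

```shell
# Summarize CPU topology from /proc/cpuinfo-style text read on stdin.
# Prints two numbers: "<logical CPUs> <physical cores>".
# cpu_summary is a hypothetical helper name, not an existing tool.
cpu_summary() {
    awk -F': *' '
        /^processor[ \t]*:/   { logical++ }             # one entry per logical CPU
        /^physical id[ \t]*:/ { pkg = $2 }              # socket of the current entry
        /^core id[ \t]*:/     { cores[pkg "," $2] = 1 } # unique (socket, core) pairs
        END {
            n = 0
            for (k in cores) n++
            print logical + 0, n
        }'
}

# Usage on a running Linux system:
#   cpu_summary < /proc/cpuinfo
```

If the topology is as described above, the two numbers should come out as 272 and 68, but that has not been verified on this hardware.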
Old 2020-11-19, 02:57   #68
paulunderwood
 
Sep 2002
Database er0rr

7051₈ Posts

This person claims to have "fixed the problem"

https://forums.centos.org/viewtopic.php?t=63043

by running dracut -f in rescue(?) mode. He might have meant emergency mode:

https://linux.die.net/man/8/dracut

Another solution seems to be to enter "exit" at the dracut prompt. This might be the safer option to try first.
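
If it does come to regenerating the initramfs as in the linked CentOS thread, here is a hedged sketch of what that step might look like on an already-installed CentOS system (it probably does not apply to the installer's own emergency shell). The -f (force overwrite) flag and the image / kernel-version arguments are from the dracut man page linked above. The /boot/initramfs-<version>.img path is the usual CentOS location and is an assumption here, as are the function name and the DRY_RUN switch, which exist only so the command can be inspected before running it as root.

```shell
# Rebuild the initramfs for one kernel version with dracut.
# rebuild_initramfs and DRY_RUN are hypothetical names for this sketch.
# With DRY_RUN=1 (the default here) the command is only printed, not run.
rebuild_initramfs() {
    kver=${1:-$(uname -r)}              # default: the running kernel
    img="/boot/initramfs-${kver}.img"   # assumed CentOS image path
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "dracut -f $img $kver"
    else
        dracut -f "$img" "$kver"        # needs root; overwrites the existing image
    fi
}

# Usage:
#   rebuild_initramfs                   # print the command for the running kernel
#   DRY_RUN=0 rebuild_initramfs         # actually rebuild (as root)
```

Printing first is a deliberate choice: overwriting the only initramfs on a machine that already has boot trouble is worth a second look.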

Last fiddled with by paulunderwood on 2020-11-19 at 03:22
Old 2020-11-19, 03:57   #69
ewmayer
2ω=0
 
Sep 2002
República de California

2·5,813 Posts

@paul: Thanks for digging that out. Interestingly, it proved unnecessary - I powered the system back up with the boot USB plugged in, waited the several minutes this system needs to run all its BIOS stuff, then hit <f11> at the SuperMicro boot screen. Now for the new part - the first time I did the above, the boot menu listed 3 items corresponding to the boot USB: 'boot general' and under that 'boot partition 1' and 'boot partition 2'. I recalled that after creating the boot USB, dd copied the iso-image to partition 1, but on try #1 I just hit 'boot general'. This time I selected 'boot partition 1' and everything worked, root/user-info all entered and it's copying files and configuring the kernel as I write this. Fingers crossed, time to get dinner and go offline for the evening. Update tomorrow.
Old 2020-11-19, 06:47   #70
axn
 
Jun 2003

11471₈ Posts

Quote:
Originally Posted by Xyzzy
Most likely you need to remove quiet from the kernel command line. We are not sure how to do this with Ubuntu but we know it is possible. Once that is done you can see the boot message log to determine where the failure is.
https://askubuntu.com/questions/4778...n-quiet-splash

EDIT:- Obviously you need to boot first to do this :-(
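
A rough sketch of the edit itself, for the case where the disk can be mounted or the system booted some other way. It assumes the GRUB defaults file has the usual GRUB_CMDLINE_LINUX="..." or GRUB_CMDLINE_LINUX_DEFAULT="..." line with a double-quoted value and nothing after the closing quote. strip_quiet is a hypothetical helper name. It only filters text, so nothing changes until the output is written back and the GRUB config is regenerated (grub2-mkconfig on CentOS, update-grub on Ubuntu).

```shell
# Remove "quiet", "splash" and "rhgb" from GRUB_CMDLINE_LINUX* lines.
# Reads /etc/default/grub-style text on stdin, writes the result to stdout.
# strip_quiet is a made-up name for this sketch.
strip_quiet() {
    awk '
        /^GRUB_CMDLINE_LINUX(_DEFAULT)?="/ {
            eq  = index($0, "=")
            key = substr($0, 1, eq)                        # e.g. GRUB_CMDLINE_LINUX=
            val = substr($0, eq + 2, length($0) - eq - 2)  # text between the quotes
            n   = split(val, w, " ")
            out = ""
            for (i = 1; i <= n; i++)
                if (w[i] != "quiet" && w[i] != "splash" && w[i] != "rhgb")
                    out = out (out == "" ? "" : " ") w[i]
            print key "\"" out "\""
            next
        }
        { print }'                                         # all other lines unchanged
}

# Usage (review the result before copying it back):
#   strip_quiet < /etc/default/grub > /tmp/grub.new
```

Filtering to a separate file rather than editing in place keeps the original around in case the new command line makes things worse.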

Last fiddled with by axn on 2020-11-19 at 06:48
Old 2020-11-19, 11:44   #71
xilman
Bamboozled!
 
"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

2³·11³ Posts

Quote:
Originally Posted by axn
https://askubuntu.com/questions/4778...n-quiet-splash

EDIT:- Obviously you need to boot first to do this :-(
Yes, but not necessarily the system you wish to boot.

Extract the boot disk and plug into another Linux box which boots properly. Mount the added disk somewhere, edit to your heart's content, then do a clean shutdown, followed by re-inserting the disk into its own machine.

Done this several times when systems won't boot from their own disk and I don't have a convenient bootable CD/DVD/memstick to hand.
Old 2020-11-19, 14:24   #72
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7·719 Posts

Quote:
Originally Posted by ewmayer
waited the several minutes this system needs to run all its BIOS stuff, then hit <f11> at the SuperMicro boot screen.
Some observations here:
1) Sometimes mine seems to get stuck during the BIOS initialization, requiring a power cycle to try again.
2) The sequence is interminable but the time window for F11, F12, or DEL to select options from the white SuperMicro boot screen is brief. It would be nice to be able to shorten the one or lengthen the other.
3) Haven't experimented with BIOS settings to possibly skip / disable some portions of the initialization.
4) The BIOS seems to support a commercial-size-kitchen-sink set of approaches. Disabling the unused ones might provide a considerable startup speedup, if possible, by eliminating timeout periods for things that ain't gonna happen (IPMI IP# issuance for example).
5) Jumper changes are another possibility. BMC disable.
6) One more way my system is wired oddly; documentation for the motherboard indicates the two adjacent RJ45 jacks are regular LAN ports, but if the one nearer the USB (#7 in fig 5-2 of the manual found online) is connected, DHCP fills in an IPMI IP# (remote console via IP), instead of providing LAN connectivity.

Some good news is checking prime.log and the worker windows of prime95 shows no sign of errors detected, in the 17 workers' 58.3M-59.1M LL DC progress, to 31-35% each so far and a few Jacobi checks each. These should all complete by about month's end.

Last fiddled with by kriesel on 2020-11-19 at 14:26
Old 2020-11-19, 20:56   #73
ewmayer
2ω=0
 
Sep 2002
República de California

2×5,813 Posts

Install last night finished successfully and after reboot from the installation-on-SSD we got a login prompt, but in basic-terminal mode ... must've overlooked whatever option is needed to install the windowing system in the initial-install screens. So did 'shutdown -h now', just now replugged in the boot USB ... and the system won't power up. Note, the front-panel power button on this is mis-connected, if it's connected at all - have always simply needed to unplug/replug the power cord in the back to turn it off/on, until successful CentOS install last night at least gave us the 'shutdown -h now' option for the 'off' part. Grrr ... no time right now to poke around in the damn case and wiretrace and whatnot. I was hoping to be building and testing code on this sucker by now ... annoying as hell.

Last fiddled with by ewmayer on 2020-11-19 at 20:57
Old 2020-11-19, 21:48   #74
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1001110101001₂ Posts

Quote:
Originally Posted by ewmayer
... and the system won't power up. Note, the front-panel power button on this is mis-connected, if it's connected at all - have always simply needed to unplug/replug the power cord in the back to turn it off/on, until successful CentOS install last night at least gave us the 'shutdown -h now' option for the 'off' part. Grrr ... no time right now to poke around in the damn case and wiretrace and whatnot. I was hoping to be building and testing code on this sucker by now ... annoying as hell.
Unplug, open case, check JF1 by the main motherboard 24-pin power connector. I had the same won't-start issue after a Windows shutdown -s. But -r (restart) was not typically a problem. It certainly created a sinking feeling. And after that the box was completely unresponsive including to power strip cycling; no POST, no signal at the VGA, most onboard LEDs stayed off, radiator fans did not spin up to mini-propjet-takeoff-sound-level initially, as they previously had. After a while I managed to recover mine by putting a separate power button on JF1 temporarily. Then reconnect power, push the added button, go to BIOS settings, change Power On behavior, from Last State, to Always Start, to reduce the occurrence of repeats.
It's a catch-22: the system won't restart when power is applied because its last state was off, because that's what the user told it to be through the OS; can't turn it on because the case power switch is not connected; can't change the BIOS because it won't turn on, because...
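
The stuck case boils down to a two-input decision, sketched below as a small shell function. This is only a model of the behaviour described in this post, not anything read from the BIOS: the policy names last-state and always-start mirror the wording above rather than the exact menu labels, and state_after_ac_restore is a made-up name.

```shell
# Model of the BIOS "Power On behavior" setting discussed above.
#   $1 = policy: "last-state" or "always-start" (names taken from the post,
#        not necessarily the exact BIOS labels)
#   $2 = power state before AC was removed: "on" or "off"
# Prints what the board does when AC power returns: "on" or "off".
state_after_ac_restore() {
    case "$1" in
        always-start) echo on ;;     # powers up whatever happened before
        last-state)   echo "$2" ;;   # simply resumes the previous state
        *)            echo "unknown policy: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   state_after_ac_restore last-state off     # the stuck case after 'shutdown -h now'
#   state_after_ac_restore always-start off
```

With last-state and a previous state of off it prints off, which is the situation where re-plugging the cord does nothing; always-start prints on regardless. If the BMC is reachable over the network, ipmitool chassis policy always-on is the IPMI-side way to set a similar policy, though that is untested on this board.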

This case has a sort of "secret compartment". One side panel removed gives a view of cpu, pcie, motherboard-top-side etc and 3.5" drive mountings. The opposite side removed gives 2.5" drive mountings and a view of some cabling. In mine I found an unconnected end of the power sw, reset sw, power LED, drive LED cable in that "secret compartment". So you might be able to activate the front power button by fishing that out and connecting "Power sw" at JF1 pins 1 & 2, if you don't have a spare switches/LEDs cable assembly in your parts drawer.

This thing is like a sports car (or some partners I've known). Expect some cost/pain as the price of the interesting or fun times.
I suspect these were set up to run as part of a server farm, in rows of warehouse style welded-wire racks, with Unit ID LEDs and remote management enabled, local intervention disabled.

Congrats on getting over the OS install hurdles thus far, and thanks for confirming the hardware obstinacy as shipped. I'm wondering how minimal is CentOS -minimal, at 1.6GB. GUI included? The other iso is 7.7GB, too big for a DVD.

FYI prime95 benchmark results on Win10 can be found at https://www.mersenneforum.org/showth...304#post563304. A large screen or a magnifier or some serious zoom may be useful.

Last fiddled with by kriesel on 2020-11-19 at 22:21
Old 2020-11-20, 20:13   #75
ewmayer
2ω=0
 
Sep 2002
República de California

2D6A₁₆ Posts

@above: Thanks, ken, you're a lifesaver! (Or at least a major time saver).

OK, removed the sidepanel under bottom side of mobo, found the dangling power+led connector bundle. The 2-pin power one has no color coding, all black, no +- polarity marking, just "Power SW" label on one side. I found a downloadable mobo PDF at the Manualslib site I linked to in html-manual form previously, and have attached a snip of the mobo section in question. Say I'm looking at the mobo at the same orientation - big 24-pin main power plug at bottom, smaller 12-pin jumper array above it, 2 power pins at far right end of the latter. Do I want to hook the 2-pin power-button connector to the latter pair of pins so that the "Power SW" label faces leftward or rightward?
Attached Files
File Type: pdf 5038ki_pwrconnects.pdf (215.7 KB, 32 views)
File Type: pdf 5038ki_ctrlpanel.pdf (80.5 KB, 31 views)

Last fiddled with by ewmayer on 2020-11-21 at 03:23
Old 2020-11-20, 21:35   #76
Nick
 
Dec 2012
The Netherlands

3175₈ Posts

Is there no CLR CMOS jumper on the mobo any more these days?
Old 2020-11-20, 21:38   #77
PhilF
 
Feb 2005
Colorado

13·47 Posts

Quote:
Originally Posted by ewmayer
Do I want to hook the 2-pin power-button connector to the latter pair of pins so that the "Power SW" label faces leftward or rightward?
For the power switch the polarity doesn't matter.

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.