2020-11-12, 01:56   #56
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

2·3²·197 Posts

Quote:
 Originally Posted by Nick I don't use Ubuntu but in my Linux distribution you can press Escape during the boot dots graphic to get the old-fashioned boot messages on the console instead, which may be more helpful.
This works in Ubuntu as well. Subsequent ESC presses toggle between the graphic and the text listing.

2020-11-12, 02:17   #57
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1001011110011₂ Posts

Sign of life

After a long unsatisfying day of trying to revive my network after a bad overnight storm (it could be worse; the neighbor's new shed rolled nearly to their sun porch, uphill, and is looking very much the worse for its FOURTH excursion off its foundation), I was inspired by Ernst's update and tempted fate further with Hydra + Windows. Probably should have gone with evaluating on "Windows Pro for Workstation" instead of "Windows Pro" for more cores & cache supported. Time stamp on attachment is PST from before correcting to my time zone.

Elapsed time for the OS install, including rounding up media and an external USB DVD drive, was ~2 hours, including nudging it a bit now and then while mostly doing other things. Drive is partitioned so there is room left for a Linux install.

That's the good news. Now the other kind. Windows subsets the core count rather severely. If Windows for Workstation allows twice what Windows Pro does, that's still 56 out of 68. I assume mprime in WSL2 would not be able to bring any more cores to bear. In any event, it is impressive to see 112 logical processors represented in Resource Monitor at this point, especially given the low used hardware price.

Prime95 v30.3b6 generates this error in the application log, without ever making an appearance on the screen.
Code:
Faulting application name: prime95.exe, version: 30.3.1.0, time stamp: 0x5f5ae7c7
Faulting module name: libhwloc-15.dll, version: 0.0.0.0, time stamp: 0x5e820fc3
Exception code: 0xc0000005
Fault offset: 0x0000000000011cc8
Faulting process id: 0x30d0
Faulting application start time: 0x01d6b89b6e30c9ab
Faulting application path: C:\Users\ken\Documents\prime95\prime95.exe
Faulting module path: C:\Users\ken\Documents\prime95\libhwloc-15.dll
Report Id: e2331268-fb28-457e-adc7-288514d55ef5
Faulting package full name:
Faulting package-relative application ID:

Haven't tried mlucas or mfactor yet.

Last fiddled with by kriesel on 2020-11-12 at 03:04

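For anyone reading the fault record in post #57, the hex fields can be decoded with a few lines of Python. This is only a minimal sketch under stated assumptions: that the "Faulting application start time" is a Windows FILETIME (100 ns ticks since 1601-01-01 UTC), that the "time stamp" fields are PE link times in seconds since 1970, and that 0xc0000005 is the usual access-violation status. The function names are made up for illustration and are not part of prime95 or hwloc.

```python
from datetime import datetime, timedelta, timezone

# Only the one status code from the log is listed; anything else is reported as unknown.
KNOWN_EXCEPTIONS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}

# FILETIME counts 100 ns ticks from this instant (assumption: the field is a FILETIME).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(hex_value: str) -> datetime:
    """Convert a hex FILETIME string to a timezone-aware UTC datetime."""
    ticks = int(hex_value, 16)
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def pe_timestamp_to_utc(hex_value: str) -> datetime:
    """Convert a hex PE time stamp (seconds since 1970-01-01 UTC) to a datetime."""
    return datetime.fromtimestamp(int(hex_value, 16), tz=timezone.utc)


def describe_exception(hex_value: str) -> str:
    """Name the exception code if it is in the small table above."""
    code = int(hex_value, 16)
    return KNOWN_EXCEPTIONS.get(code, f"unknown (0x{code:08x})")


if __name__ == "__main__":
    print("exception:     ", describe_exception("0xc0000005"))
    print("app started:   ", filetime_to_utc("0x01d6b89b6e30c9ab"))
    print("prime95 built: ", pe_timestamp_to_utc("0x5f5ae7c7"))
    print("libhwloc built:", pe_timestamp_to_utc("0x5e820fc3"))
```

If those assumptions hold, the record describes an access violation inside libhwloc-15.dll, which would fit prime95 dying before it ever draws a window.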
2020-11-12, 03:25   #58
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3²×7²×11 Posts

CPUZ view of the Windows-flavor-subsetted hardware, and a benchmark on it.

Feature Detector output:
Code:
CPU Vendor String: GenuineIntel

CPU Vendor:
    AMD = No
    Intel = Yes

OS Features:
    64-bit = Yes
    OS AVX = Yes
    OS AVX512 = Yes

Hardware Features:
    MMX = Yes
    x64 = Yes
    ABM = Yes
    RDRAND = Yes
    RDSEED = Yes
    BMI1 = Yes
    BMI2 = Yes
    ADX = Yes
    MPX = No
    PREFETCHW = No
    PREFETCHWT1 = Yes
    RDPID = No
    GFNI = No
    VAES = No

SIMD: 128-bit
    SSE = Yes
    SSE2 = Yes
    SSE3 = Yes
    SSSE3 = Yes
    SSE4a = No
    SSE4.1 = Yes
    SSE4.2 = Yes
    AES-NI = Yes
    SHA = No

SIMD: 256-bit
    AVX = Yes
    XOP = No
    FMA3 = Yes
    FMA4 = No
    AVX2 = Yes

SIMD: 512-bit
    AVX512-F = Yes
    AVX512-CD = Yes
    AVX512-PF = Yes
    AVX512-ER = Yes
    AVX512-VL = No
    AVX512-BW = No
    AVX512-DQ = No
    AVX512-IFMA = No
    AVX512-VBMI = No
    AVX512-VPOPCNTDQ = No
    AVX512-4FMAPS = No
    AVX512-4VNNIW = No
    AVX512-VBMI2 = No
    AVX512-VPCLMUL = No
    AVX512-VNNI = No
    AVX512-BITALG = No
    AVX512-BF16 = No

Summary:
    Safe to use AVX: Yes
    Safe to use AVX512: Yes

Last fiddled with by kriesel on 2020-11-12 at 04:17

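To pick out which AVX-512 subsets the Feature Detector output in post #58 reports, a few lines of Python will do. This is a rough sketch, assuming the output is laid out one "Name = Yes/No" entry per line; `parse_features` and `SAMPLE` are illustrative names, and the sample holds only a handful of the lines from the post.

```python
def parse_features(text: str) -> dict[str, bool]:
    """Map each 'Name = Yes/No' line to a boolean; all other lines are ignored."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        value = value.strip()
        if sep and value in ("Yes", "No"):
            features[name.strip()] = value == "Yes"
    return features


# A few lines copied from the post; section headers carry no '=' and are skipped.
SAMPLE = """\
SIMD: 512-bit
    AVX512-F = Yes
    AVX512-CD = Yes
    AVX512-PF = Yes
    AVX512-ER = Yes
    AVX512-VL = No
    AVX512-BW = No
    AVX512-DQ = No
"""

features = parse_features(SAMPLE)
avx512_present = sorted(
    name for name, ok in features.items() if name.startswith("AVX512-") and ok
)
print(avx512_present)
```

On the sample this leaves only the F, CD, PF and ER subsets, matching the Yes entries in the post.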
2020-11-12, 22:53   #59
ewmayer
∂²ω=0

Sep 2002
República de California

2×3×1,931 Posts

Thanks to Nick et al for the boot-diagnostic tips - I may not have further time to play with the KNL until the weekend, will first try the above and glean whatever diagnostics I can, will also dig out the old let's-buy-GIMPS-a-KNL thread and see what OS was installed on that. It may simply turn out to be an issue of KNL needing one of some specific subset of server-class Linux distros.

@Ken: Did you buy your Hydra from the same eBay vendor you pointed me to?

Also note: " If Windows for Workstation allows twice what Windows Pro does, that's still 56 out of 68. I assume mprime in WSL2 would not be able to bring any more cores to bear. In any event, it is impressive to see 112 logical processors represented in Resource Monitor" -- IIRC the KNL is actually *four* logical processors for each physical one. If you're only seeing 2-per that is likely another limitation of your Windows version, but as you note w.r.to 56-of-68, codes which can actually make use of more than (say) 2 logical processors per physical one are likely rare.

2020-11-12, 23:55   #60
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

4851₁₀ Posts

Quote:
 Originally Posted by ewmayer Thanks to Nick et al for the boot-diagnostic tips - I may not have further time to play with the KNL until the weekend, will first try the above and glean whatever diagnostics I can, will also dig out the old let's-buy-GIMPS-a-KNL thread and see what OS was installed on that. It may simply turn out to be an issue of KNL needing one of some specific subset of server-class Linux distros. @Ken: Did you buy your Hydra from the same eBay vendor you pointed me to?
Yes, and first. Taking no chances.

Quote:
 Also note: " If Windows for Workstation allows twice what Windows Pro does, that's still 56 out of 68. I assume mprime in WSL2 would not be able to bring any more cores to bear. In any event, it is impressive to see 112 logical processors represented in Resource Monitor" -- IIRC the KNL is actually *four* logical processors for each physical one. If you're only seeing 2-per that is likely another limitation of your Windows version, but as you note w.r.to 56-of-68, codes which can actually make use of more than (say) 2 logical processors per physical one are likely rare.
Windows 10 Pro recognized 28 cores and 112 logical processors, so at the expected 4:1 ratio. See the Taskmgr screen shot that shows those numbers along with an unexpected claim of 2 sockets, at post 57, or the cpu-z left instance, bottom right portion at post 58. An install of Win 10 Pro for Workstations "MIGHT" double both those, and still fall short of the 68 actual cores, 272 logical that a 7250 implies, with license cost ~$300. Not sold on that bargain. Will try adding on WSL / Ubuntu, and bootable Ubuntu, as time allows.
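The core accounting in that paragraph can be checked with a short Python sketch. It assumes nothing beyond the numbers quoted in the posts (4 hardware threads per core, 68 cores on the 7250); the "doubled" row is the hypothetical Pro for Workstations case, not a measurement, and the function names are illustrative.

```python
THREADS_PER_CORE = 4   # KNL hardware threads per core, per the posts
TOTAL_CORES = 68       # physical cores on the Xeon Phi 7250, per the posts


def logical_processors(cores: int, threads_per_core: int = THREADS_PER_CORE) -> int:
    """Logical processor count for a given number of physical cores."""
    return cores * threads_per_core


def visible_fraction(visible_cores: int, total_cores: int = TOTAL_CORES) -> float:
    """Share of the chip's physical cores that the OS exposes."""
    return visible_cores / total_cores


cases = [
    ("Win 10 Pro as installed", 28),
    ("doubled (hypothetical)", 56),
    ("full Xeon Phi 7250", 68),
]
for label, cores in cases:
    print(
        f"{label}: {cores} cores, {logical_processors(cores)} logical, "
        f"{visible_fraction(cores):.0%} of the chip"
    )
```

Even the doubled case leaves 12 of the 68 cores unused, which is the point about the extra license cost.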

My network remains recalcitrant, and currently I have some of it isolated, so that the rest functions. I suspect a jabbering NIC or bad switch lurking among the quarantined few items. Something goes awry that causes DHCP to fail and remote desktop sessions to fail and icmp ping to fail on the wireless especially. The quarantined stuff is all wired.

2020-11-13, 01:35   #61
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1001011110011₂ Posts

Updating to Windows 10 Pro build 1909 paid off

Now sees 1 socket, 68 real cores, 272 logical cores, and all cache. And prime95 v30.3b6 successfully launches and recommends 17 4-core workers. Benchmarking is under way for 2M to 8M on selected worker counts. Based on the CPUZ benchmarking and comparisons to other cpus, it will be faster than an i7-4790 but not as fast as recent chips.

Last fiddled with by kriesel on 2020-11-13 at 01:54

2020-11-15, 11:44   #62
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3²×7²×11 Posts

Quote:
 Originally Posted by ewmayer but never got to the hoped-for login prompt, system is hung with black screen with mouse cursor frozen. No more time for this today, tips as to what to try tomorrow welcome.
WSL2 install was a bust. The cpu lacks some required virtualization support.

On Windows, I have gotten to where it will run prime95. The hyperthreading benchmarks in prime95 cover x2, x3, and x4 threads per core, which are almost always progressively slower.

Running higher prime95 benchmarking overnight, it has settled into a state after several hours where the screen is black except for a slowly responding white mouse cursor. My recollection is Windows was already configured not to put the system into standby or hibernation ever.
It is unresponsive at the console or by remote desktop, but responds to ping, and to the Numlock key slowly. This is close to what Ernst describes, but my system has not yet had Linux installed, and his has presumably not had Windows installed. Perhaps disabling hibernation in ACPI settings is needed, or a careful review of Windows power settings, or both.

My unit is wired... peculiarly. There are no connections to JF1 (which has pinouts for power switch, reset switch, power LED, HDD LED, NICs LEDs etc). The case-front power button has no effect.

The hardware seller's stock is now sold out.

Last fiddled with by kriesel on 2020-11-15 at 11:44

2020-11-15, 16:41   #63
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1001011110011₂ Posts

Wasn't expecting this in a benchmark run; never seen it before. Occurrence of "INF". First appearance at 18M and becomes more prominent as fft length increases. Some "15 second" benchmark timings take up to an hour on the Xeon Phi 7250 running prime95 v30.3b6.

In my unit's ACPI configuration in BIOS, there are only two lines, neither having any mention of or apparent relation to hibernation.

Last fiddled with by kriesel on 2020-11-15 at 16:42

2020-11-18, 20:09   #64
ewmayer
∂²ω=0

Sep 2002
República de California

2·3·1,931 Posts

Yesterday I finally completed a bunch of proof-of-principle coding/testing re. the p-1 algo which will debut in Mlucas v20, and had a little time to play again with the KNL I set up with Ubuntu 19.10. On boot, once it gets to the Ubuntu-load screen, hitting Esc has no effect; it simply continues on to PSOD (purple screen of death) as before. Again, my expectation was that the KNL would need some enterprise/server flavor of Linux, possibly also a custom Intel software stack. Did a bit of digging:

o From the Let's Buy GIMPS a KNL thread, post #43: "it comes preconfigured with all the tools and CentOS", said Linux distro being a free enterprise-oriented fork of Redhat, whose own enterprise distro, RHEL, requires a paid subscription.

o Intel's website, OTOH, describes the Intel Manycore Platform Software Stack:
Quote:
 Manycore Platform Software Stack is necessary to run the Intel® Xeon Phi Coprocessor. Users often call this stack "MPSS" for short. It is dependent on Linux kernels 2.6.34 or later, and it has been tested to work with specific versions of 64-bit Operating Systems:
 o Red Hat Enterprise 6.0, 6.1, 6.2, 6.3, 6.4 and 6.5 (for MPSS 3.2 and earlier releases); versions 6.3, 6.4, 6.5, 6.6 and 7.0 (MPSS 3.3 and 3.4), versions 6.4, 6.5, 6.6 and 7.1 (MPSS 3.5), versions 6.7, version 7.2 (MPSS 3.6 and 3.7) and version 7.3 (MPSS 3.8)
 o SuSE Linux Enterprise Server (SLES) 11 SP1 and SP2 (MPSS 2.1), SuSE Linux Enterprise Server (SLES) 11 SP2 and SP3 (MPSS 3.3), SuSE 11 SP2 and SP3, SuSE 12 (MPSS 3.4), SuSE 11 SP3 and SuSE 12 (MPSS 3.5), SuSE 11 SP4, SuSE 12 SP1 (MPSS 3.6 and 3.7), SuSE 12 SP2 (MPSS 3.8)
 o Microsoft* Windows 7 Enterprise SP1, Windows 8/8.1 Enterprise, Windows 10, Windows Server 2008 R2 SP1, Windows Server 2012 and Windows Server 2012 R2
 The readme files (referenced in the Download section) have more information on how to build and install the stack. The open source updates we have made are in support of the instruction set, the ABI, initializing and controlling an SMP on-a-chip, and the glue software to support the coprocessor communication with the host system.
 The changes in the Linux kernel are primarily for three reasons:
 o Numerous little changes to support the unique combination of an Intel® Pentium® processor core that also supports 64-bits including the Intel® Initial Many Core Instructions (Intel® IMCI).
 o Power management, which is a feature not associated with the original Pentium processors. Power management is much more important when you have up to 61 cores on a single die
 o The Intel® Many Integrated Core (MIC) check architecture, also a feature not present in the original Pentium processor designs.
 The Symmetric Communications InterFace (SCIF) is included in the RPM bundle. SCIF provides a mechanism for inter-node communications within a single platform. A node, for SCIF purposes, is defined as either a Intel® Xeon Phi Coprocessor or the Intel® Xeon® processor. In particular, SCIF abstracts the details of communicating over the PCI Express bus. The SCIF APIs are callable from both user space (uSCIF) and kernel-space (kSCIF).
Now if CentOS works as advertised, it should be drop-in-able in place of RHEL, right? And are all those Intel special tools likely must-haves, or nice-to-haves? (E.g. the SCIF sounds like something people with multiple Xeon processors might want or need, single-Xeon-ers like me less so.)

2020-11-18, 20:49   #65
Xyzzy

"Mike"
Aug 2002

1F0B₁₆ Posts

Quote:
 For developers, getting Red Hat Enterprise Linux is now easier than ever thanks to the availability of the no-cost Red Hat Developer Subscription.
https://developers.redhat.com/articl...sers-need-know

2020-11-18, 20:52   #66
Xyzzy

"Mike"
Aug 2002

3²×883 Posts

Quote:
 Originally Posted by ewmayer …Ubuntu 19.10 again - on boot, once it gets to the Ubuntu-load screen, hitting Esc has no effect, it simply continues on to PSOD (purple screen of death) as before.
Most likely you need to remove quiet from the kernel command line. We are not sure how to do this with Ubuntu but we know it is possible. Once that is done you can see the boot message log to determine where the failure is.
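On stock Ubuntu installs the default kernel parameters normally sit in the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub, and running `sudo update-grub` after editing that file regenerates the boot configuration. For a one-off test, pressing `e` at the GRUB menu and deleting `quiet splash` from the `linux` line does the same without changing anything on disk. The Python sketch below only shows the text edit itself, on a string rather than on the real file; `drop_quiet` is a hypothetical helper, and the sample file contents are an assumption about a typical install, not taken from the machine in this thread.

```python
import re


def drop_quiet(grub_defaults: str, unwanted=("quiet", "splash")) -> str:
    """Return the text with the unwanted words removed from GRUB_CMDLINE_LINUX_DEFAULT."""

    def clean(match: re.Match) -> str:
        # Keep every kernel parameter except the ones that hide boot messages.
        kept = [word for word in match.group(2).split() if word not in unwanted]
        joined = " ".join(kept)
        return f'{match.group(1)}"{joined}"'

    return re.sub(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=)"([^"]*)"',
        clean,
        grub_defaults,
        flags=re.MULTILINE,
    )


# Assumed contents of a typical /etc/default/grub; the real file may differ.
SAMPLE = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'

if __name__ == "__main__":
    print(drop_quiet(SAMPLE), end="")
```

Working on a copy of the file first and comparing it with the original is the cautious route, since a malformed /etc/default/grub can make the next `update-grub` fail.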

