mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Hardware (https://www.mersenneforum.org/forumdisplay.php?f=9)
-   -   Knights Landing reservations (https://www.mersenneforum.org/showthread.php?t=20152)

ewmayer 2016-08-30 03:35

[QUOTE=airsquirrels;441033]Well in that case, has anyone looked at HJWASM as a viable option? MASM compatible with support for AVX512-F....[/QUOTE]

I use GCC-syntax inline assembly in my work, so am unable to offer any information on this subject.

So far, we have these folks in the source for our KL crowd:

airsquirrels (david)
ewmayer (ernst)
ATH (andreas)
Madpoo (aaron)

David, could you clarify what you meant by "getting one on the R&D budget"? You mean your personal funds, or some corporate entity? (The distinction may matter for purposes of academic-style multiuser licensing.)

airsquirrels 2016-08-30 12:22

[QUOTE=ewmayer;441040]I use GCC-syntax inline assembly in my work, so am unable to offer any information on this subject.

So far, we have these folks in the source for our KL crowd:

airsquirrels (david)
ewmayer
ATH

David, could you clarify what you meant by "getting one on the R&D budget"? You mean your personal funds, or some corporate entity? (The distinction may matter for purposes of academic-style multiuser licensing.)[/QUOTE]

I was originally indicating that I could possibly get an entire hardware unit under my business for R&D, however if we are doing a pool I don't mind contributing personally.

I would think we could get the license for the software under academic even if the hardware is donated by the business, however I am not a lawyer.

Madpoo 2016-08-30 15:18

[QUOTE=airsquirrels;441060]I was originally indicating that I could possibly get an entire hardware unit under my business for R&D, however if we are doing a pool I don't mind contributing personally.

I would think we could get the license for the software under academic even if the hardware is donated by the business, however I am not a lawyer.[/QUOTE]

I can contribute the hardware purchase as well.

For licensing, I'd imagine there's no problem separating the licensing part from the hardware itself... plenty of researchers run software under academic licensing on hardware that could be corporately owned (think of any of the cloud services...those are most definitely "for profit" hardware).

Essentially this would be a "cloud" device limited to certain folks who may be using software licensed under some other plan (academic, whatever).

airsquirrels 2016-08-30 23:49

I have started working on getting gwnum to assemble with HJWASM. I was able to get cpuidhlp.obj going without too much issue, which was a significant advancement over my attempts to port that code to NASM.

So far I only seem to have one big problem, and that's with the AVX instructions. George has all the XMM/YMM etc. values as QWORD PTR in extrn.mac, which MASM must be happy to just treat as memory pointers to the correct type regardless of whether, e.g., subsd (m64 operand) or subpd (m128 operand) is using them. HJWASM seems to want XMMWORD (OWORD), YWORD, ZWORD etc. explicitly in the PTR. Unfortunately George is quite clever and frequently reuses XMM_ variables as m64 when he only needs the bottom double, so I can't just change the type of XPTR. I will ask around on the HJWASM forums and see if there is a compatibility flag that will ease this issue; otherwise I have a mess of macros to update...
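
A rough way to size that macro cleanup is to classify each use of an XMM_ variable by the memory width its instruction needs. The Python sketch below is only an illustration of that idea and is not part of gwnum or HJWASM: the mnemonic sets are small illustrative subsets, the line pattern is a simplifying assumption about how the source is laid out, and the variable names in the sample are hypothetical.

```python
import re

# Illustrative subsets only (assumption): scalar-double SSE2 instructions
# read 64 bits from memory, packed-double instructions read all 128 bits.
SCALAR_M64 = {"movsd", "addsd", "subsd", "mulsd", "divsd"}
PACKED_M128 = {"movapd", "addpd", "subpd", "mulpd", "divpd"}

# Simplified pattern (assumption): a leading mnemonic, then operands that
# mention a variable whose name starts with XMM_.
LINE_RE = re.compile(r"^\s*(\w+)\s+.*\b(XMM_\w+)", re.IGNORECASE)


def required_ptr_type(line):
    """Return (variable, pointer type) a strictly typed assembler would want,
    or None if the line is not a recognized use of an XMM_ variable."""
    match = LINE_RE.match(line)
    if not match:
        return None
    mnemonic, variable = match.group(1).lower(), match.group(2)
    if mnemonic in SCALAR_M64:
        return variable, "QWORD PTR"
    if mnemonic in PACKED_M128:
        return variable, "XMMWORD PTR"
    return None


def count_by_type(lines):
    """Count how many lines need each pointer type."""
    counts = {}
    for line in lines:
        hit = required_ptr_type(line)
        if hit:
            counts[hit[1]] = counts.get(hit[1], 0) + 1
    return counts


# Hypothetical sample lines; XMM_TWO is used both as m64 and as m128,
# which is exactly the reuse that blocks a one-line type change.
sample = [
    "    subsd xmm0, XMM_TWO",
    "    subpd xmm1, XMM_TWO",
    "    mulpd xmm2, XMM_HALF",
    "    ret",
]
print(count_by_type(sample))
```

Because the same variable shows up under both widths, the type has to be chosen per use rather than per declaration, which is why changing XPTR alone does not work.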

ewmayer 2016-08-31 00:56

[QUOTE=airsquirrels;441119]I have started working on getting gwnum to assemble with HJWASM. I was able to get cpuidhlp.obj going without too much issue, which was a significant advancement over my attempts to port that code to NASM.

So far I only seem to have one big problem and that's with the AVX instructions.[/QUOTE]

I presume your work is aimed at getting a version of the Prime95 source buildable-by-ICC and thus amenable to tuning for KL using Intel's toolsuite, is that right?

Because Aaron noted one user is already turning in results from Prime95 (or mprime?) running on a KL - would that imply current AVX/AVX2 binaries will run on KL without modification?

p.s.: We need one more shared-system pool signer-upper to get the per-person cost under $1000 ... do we have a 5th hardy pioneer in the audience?

airsquirrels 2016-08-31 01:12

[QUOTE=ewmayer;441122]I presume your work is aimed at getting a version of the Prime95 source buildable-by-ICC and thus amenable to tuning for KL using Intel's toolsuite, is that right?

Because Aaron noted one user is already turning in results from Prime95 (or mprime?) running on a KL - would that imply current AVX/AVX2 binaries will run on KL without modification?[/QUOTE]

As far as I know nothing should prevent existing code from running on KL.

Currently AFAIK AVX512 is not supported by MASM, so doing any AVX512 work will require an assembler that supports those instructions. That should not be necessary for thread/cache tuning on the KNL architecture, however.

My understanding is that ICC is mostly beneficial for C/C++ and intrinsics, where the compiler is doing the optimizations. Given that almost all of prime95's math is in assembly, I'm not sure how much use it will be. I admit I have not dug deeply into this, and I'm not sure how powerful the profiling tools are at the assembly level to aid in tuning.

Prime95 2016-08-31 03:24

[QUOTE=airsquirrels;441119]
So far I only seem to have one big problem and that's with the AVX instructions. George has all the XMM/YMM etc values as QWORD PTR in extrn.mac, which MASM must be happy to just treat as memory pointers to the correct type[/QUOTE]

I have no problem updating the source code to be a little stricter regarding typing.

HJWASM may turn out to be a good solution assuming it really is MASM compatible.

Madpoo 2016-08-31 04:55

[QUOTE=ewmayer;441122]I presume your work is aimed at getting a version of the Prime95 source buildable-by-ICC and thus amenable to tuning for KL using Intel's toolsuite, is that right?

Because Aaron noted one user is already turning in results from Prime95 (or mprime?) running on a KL - would that imply current AVX/AVX2 binaries will run on KL without modification?[/QUOTE]

That's correct... that's due to the x86 compatibility baked into the Atom cores and they have the full complement of AVX, SSE2, etc. support.
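
A minimal Python sketch of why an unmodified binary still runs: a program that picks its code path from the advertised CPU features just takes the widest path it knows about. This is not how Prime95 does its detection; the function name, the path labels, and the hand-written flags line are assumptions for illustration. The flag spellings (`sse2`, `avx`, `avx2`, `avx512f`) are the ones Linux reports in `/proc/cpuinfo`.

```python
def pick_code_path(cpuinfo_text, known_paths):
    """Pick the first path in known_paths whose flag appears in the
    'flags' line of /proc/cpuinfo-style text. known_paths is a list of
    (flag, label) pairs ordered widest first."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
            break
    for flag, label in known_paths:
        if flag in flags:
            return label
    return "generic"


# Hand-written, abbreviated flags line for a Knights Landing style CPU
# (assumption: real output lists many more flags).
knl_like = "flags\t\t: fpu sse sse2 avx avx2 avx512f avx512cd avx512er avx512pf"

# A binary built before AVX-512 support only knows these paths.
old_binary = [("avx2", "AVX2"), ("avx", "AVX"), ("sse2", "SSE2")]
# A hypothetical future build would add an AVX-512 path in front.
new_binary = [("avx512f", "AVX-512")] + old_binary

print(pick_code_path(knl_like, old_binary))
print(pick_code_path(knl_like, new_binary))
```

The older table never asks about `avx512f`, so on this chip it settles on its AVX2 path and runs as it would on any other AVX2 machine.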

Like I mentioned though, running that way it's basically no different than a bunch of slow (1.3 GHz, in this case) cores. I believe this particular user is running 16 workers out of the 64 cores.

If I look even further back at the assignment history, there were some anonymous users earlier this year (as early as March) who started working assignments on a Knights Landing, but sadly never finished them. Must have been doing some benchmarking... in those cases they had 64 workers going using all of the cores.

I wonder how it was doing with 64 workers going... their last check-in had 'em up to only 2-3% complete, and that was a month after being assigned. For exponents in the 44M range that might be what I'd expect for a 1.3 GHz processor...

Now, if you had a single worker using 64 cores... whew... I'd like to see that. Using the 16GB of HBM available? Yeah... even without any code changes at all I'm guessing it would fly.

airsquirrels 2016-08-31 12:57

[QUOTE=Prime95;441132]I have no problem updating the source code to be a little stricter regarding typing.

HJWASM may turn out to be a good solution assuming it really is MASM compatible.[/QUOTE]

Here is my post over at masm32 regarding the incompatibility. My rudimentary grepping shows about 1000 places to update m64 references to Q_XMM or similar pointers, and about the same number to update m128 references to XMMWORDs.

[url]http://masm32.com/board/index.php?topic=5633.0[/url]

Maybe they will add compatibility.

Otherwise, if you have a preference for how you want that work done I don't mind doing it.

airsquirrels 2016-08-31 12:59

[QUOTE=Madpoo;441142]...
Now, if you had a single worker using 64 cores... whew... I'd like to see that. Using the 16GB of HBM available? Yeah... even without any code changes at all I'm guessing it would fly.[/QUOTE]

Well just as soon as the other few participants chip in we can find out!

GP2 2016-09-01 00:01

[QUOTE=airsquirrels;441119]I have started working on getting gwnum to assemble with HJWASM.[/QUOTE]

This is unrelated to the discussion here, but it occurs to me that it might be very interesting if the gwnum library was available as a C extension module for Python. In principle it shouldn't be too hard. Has this ever been attempted?
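
For a sense of what such a module would have to speed up, here is a minimal pure-Python sketch of the Lucas-Lehmer test for Mersenne numbers. It does not use gwnum and does not reflect any existing binding API; it relies only on Python's built-in big integers. The squaring modulo 2^p - 1 inside the loop is the step a gwnum-backed extension would presumably replace.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for an odd prime p."""
    m = (1 << p) - 1          # the Mersenne number being tested
    s = 4                     # standard starting value
    for _ in range(p - 2):    # p - 2 squarings in total
        s = (s * s - 2) % m   # the hot spot: a big modular squaring
    return s == 0


# Small odd prime exponents; 11 drops out because 2**11 - 1 = 23 * 89.
print([p for p in (3, 5, 7, 11, 13, 17, 19) if lucas_lehmer(p)])
# prints [3, 5, 7, 13, 17, 19]
```

Plain Python integers handle small exponents fine, but the cost of each squaring grows quickly with p, which is where a wrapper around an optimized multiplication library would pay off.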

