mersenneforum.org

jasonp 2010-09-19 00:44

Msieve v1.47 feedback
 
Just starting things up here. This version includes some tuning for Nvidia's latest graphics cards, so if you have a Fermi card and know how to run NFS polynomial selection, then please give it a try. The precompiled binary on sourceforge now has two sets of PTX files, and if you have a Fermi card and version 3.0 or later of Nvidia's CUDA toolkit, then use the ptx files in the 'fermi' subdirectory.

em99010pepe 2010-09-19 16:51

Thank you for the new version.
About MPI code, can it be used in a local area network? I want to take advantage of all computers I have in my LAN. If it is a stupid question please say so.

Jeff Gilchrist 2010-09-19 17:53

[QUOTE=em99010pepe;230463]Thank you for the new version.
About MPI code, can it be used in a local area network? I want to take advantage of all computers I have in my LAN. If it is a stupid question please say so.[/QUOTE]

Yes, you can install MPI on all your LAN machines and spread out the work via your LAN.

Jeff Gilchrist 2010-09-19 18:01

When I compile msieve with VS 2010 for 32-bit Release mode using MPIR 2.1.2, it crashes when doing QS. The 64-bit version works fine, as does the precompiled 32-bit binary from Jason. Can anyone else reproduce this with a VS 32-bit binary?

It crashes for me with this number:
432717521981854371284364875772580616114001895598914408712561227

[CODE]Sun Sep 19 13:56:15 2010
Sun Sep 19 13:56:15 2010
Sun Sep 19 13:56:15 2010 Msieve v. 1.47
Sun Sep 19 13:56:15 2010 random seeds: 2d89fa40 345674ae
Sun Sep 19 13:56:15 2010 factoring 432717521981854371284364875772580616114001895598914408712561227 (63 digits)
Sun Sep 19 13:56:16 2010 searching for 15-digit factors
Sun Sep 19 13:56:16 2010 commencing quadratic sieve (63-digit input)
Sun Sep 19 13:56:16 2010 using multiplier of 43
Sun Sep 19 13:56:16 2010 using VC8 32kb sieve core
Sun Sep 19 13:56:16 2010 sieve interval: 10 blocks of size 32768
Sun Sep 19 13:56:16 2010 processing polynomials in batches of 21
Sun Sep 19 13:56:16 2010 using a sieve bound of 101281 (4788 primes)
Sun Sep 19 13:56:16 2010 using large prime bound of 5064050 (22 bits)
Sun Sep 19 13:56:16 2010 using trial factoring cutoff of 22 bits
Sun Sep 19 13:56:16 2010 polynomial 'A' values have 8 factors
Sun Sep 19 13:56:28 2010 5073 relations (2328 full + 2745 combined from 23261 partial), need 4884
Sun Sep 19 13:56:28 2010 begin with 25589 relations
Sun Sep 19 13:56:28 2010 reduce to 7450 relations in 2 passes
Sun Sep 19 13:56:28 2010 attempting to read 7450 relations
Sun Sep 19 13:56:28 2010 recovered 7450 relations
Sun Sep 19 13:56:28 2010 recovered 5951 polynomials
Sun Sep 19 13:56:28 2010 attempting to build 5073 cycles
Sun Sep 19 13:56:28 2010 found 5073 cycles in 1 passes
Sun Sep 19 13:56:28 2010 distribution of cycle lengths:
Sun Sep 19 13:56:28 2010 length 1 : 2328
Sun Sep 19 13:56:28 2010 length 2 : 2745
Sun Sep 19 13:56:28 2010 largest cycle: 2 relations
Sun Sep 19 13:56:28 2010 matrix is 4788 x 5073 (0.6 MB) with weight 139188 (27.44/col)
Sun Sep 19 13:56:28 2010 sparse part has weight 139188 (27.44/col)
Sun Sep 19 13:56:28 2010 filtering completed in 4 passes
Sun Sep 19 13:56:28 2010 matrix is 4427 x 4491 (0.5 MB) with weight 119972 (26.71/col)
Sun Sep 19 13:56:28 2010 sparse part has weight 119972 (26.71/col)
Sun Sep 19 13:56:28 2010 commencing Lanczos iteration
Sun Sep 19 13:56:28 2010 memory use: 0.6 MB[/CODE]

At the Lanczos stage it crashes with this error:
[B]Unhandled exception at 0x0046b060 in msieve32.exe: 0xC0000005: Access violation reading location 0x00b4e000.[/B]

The actual violation happens in the last line of this code snippet:
[CODE]#elif defined(MSC_ASM32A)
i = 0;
ASM_M
{
push ebx
mov edi,y
lea ebx,c
mov esi,x
mov ecx,i
align 16
L0: movq mm0,[edi+ecx*8]
[/CODE]

So the fault is on the instruction at the L0: label. Here is what the memory browser shows for the values involved:
L0 0x0046b060 L0 void *
ecx 15355 unsigned long
edi 11730984 unsigned long
i 0 unsigned int
mm0 10956266787831142436 unsigned __int64

Any ideas?
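
For what it's worth, the register values above do account for the faulting address. Here is a minimal sketch in C, assuming the 32-bit `[edi+ecx*8]` addressing from the snippet. The helper names `effective_address` and `is_page_aligned` are made up for illustration and are not part of msieve.

```c
#include <stdint.h>

/* Effective address of an x86 operand of the form [base + index*scale],
   computed with 32-bit wraparound as on a 32-bit build.
   Hypothetical helper for illustration, not part of msieve. */
uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale)
{
    return base + index * scale;
}

/* Nonzero if addr sits exactly on a 4 KB page boundary. */
int is_page_aligned(uint32_t addr)
{
    return (addr & 0xFFFu) == 0;
}
```

With edi = 11730984 and ecx = 15355, `effective_address(11730984u, 15355u, 8u)` gives 0x00b4e000, which is the address in the access violation, and that address is 4 KB aligned. That would be consistent with the loop reading past the end of the buffer until it reached an unmapped page, though the dump alone doesn't prove it.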

jasonp 2010-09-19 18:09

[QUOTE=Jeff Gilchrist;230468]Yes, you can install MPI on all your LAN machines and spread out the work via your LAN.[/QUOTE]
Note that while it's possible to use a collection of machines on a LAN to run an arbitrary MPI program, in practice it will help a great deal if all the machines are of similar speed and the switch you use is as fast as possible. Don't even bother with MPI on a LAN for the LA (linear algebra) if you don't have a gigabit switch.

Jeff, does the crash happen with v1.46 too? (I would predict yes). It's possible something in the LA is not initialized properly when compiling with MSVC. Actually, a small matrix like this would use the unpacked solver code, and the crash happens in the vector-vector multiply. An array index of 15355 is a lot bigger than the matrix dimension, so is something wrapping around?
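
To make the index concern concrete, here is a rough plain-C sketch of a GF(2) vector-vector product of the general kind block Lanczos uses. This is not msieve's code, which uses inline assembly on this path. The name `gf2_vv_product` and its signature are assumptions for illustration. The only point is that the loop index should stay below the vector length n, which would be on the order of the matrix dimension (about 4.5k here), so an index of 15355 means the loop ran well past the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Computes xy = x^T * y over GF(2), where x and y are n x 64 bit matrices
   stored as n 64-bit words (one row per word) and xy is 64 x 64
   (one row per word). Hypothetical reference version, not msieve's. */
void gf2_vv_product(const uint64_t *x, const uint64_t *y,
                    uint64_t xy[64], size_t n)
{
    size_t i;
    int j;

    for (j = 0; j < 64; j++)
        xy[j] = 0;

    /* i is bounded by n, so x[i] and y[i] never read past the vectors */
    for (i = 0; i < n; i++) {
        uint64_t xi = x[i];
        for (j = 0; j < 64; j++) {
            if ((xi >> j) & 1)
                xy[j] ^= y[i];   /* addition in GF(2) is XOR */
        }
    }
}
```

A slow reference loop like this can be handy for cross-checking an assembly path on a small input: if the two disagree, or only the assembly version faults, the loop counter or register setup in the assembly is the first thing to look at.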

Jeff Gilchrist 2010-09-19 18:28

[QUOTE=jasonp;230472]Jeff, does the crash happen with v1.46 too? (I would predict yes). It's possible something in the LA is not initialized properly when compiling with MSVC. Actually, a small matrix like this would use the unpacked solver code, and the crash happens in the vector-vector multiply. An array index of 15355 is a lot bigger than the matrix dimension, so is something wrapping around?[/QUOTE]

Yes it does crash in v1.46 as well when compiled with MSVC. Not sure about the wrapping stuff, is there something in particular I can check for you? Brian might be able to debug this more easily.

Jeff Gilchrist 2010-09-19 18:30

New 64-bit Windows binaries are now available for 1.47: one built with default settings and a second compiled with LARGEBLOCKS and TD=80.

[url]http://gilchrist.ca/jeff/factoring/[/url]

Jeff.

Phil MjX 2010-09-19 22:56

Thanks Jeff,

Your binaries are always helpful to me, and I'll be happy to test the LARGEBLOCKS version with the c154 I am dealing with.

A question: I am very interested in the latest changes to the GPU-enabled version of the polynomial selection, because of the introduction of a multiplier of 12 instead of the "classical" 60 for the smallest coefficient.

This is the multiplier I used with ggnfs (amongst 144 and 720 to cover different search intervals).

My problem is that I am totally unable to compile the GPU version of msieve with cygwin (I fear this isn't possible), so I cannot test source changes myself...

I have a request:
Do you have the tools, and the time, to compile a 64-bit version of the GPU-enabled source?

I currently run np1 with the 32-bit GPU-enabled binaries from sourceforge and plan to switch to np2 with yours. Should that work, and would step 2 be faster this way?

Thanks again for your work, and many thanks to Jason for improving msieve!

Regards,
Philippe

Brian Gladman 2010-09-19 23:06

[QUOTE=Jeff Gilchrist;230475]Yes it does crash in v1.46 as well when compiled with MSVC. Not sure about the wrapping stuff, is there something in particular I can check for you? Brian might be able to debug this more easily.[/QUOTE]

I can see this bug in win32, which I suspect is a compiler bug because the registers are not being loaded with the correct values.

Brian

jasonp 2010-09-20 00:34

Philippe, I thought of you when jrk committed this patch :) It should be possible to compile the GPU version of msieve in cygwin, but you'd need to find a way to link to Nvidia's precompiled DLLs, and must have Microsoft's compiler to run Nvidia's compiler (the free MSVC Express is what I use). Note that poly selection really does not benefit from using a 64-bit binary, though optimizations are possible in the CPU branch that can significantly improve performance.

Oh, and you're welcome :)

Brian Gladman 2010-09-20 07:08

[QUOTE=Brian Gladman;230511]I can see this bug in win32, which I suspect is a compiler bug because the registers are not being loaded with the correct values.

Brian[/QUOTE]

Sadly, this was a bug in my inline assembler code, which I have now corrected in the msieve SVN repository. I have also corrected a minor win32 build configuration issue.

Brian

