mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Msieve (https://www.mersenneforum.org/forumdisplay.php?f=83)
-   -   Msieve v1.49 feedback (https://www.mersenneforum.org/showthread.php?t=15678)

em99010pepe 2011-11-05 12:17

[QUOTE=Jeff Gilchrist;277240]Are you using my binary when you are seeing this? I didn't think zlib support was part of the Windows visual studio build.[/QUOTE]

Yes.

Jeff Gilchrist 2011-11-05 23:54

[QUOTE=em99010pepe;277247]Yes.[/QUOTE]

Brian can confirm this but there is no zlib support built into the VS2010 project of msieve. I have not compiled zlib and linked it to the project.

You can confirm this by compressing your relations file with gzip and seeing if it works, but I think there must be another problem, as you will likely find that the 64-bit Windows version does not have zlib support enabled in the first place.

Batalov 2011-11-06 00:35

Simply look in the source! See include/util.h and common/savefile.c

Well, ok, I'll save you the trouble: there's no gz support on Windows. It all looks like this:
[CODE]#if defined(WIN32) || defined(_WIN64)
#include <windows.h>
#include <process.h>
#else
//...
#ifdef NO_ZLIB
//...
#else
#include "zlib.h"
#endif
[/CODE]
Instead, there's a special treatment for large files.

debrouxl 2011-11-06 10:13

I use Debian Testing & Unstable, so I didn't expect the zlib packages to be outdated... but indeed, they are!
I see that Debian Experimental has had a 1.2.5 package for a year and a half (!), but it hasn't made it to Unstable yet (??)... so I guess I'll compile the source package from Experimental.

Brian Gladman 2011-11-06 10:43

The Windows build does not include ZLIB. I have a Visual Studio 2010 build for ZLIB 1.2.5 that I use elsewhere so it would be very easy to add this to msieve if Jason was happy with this and others wanted it.

jasonp 2011-11-06 13:30

We've honestly had a lot of trouble with distributions using old to ancient versions of zlib. If it isn't a lot of source then we should add it.

axn 2011-11-06 14:52

[QUOTE=jasonp;277350]We've honestly had a lot of trouble with distributions using old to ancient versions of zlib. If it isn't a lot of source then we should add it.[/QUOTE]

What are the compression ratios you're seeing from zlib? If it is something like 50%, the same could probably be obtained by just saving the relations in pure binary format. In cases like these, where we have additional knowledge of the domain, rolling your own compression is not a bad thing.
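
A minimal sketch of what "pure binary format" could mean here, assuming a relation consists of a signed a, an unsigned b, and a list of prime factors. The layout and the names `put_varint`, `zigzag`, and `pack_relation` are hypothetical and are not msieve's actual savefile code; the idea is only that small values take one or two bytes instead of several hex characters plus separators.

```c
#include <stddef.h>
#include <stdint.h>

/* Write v as a base-128 varint: 7 payload bits per byte, high bit set
   on every byte except the last. Returns the number of bytes written
   (at most 10 for a 64-bit value). */
size_t put_varint(uint8_t *out, uint64_t v)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v | 0x80);
        v >>= 7;
    }
    out[n++] = (uint8_t)v;
    return n;
}

/* Zigzag mapping so that small negative values stay small:
   0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ... */
uint64_t zigzag(int64_t v)
{
    uint64_t u = (uint64_t)v << 1;
    return (v < 0) ? ~u : u;
}

/* Hypothetical record layout (an assumption for illustration only):
   zigzag(a), b, factor count, then each factor, all as varints.
   The caller must supply a buffer of at least 10 * (3 + nfactors) bytes.
   Returns the number of bytes used. */
size_t pack_relation(uint8_t *out, int64_t a, uint64_t b,
                     const uint64_t *factors, size_t nfactors)
{
    size_t n = 0;
    n += put_varint(out + n, zigzag(a));
    n += put_varint(out + n, b);
    n += put_varint(out + n, (uint64_t)nfactors);
    for (size_t i = 0; i < nfactors; i++)
        n += put_varint(out + n, factors[i]);
    return n;
}
```

Whether this beats zlib on real relation files would have to be measured; the sketch only shows the kind of domain-specific packing being suggested.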

chris2be8 2011-11-06 15:49

Saving the relations in binary format makes it harder to fix problems with them. E.g., to get rid of duplicates:
sort -ur name.dat >name.sorted
mv name.dat name.dat.old
mv name.sorted name.dat

I'd prefer an option to do incremental duplicate removal, so that if one msieve run has too few relations, the next run doesn't repeat all the work of removing duplicate relations. E.g., write them into a B-tree file on disk and report how many were new.

Chris K
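
The incremental duplicate removal described above could be sketched roughly as follows. This is an illustration under stated assumptions, not msieve code: it swaps the on-disk B-tree for an in-memory open-addressing hash set, keys each relation by a 64-bit FNV-1a hash of its text line (so a hash collision would wrongly count a distinct relation as a duplicate), and all names (`relset`, `relset_add`, `relset_add_batch`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* 64-bit FNV-1a hash of a NUL-terminated relation line. */
uint64_t hash_line(const char *s)
{
    uint64_t h = 14695981039346656037ULL;    /* FNV offset basis */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 1099511628211ULL;                /* FNV prime */
    }
    return h;
}

typedef struct {
    uint64_t *slots;   /* stored hashes; 0 marks an empty slot */
    size_t cap;        /* number of slots, must be a power of two */
    size_t used;       /* number of occupied slots */
} relset;

/* Returns 1 on success, 0 if the allocation failed. */
int relset_init(relset *s, size_t cap_pow2)
{
    s->slots = calloc(cap_pow2, sizeof *s->slots);
    s->cap = cap_pow2;
    s->used = 0;
    return s->slots != NULL;
}

/* Returns 1 if the line was new, 0 if it was already present,
   -1 if the table is full (this sketch never grows the table). */
int relset_add(relset *s, const char *line)
{
    uint64_t h = hash_line(line);
    if (h == 0)
        h = 1;                                /* 0 is reserved for "empty" */
    size_t i = (size_t)h & (s->cap - 1);
    for (size_t probes = 0; probes < s->cap; probes++) {
        if (s->slots[i] == h)
            return 0;
        if (s->slots[i] == 0) {
            s->slots[i] = h;
            s->used++;
            return 1;
        }
        i = (i + 1) & (s->cap - 1);           /* linear probing */
    }
    return -1;
}

/* Feed one batch of relation lines; returns how many were new. */
size_t relset_add_batch(relset *s, const char *const *lines, size_t n)
{
    size_t fresh = 0;
    for (size_t i = 0; i < n; i++)
        if (relset_add(s, lines[i]) == 1)
            fresh++;
    return fresh;
}
```

Because the set outlives a single batch, a second run only pays for the relations it adds. Making that survive between separate msieve invocations would need the set to be saved to disk, which is where a B-tree or similar store would come in.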

jasonp 2011-11-06 16:59

You can get arbitrarily fancy with the way relations are stored, and what you can do with them. I've actually considered linking Berkeley DB as a data store and constructing queries to perform the filtering, but

- it's a lot of coding work, and most of what the filtering does now just involves iterating through all the relations, so a database is overkill. I doubt the core of the merge phase would be more efficient than it is now, but that's a short computation compared to everything else
- the Sleepycat license is too restrictive for software that is otherwise public domain

That being said, it would be cool to have a transaction-based system for handling things like duplicate and singleton or clique removal.

Lately I've been wondering what would have to change in Msieve to be able to handle much bigger problems than it currently can. Parallel filtering and a parallel matrix build would be a necessity, and those depend on a scalable data store or very careful coding.

Brian Gladman 2011-11-06 18:18

[QUOTE=jasonp;277350]We've honestly had a lot of trouble with distributions using old to ancient versions of zlib. If it isn't a lot of source then we should add it.[/QUOTE]

There are about 25 architecture-independent C files and 2 or 3 OS-related files that can be in C or assembler.

If we add it, I can easily do the VC++ side, but someone else would have to do the Linux/Unix side, as there is quite a bit there that I know nothing about (configure etc.).

Alternatively I can just add an option to link the Windows version to an external ZLIB library.

Batalov 2011-11-06 22:43

[QUOTE=jasonp;277350]We've honestly had a lot of trouble with distributions using old to ancient versions of zlib. If it isn't a lot of source then we should add it.[/QUOTE]
One easy thing to add is
[CODE]//...
#include "zlib.h"
/* ZLIB_VERNUM is 0x1250 for zlib 1.2.5 */
#if ZLIB_VERNUM < 0x1250
#error "Do not use this ZLIB library; update to 1.2.5 or compile with NO_ZLIB=1"
#endif
[/CODE]...or something like that. It is hard to understand why distro preparers linger on 1.2.3, but there's not much we can do.

In retrospect, the one attractive thing about adding ZLIB was that it was initially just a dozen lines of code (but then the file append bug/feature was found, and the addition grew by another dozen lines). The downside, of course, is that even though it is a tiny lib, "anything that can go wrong, will go wrong". And it did. Old versions, mismatched headers/libs: all of that can happen to zlib as well as to libgmp or libecm or anything else external that one would choose to link in.
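
A compile-time `ZLIB_VERNUM` test only sees the header, so it cannot catch the mismatched header/lib case mentioned above. A rough sketch of a complementary runtime check follows; the helper names `version_to_vernum` and `zlib_new_enough` are hypothetical, and the only zlib facts assumed are that `zlibVersion()` returns the library's version string and that `ZLIB_VERNUM` packs one hex digit per version component (1.2.5 is 0x1250).

```c
#include <stdlib.h>

/* Pack a dotted version string such as "1.2.5" into the ZLIB_VERNUM
   layout (one hex digit per component, up to four components).
   Anything after the last parsed component is ignored.
   Returns 0 if the string does not start with a digit or if a
   component is larger than 15. */
unsigned version_to_vernum(const char *s)
{
    unsigned v = 0;
    int parts = 0;
    while (parts < 4) {
        char *end;
        if (*s < '0' || *s > '9')
            return 0;
        unsigned long n = strtoul(s, &end, 10);
        if (n > 15)
            return 0;
        v = (v << 4) | (unsigned)n;
        parts++;
        s = end;
        if (*s != '.')
            break;
        s++;
    }
    for (; parts < 4; parts++)    /* pad missing components with 0 */
        v <<= 4;
    return v;
}

/* Nonzero if the given runtime version string is at least min_vernum.
   A malformed string parses to 0 and is therefore rejected. */
int zlib_new_enough(const char *runtime_version, unsigned min_vernum)
{
    return version_to_vernum(runtime_version) >= min_vernum;
}
```

With zlib linked in, the intended call would be something like `zlib_new_enough(zlibVersion(), 0x1250)` at startup, refusing to open .gz files if it returns 0.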


All times are UTC.