mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   CADO-NFS (https://www.mersenneforum.org/forumdisplay.php?f=170)
-   -   CADO NFS (https://www.mersenneforum.org/showthread.php?t=11948)

Shaopu Lin 2009-05-26 01:09

CADO NFS
 
The cado nfs suite is now available from [url]http://cado.gforge.inria.fr[/url].

Jeff Gilchrist 2009-05-26 17:56

[QUOTE=Shaopu Lin;174811]The cado nfs suite is now available from [url]http://cado.gforge.inria.fr[/url].[/QUOTE]

Anyone have success building this? I keep getting errors during the build process. Seems quite complex with pthreads and MPI versions available.

CRGreathouse 2009-05-26 18:52

Heh, I can't even set up my networking for it properly, let alone build it.

10metreh 2009-05-26 19:13

How fast is it actually meant to be?

KriZp 2009-05-26 20:24

It built fine for me by just typing "make" (it downloaded CMAKE and built it first), and after I got the ssh-agent working it ran fine on localhost, factoring the c59 example provided. I have been unable to figure out how to make use of remote hosts, the syntax of the mach_desc file is not explained anywhere.

KriZp 2009-05-26 23:39

It was simply a matter of putting the executables on the remote host and editing the machine description part of the run_example.sh script to include the lines
[code][remote]
tmpdir=$t/tmp
cadodir=/path/to/build/directory/
remote_host_name cores=1[/code]

It then used 1 remote and 1 local core for polyselect, 2 local and 1 remote core for the sieving, and 1 local for the rest.

frmky 2009-05-27 00:04

CADO NFS
 
Moving the discussion of CADO NFS out of the Links thread...

I've downloaded the source, compiled it using pthreads, and have successfully run a GNFS factorization using the included perl script. I have also noticed that the poly file format and relation format match those of GGNFS. (Thanks for that!) I have not yet figured out how to (1) do a complete SNFS run given a polynomial file, and (2) do all post-processing steps given a polynomial file and a set of relations that possibly includes duplicates and bad relations. Any guidance?

Once I know how to do (2), I will determine how well bwc runs on our workstation with up to 32 threads, and on our beowulf cluster of 10x4 cores.

akruppa 2009-05-27 00:26

[QUOTE=10metreh;174902]How fast is it actually meant to be?[/QUOTE]

The siever can't compete with Franke/Kleinjung's siever yet. It's slower and uses much more memory. The core sieving routines need a complete overhaul. Embarrassingly, it sieves special-q only on the algebraic side so far.

[QUOTE=frmky;174928]I have not yet figured out how (1) given a polynomial file, do a complete SNFS run, and (2) given a polynomial file and set of relations that possibly includes duplicates and bad relations, do all post-processing steps. Any guidance?
[/QUOTE]

The perl script keeps track of which tasks are already done via <prefix>.<task>_done files, so you can write your own poly file and "touch <prefix>.polysel_done" (e.g., "touch 797161_29.polysel_done"). The perl script should then generate the factor base and start sieving. If you already have relations, you should be able to copy your own files (matching the naming scheme of the perl script, e.g., "797161_29.rels.9000000-9100000") and simply run the perl script again. It should check the relation files, count how many relations there are, start sievers, and if there are enough relations, try a filtering run. Warning: a file that contains bad relations is deleted. In fact, the script is sometimes a bit over-eager about "cleaning up", so [B]keep backups[/B]!
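A minimal sketch of that resume workflow (the prefix and the relation-file range are just the examples above; the actual perl driver invocation is omitted, and the poly file contents here are a placeholder):

```shell
# Work in a scratch directory, using the example prefix from the post.
cd "$(mktemp -d)"
prefix=797161_29
printf 'n: ...\n' > "$prefix.poly"      # your hand-written polynomial file
touch "$prefix.polysel_done"            # marks polynomial selection as done
# Pre-existing relations, renamed to the script's naming scheme:
touch "$prefix.rels.9000000-9100000"
ls "$prefix".*
```

On the next run the perl script sees the `_done` marker, skips polynomial selection, and picks up the relation files by name.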

More tomorrow,

Alex

R.D. Silverman 2009-05-27 11:04

[QUOTE=akruppa;174930]The siever can't compete with Franke/Kleinjung's siever yet. It's slower and uses much more memory. [...][/QUOTE]

Is this a POSIX archive? The version of tar that I have will not read --posix archives.

Indeed, after I gunzipped the file, neither tar -x nor tar -t works on the file; tar just sits there and does nothing.

akruppa 2009-05-27 11:29

I think it's a GNU tar archive... what version of tar are you using? Is a GNU version of tar installed somewhere, maybe named "gtar" ?

Alex

R.D. Silverman 2009-05-27 11:42

[QUOTE=akruppa;174970]I think it's a GNU tar archive... what version of tar are you using? Is a GNU version of tar installed somewhere, maybe named "gtar" ?

Alex[/QUOTE]


It is GNU tar 1.12

akruppa 2009-05-27 11:55

The GNU tar format changed in version 1.13, so I put an archive in the old format at [url]http://www.loria.fr/~kruppaal/cado-nfs-r2150.oldtar.gz[/url]

Alex

frmky 2009-05-27 22:43

1 Attachment(s)
I put together this sequence of operations to use the CADO binaries. The parameters below came from the c59 example, and therefore are probably not appropriate for larger numbers. Starting with a polynomial in snfs.poly and a pile of relations in snfs.dat, do the following:

[CODE]../bin/sieve/makefb -poly snfs.poly > snfs.roots
../bin/sieve/freerel -poly snfs.poly -fb snfs.roots > snfs.freerels
../verify snfs.dat snfs.rels
../bin/merge/duplicates -nrels "$( wc -l < snfs.dat)" -out ./snfs.nodup.gz snfs.freerels snfs.rels 1> /dev/null
../bin/merge/purge -poly snfs.poly -nrels "$( zcat snfs.nodup.gz | wc -l)" -out snfs.purged snfs.nodup.gz
../bin/merge/merge -out snfs.merge.his -mat snfs.purged -forbw 1 -keep 160 -maxlevel 15 -cwmax 200 -rwmax 200 -ratio 1.5
../bin/merge/replay -his snfs.merge.his -index snfs.index -purged snfs.purged -out snfs.small -costmin "$( tail -n 1 snfs.merge.his | sed 's/BWCOSTMIN: //')"
../bin/linalg/bwc/bwc.pl :complete seed=1 thr=2x2 mpi=1x1 matrix=snfs.small nullspace=left mm_impl=sliced interleaving=0 interval=100 mode=u64 mn=64 splits=0,64 ys=0..64 wdir=bwc
../bin/linalg/apply_perm --perm bwc/mat.row_perm --in bwc/W.twisted --out W.bin
xxd -c 8 -ps W.bin > snfs.W
../bin/linalg/mkbitstrings snfs.W > snfs.ker_raw
../bin/linalg/characters -poly snfs.poly -purged snfs.purged -ker snfs.ker_raw -index snfs.index -rel snfs.nodup.gz -small snfs.small -nker 64 -skip 32 -nchar 50 -out snfs.ker
../bin/sqrt/allsqrt snfs.nodup.gz snfs.purged snfs.index snfs.ker snfs.poly 0 10 ar snfs.dep
../bin/sqrt/algsqrt snfs.dep.alg.000 snfs.dep.rat.000 snfs.poly 1>> snfs.fact
../bin/sqrt/algsqrt snfs.dep.alg.001 snfs.dep.rat.001 snfs.poly 1>> snfs.fact [/CODE]

verify is a program that eliminates free relations added by msieve and removes obviously bad relations. I'm surprised that duplicates doesn't verify the relations while it looks for duplicates, removing bad ones as it goes; that seems like a useful thing to add. The source for verify is attached.
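As a rough illustration only (this is not frmky's attached program, and the "a,b:rat_primes:alg_primes" line format is a guessed stand-in), a verify-like filter could be sketched with awk:

```shell
cd "$(mktemp -d)"
# Demo input: one good relation, one free relation (b == 0), one junk line.
printf '3,5:2,3:7\n9,0:2:3\nnot a relation\n' > snfs.dat
# Keep lines matching the assumed "a,b:primes:primes" shape with b != 0;
# free relations and malformed lines are dropped.
awk -F'[,:]' '
    /^-?[0-9]+,[0-9]+:[0-9a-fA-F,]*:[0-9a-fA-F,]*$/ && $2 != 0
' snfs.dat > snfs.rels
cat snfs.rels
```

Only the first demo line survives the filter; the real verify additionally checks that the listed primes actually divide the norms.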

The 10 in the arguments of allsqrt can be increased to prep more dependencies for algsqrt. Likewise, run as many algsqrt's as necessary to get the factors.
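That "run as many algsqrt's as necessary" step can be written as a loop (shown here as a dry run printing the commands; the dependency filenames mirror frmky's example and are created as empty stand-ins):

```shell
cd "$(mktemp -d)"
# Empty stand-ins for allsqrt output; with "0 10" it can prepare up to
# ten dependency pairs, snfs.dep.{alg,rat}.000 .. .009.
touch snfs.dep.alg.000 snfs.dep.rat.000 snfs.dep.alg.001 snfs.dep.rat.001
# Print the algsqrt command for each pair; drop the echo to really run
# them, and stop once snfs.fact contains the factors.
for alg in snfs.dep.alg.*; do
    rat="snfs.dep.rat.${alg##*.}"     # matching rational dependency
    echo ../bin/sqrt/algsqrt "$alg" "$rat" snfs.poly
done
```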

Having said all of this, it worked for a small example, but when I tried a larger example all of the algsqrt's failed with "condition (nab & 1) == 0 failed" or "the squares do not agree modulo n!", so one or more of the parameters are probably completely inappropriate for larger numbers.

jasonp 2009-05-28 01:57

Alex, if lots of people here start using the CADO tools, perhaps you shouldn't be the only one from LORIA fielding help requests :)

thome 2009-05-28 13:31

Admittedly the documentation is too scarce. So I might occasionally help with the difficulties people encounter.

By the way, I've posted an updated tarball which should now be compatible with tar-1.12 as well.

E.

R.D. Silverman 2009-05-28 13:35

[QUOTE=akruppa;174972]The GNU tar format changed in version 1.13, I put an archive in the old format at [url]http://www.loria.fr/~kruppaal/cado-nfs-r2150.oldtar.gz[/url]

Alex[/QUOTE]

When I extracted everything from the tar file, it reported that it could not find

polyselect/aux.c
polyselect/aux.h

xilman 2009-05-28 13:41

[QUOTE=R.D. Silverman;175074]When I extracted everything from the tar file, It reported that it
could not find

polyselect/aux.c
polyselect/aux.h[/QUOTE]A well-known problem in multi-architecture installations.

Under MS-DOG and its successors, at least as far as Vista, the files "aux" and "aux.*" are special --- they are actually devices. Various other names, such as con and lpt, are also special. Actually, you were slightly fortunate --- I've seen applications lock solid when they attempt to access aux.*.

There are two solutions.

1) Extract the archive on a non-MS operating system, rename the files and then transfer everything to your Windoze machine.

2) Ask the CADO people very nicely whether they will consider renaming the files in subsequent releases, and then wait until they do so.


Paul

thome 2009-05-28 14:10

[quote=xilman;175075]
2) Ask the CADO people very nicely whether they will consider renaming the files in subsequent releases, and then waiting until they do so.
[/quote]

Done, that was easy.

Although my very clear bet is that there is no chance the thing works in non-unix environments.

It's been extensively tested on linux/x86_64 (primary platform), with various compiler/linker combinations. It's regularly tested on linux/x86_32, macos/x86_64. We've also had successes on freebsd and openbsd, but that was a while ago, and not re-checked on a regular basis.

E.

R.D. Silverman 2009-05-28 15:38

[QUOTE=xilman;175075]A well-known problem in multi-architecture installations.

Under MS-DOG and its successors, at least as far as Vista, the files "aux" and "aux.*" are special --- they are actually devices. [...][/QUOTE]

Or... go into the actual tar file [b]before[/b] extraction, and manually extract the files with an editor.

R.D. Silverman 2009-05-28 15:40

[QUOTE=thome;175083]Done, that was easy.

Although my very clear bet is that there is no chance the thing works in non-unix environments. [...][/QUOTE]

With your permission, I will *try* to get it to work under WINDOZE.

pthreads will be a problem........

xilman 2009-05-28 15:59

[QUOTE=R.D. Silverman;175093]Or... go into the actual tar file [b]before[/b] extraction, and manually
extract the files with an editor......[/QUOTE]Ooh! That's much deeper magic than almost all Windoze people are prepared to attempt. :geek:

You could also edit the filenames in the tarball. It's probably best to keep the lengths the same, so "sux.{h,c}" is the obvious candidate.

You'll need to fix up all the other files (Makefiles especially) accordingly. If you don't, they'll fail to find the evil twins. If you rename the aux files you're back into the original situation and, if you are lucky, the app won't lock solid when you try to access them.


Paul

R.D. Silverman 2009-05-28 16:08

[QUOTE=xilman;175098]Ooh! That's much deeper magic than almost all Windoze people are prepared to attempt. :geek:

You could also edit the filenames in the tarball. [...][/QUOTE]

I will not be using the make files. I will do this under VC++ and use
windoze .dsw files..........

This will take some time.. I have little enough of it.

thome 2009-05-28 16:23

[quote=R.D. Silverman;175100]I will not be using the make files. I will do this under VC++ and use
windoze .dsw files..........

This will take some time.. I have little enough of it.[/quote]

I have zero experience with Windows, but I suspect it's a lot of work (not to discourage you -- quite the contrary). Note that cmake claims to be able to generate VC++ project files, so maybe you could try that.

E.

thome 2009-05-28 19:08

[quote=frmky;175023]
Having said all of this, it worked for a small example but when I tried a larger example, all of the algsqrt's failed with "condition (nab & 1) == 0 failed" or "the squares do not agree modulo n!" so there are probably one or more of the parameters that are completely inappropriate for larger numbers.[/quote]

There are some known issues with SNFS factorizations; the sqrt code is probably over-zealous in some of its assumptions.

If you still have an example polynomial which fails, we can try to address the problem you encounter.

E.

frmky 2009-05-30 07:20

[QUOTE=thome;175123]There are some known issues with snfs factorizations. sqrt code is probably over-zealous on some assumptions.

If you still have an example polynomial which fails, we can try to address the problem you encounter.

E.[/QUOTE]

I must be doing something wrong, then, as a GNFS case just failed the same way. I can post the relations for you to download if you've got a fast enough connection to download a total of 1GB.

fivemack 2009-05-31 15:03

I'm running through a medium SNFS example I have lying around (fibonacci(1039)); the matrix comes out as

[code]
Renumbering columns (including sorting w.r.t. weight)
Sorting columns by decreasing weight
small_nrows=5582216 small_ncols=5582056
Writing sparse representation to file
#T# writing sparse: 80.00
# Weight(M_small) = 259518490
Writing index file
#T# writing index file: 6.20
[/code]

versus

[code]
Mon May 18 09:14:02 2009 matrix is 4854784 x 4855032 (1383.4 MB) with weight 336557789 (69.32/col)
Mon May 18 09:14:02 2009 sparse part has weight 314102671 (64.70/col)
Mon May 18 09:14:02 2009 matrix includes 64 packed rows
[/code]

However, I wanted to see which of the bw routines was the sensible one to use on i7 for a matrix of this size, so did

[code]
nfsslave2@cow:/scratch/fib1039/with-cado$ /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench snfs.small
[/code]

and got
[code]
Using implementation "sliced"
no cache file snfs.small-sliced.bin
T0 Building cache file for snfs.small
code BUG() : condition x >> 16 == 0 failed in push at /home/nfsslave2/cado/cado-nfs-20090528-r2167/linalg/bwc/matmul-sliced.cpp:108 -- Abort
Aborted
[/code]

What am I doing wrong?

Chris Card 2009-05-31 18:06

[QUOTE=Jeff Gilchrist;174895]Anyone have success building this? I keep getting errors during the build process. Seems quite complex with pthreads and MPI versions available.[/QUOTE]

I tried to build it on Linux with gcc 4.4.0, and I got a couple of errors:

1. /home/chris/cado-nfs-20090528-r2167/utils/ularith.h: In function ‘ularith_sqr_ul_2ul’:
/home/chris/cado-nfs-20090528-r2167/utils/ularith.h:386: error: ‘%’ constraint used with last operand
which was easily fixed by changing "%0" to "0", and

2. /home/chris/cado-nfs-20090528-r2167/utils/modredc_15ul.c: In function ‘modredc15ul_isprime’:
/home/chris/cado-nfs-20090528-r2167/utils/modredc_15ul.c:274: error: expected declaration or statement at end of input
which has got me stumped, because I can't even see an implementation of modredc15ul_isprime anywhere, let alone in that file.

Chris

akruppa 2009-05-31 18:54

The _isprime() function is defined in modredc_2ul_common.c and gets renamed according to the modulus width that is used for the arithmetic. From the first error I take it that you're doing a 32-bit build (and with gcc 4.4.0, no less - good luck!). It's probably just a missing bracket in the asm macros; I'll look.

EDIT: modredc_2ul_common.c needs a } in line 915, after the line "mod_intcmp (n, c5783688565841) != 0;"

Edit2: Tom, your question is strictly Manu's territory.

Alex

fivemack 2009-05-31 21:32

I'm not very good on French nicknames; is Manu Emmanuel Thomé?

Is there actually any point in people outside the development team attempting to run this code on reasonably-sized (days rather than minutes or weeks of linalg) examples yet?

thome 2009-06-02 11:09

[quote=fivemack;175421]I'm not very good on French nicknames; is Manu Emmanuel Thomé?[/quote]

yes.

[quote=fivemack;175421]
Is there actually any point in people outside the development team attempting to run this code on reasonably-sized (days rather than minutes or weeks of linalg) examples yet?[/quote]

absolutely.

For the problem you encounter, the fix is quite trivial. The ``good'' low-level code for matrix multiplication is not (yet) the default one. The default, the ``sliced'' variant, chokes on matrices having zero blobs larger than 4096*65536. To use the better, faster variant called ``bucket'', add bwc_mm_impl=bucket to the params file, or change the default in linalg/bwc/matmul.c.

E.

thome 2009-06-02 11:15

[quote=fivemack;175400]
However, I wanted to see which of the bw routines was the sensible one to use on i7 for a matrix of this size, so did

[code]
nfsslave2@cow:/scratch/fib1039/with-cado$ /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench snfs.small
[/code]and got
[code]
Using implementation "sliced"
no cache file snfs.small-sliced.bin
T0 Building cache file for snfs.small
code BUG() : condition x >> 16 == 0 failed in push at /home/nfsslave2/cado/cado-nfs-20090528-r2167/linalg/bwc/matmul-sliced.cpp:108 -- Abort
Aborted
[/code]What am I doing wrong?[/quote]

In more detail: you're doing it right. It's just that this particular variant can't cope with this matrix. It's a good idea to try out the different options with the *_bench programs (in that particular case, u128_bench uses SSE2).

Some defaults are set in matmul-bucket.cpp; you might also try to change them a bit -- notably the CUTOFF2 value, which is perhaps still a bit large. Such changes require rebuilding the cached matrix file called mat.h#.v#-something.bin, so before re-running uXXX_bench, rm that file first.

E.

joral 2009-06-02 15:26

I'm running into a different issue, attempting to run the process manually.

When I try to run the linear algebra, I get an error about

fopen(mat.info)

saying there is no such file. This is from memory, but I can get the exact error when I get home.

Is there some other process that I missed?

thome 2009-06-02 19:18

[quote=joral;175637]When I try to run the linear algebra, I get an error about

fopen(mat.info)

saying there is not such file. [...][/quote]

This is a shortcoming in the perl driver script, which I'm going to fix. A program called ``balance'' must have been run prior to reaching the point which reads mat.info (presumably ``u64n_prep''). ``balance'' populates the working directory with the matrix files. Unfortunately, the driver script is very crude and is content with an existing, yet empty, subdir as a proof of completion for ``balance''. So pending less crude behaviour from the bwc.pl script (which should come RSN), you may either:

- rmdir the empty directory (something like /path/to/matrix-<number>x<number>) and rerun bwc.pl with the same command line.

- run bwc.pl with ``:balance'' instead of ``:complete'' first. This will add the missing files.
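The two workarounds, sketched as commands (the matrix directory name is hypothetical, and the bwc.pl argument lists are elided; a temp dir stands in for the real working directory here):

```shell
# Option 1: remove the empty working directory so bwc.pl reruns "balance".
wdir="$(mktemp -d)/matrix-5582216x5582056"
mkdir -p "$wdir"     # the empty dir that fools the driver script
rmdir "$wdir"        # remove it, then rerun: bwc.pl :complete ...
# Option 2: run the balancing step explicitly first:
#   bwc.pl :balance ...     # then: bwc.pl :complete ...
test ! -d "$wdir" && echo removed
```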

This being said, unless you've really tried to run the stuff manually and got it wrong, there's a possibility that the ``balance'' program wasn't found because of a nasty shell variable expansion problem which was fixed last week. So I'll refresh the snapshot anyway; there's a good chance your problem will be fixed.

E.

joral 2009-06-02 19:49

No, I thought the working directory needed to previously exist, so it was there, but empty. This then showed up when running bwc.pl with :complete.

Thanks for the information. As I get further along I may have more errors to ask about.

fivemack 2009-06-02 22:11

I don't quite understand the benchmark output
 
OK: I'm using the command line

[code]
% /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench snfs.small -impl bucket
[/code]

and getting plausible diagnostics rather than error messages. Trying all the *_bench tools, deleting snfs.small-bucket.bin between runs:

u128
T0 snfs.small: 5582216 rows 5582056 cols 259518490 coeffs
22 iterations in 101s, 4.61/1, 17.77 ns/coeff

u64
T0 snfs.small: 5582216 rows 5582056 cols 259518490 coeffs
38 iterations in 102s, 2.67/1, 10.29 ns/coeff

u64k says 'T0 : Check failed Aborted'

u64n also says this.

I assume u128 would want to do exactly half as many iterations as u64, so would be quicker in total; should I be getting a 'k' or 'n' parameter to u64k or u64n in some way?

If u128 does 5582216/128 iterations, the total runtime would be ~200k seconds, which seems pretty good since msieve's Lanczos took 108242 wall-clock seconds with four threads - but I'm not sure whether there's another factor of two hiding somewhere in the block Wiedemann algorithm.
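The ~200k-second figure checks out arithmetically from the bench output above (note it counts only the single pass of N/128 iterations, not the full block Wiedemann cost):

```shell
# 22 iterations in 101 s, N = 5582216 rows, 128 vectors per u128 pass.
awk 'BEGIN {
    N = 5582216
    secs_per_iter = 101 / 22
    iters = N / 128
    printf "%.0f seconds\n", iters * secs_per_iter    # near 2e5 seconds
}'
```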

So, time to try threading.

[code]
/home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/balance --in snfs.small --out cabbage --nslices 2x2 --ramlimit 8G
[/code]

gives me a message 'Matrix has more rows than columns \n Perhaps the matrix should have been transposed first', and produces cabbage.row_perm, cabbage.col_perm and cabbage.h[01].v[01]. Then

[code]
taskset 0f /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench -impl bucket -nthreads 4 -- cabbage.h0.v0 cabbage.h1.v0 cabbage.h0.v1 cabbage.h1.v1
[/code]

runs with occasionally 400% CPU and says

19 iterations in 102s, 5.35/1, 20.62 ns/coeff

Does this mean that threads are treading on one another's toes and four threads are slower than one, or that each thread has done 19 iterations in 102 seconds for a total speed of effectively 5.16 ns/coeff ?

joral 2009-06-02 23:46

Ok. A little farther.

Now I have had the following:

Computing trsp(x)*M^100
..........Warning: Doing many iterations with bad code
Warning: Doing many iterations with bad code
Warning: Doing many iterations with bad code
Warning: Doing many iterations with bad code

Then a little later...

Failed check at iteration 100
/cado-nfs/linalg/bwc/u64_krylov: exited with status 1

Tried with a different seed, and it failed at iteration 1900.

I know I had trouble with the msieve version of block Lanczos when the matrix was too sparse, I believe. Is there a similar condition here which could cause it to fail?

thome 2009-06-03 08:45

[quote=fivemack;175686]OK: I'm using the command line

[code]
% /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench snfs.small -impl bucket
[/code]and getting plausible diagnostics rather than error messages. Trying all the *_bench tools, deleting snfs.small-bucket.bin between runs:

u128
T0 snfs.small: 5582216 rows 5582056 cols 259518490 coeffs
22 iterations in 101s, 4.61/1, 17.77 ns/coeff

u64
T0 snfs.small: 5582216 rows 5582056 cols 259518490 coeffs
38 iterations in 102s, 2.67/1, 10.29 ns/coeff
[/quote]

OK -- which kind of CPU is this? These figures seem a bit large.

[quote]
u64k says 'T0 : Check failed Aborted'

u64n also says this.
[/quote]A bug. u64k and u64n make little sense for benches, but that's definitely a bug; I'll try to reproduce it.

[quote]
I assume u128 would want to do exactly half as many iterations as u64, so would be quicker in total; should I be getting a 'k' or 'n' parameter to u64k or u64n in some way?
[/quote]For information, the k in u64k is hard-coded (anyway this code is never used). Setting n for u64n_bench is done with --nbys=128 (for n=2).

[quote]
If u128 does 5582216/128 iterations, the total runtime would be ~200k seconds, which seems pretty good since msieve lanczos took 108242 wall-time seconds with four threads - but I'm not sure whether there's not another factor two hiding somewhere in the block Wiedemann algorithm.
[/quote]N/m + N/n + N/n -- so three times as much. But I wonder: your timings exceed what I get normally, so perhaps there's something wrong somewhere. Was your matrix transposed? If not, i.e. if relation-sets are rows and ideals are columns, then you should use the -t option to the bench program; otherwise the matrix gets organized the wrong way around.

[quote]
So, time to try threading.

[code]
/home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/balance --in snfs.small --out cabbage --nslices 2x2 --ramlimit 8G
[/code]gives me a message 'Matrix has more rows than columns \n Perhaps the matrix should have been transposed first',
[/quote]This warning is innocuous cruft, since bwc tools now properly handle matrices in both directions -- although this hints at the fact the arguments you've tried don't direct them to do so.

[quote]
and produces cabbage.row_perm, cabbage.col_perm and cabbage.h[01].v[01]. Then

[code]
taskset 0f /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench -impl bucket -nthreads 4 -- cabbage.h0.v0 cabbage.h1.v0 cabbage.h0.v1 cabbage.h1.v1
[/code]runs with occasionally 400% CPU and says

19 iterations in 102s, 5.35/1, 20.62 ns/coeff

Does this mean that threads are treading on one another's toes and four threads are slower than one, or that each thread has done 19 iterations in 102 seconds for a total speed of effectively 5.16 ns/coeff ?[/quote]The number of seconds here (102, 5.35) is CPU time, not wall-clock. So four threads effectively do one iteration every 1.34 s of wall-clock time, which isn't exactly 4 times better than 1 thread, but relatively acceptable. Threads do tread on one another's toes indeed, because of the memory access penalties. Since the penalty is not large here, I suppose you have Opterons, maybe.
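The CPU-time vs wall-clock distinction, as arithmetic (figures from the 4-thread bucket run above):

```shell
# The bench reports CPU seconds summed over threads; with four threads
# at ~100% each, wall-clock time per iteration is cpu/threads.
awk 'BEGIN { printf "%.4f s wall-clock per iteration\n", 5.35 / 4 }'
# prints 1.3375 s wall-clock per iteration
```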

E.

thome 2009-06-03 08:49

[quote=joral;175689]Ok. A little farther.

Now I have had the following:

Computing trsp(x)*M^100
..........Warning: Doing many iterations with bad code
Warning: Doing many iterations with bad code
Warning: Doing many iterations with bad code
Warning: Doing many iterations with bad code
[/quote]

Normal for the u64_secure program. It effectively does transposed multiplications, which are somewhat slower.

[quote]
Then a little later...

Failed check at iteration 100
/cado-nfs/linalg/bwc/u64_krylov: exited with status 1

Tried with a different seed, and it failed at iteration 1900.
[/quote]That's a problem. The fact that it doesn't even fail deterministically suggests that perhaps your RAM is to blame, but I wouldn't conclude that too soon.

Care to share your matrix?

[quote]
I know I had trouble with the msieve version of block lanczos if the matrix was too sparse, I believe it was. Is there a similar condition here which could cause it to fail?[/quote]If it's very sparse, and if I got padding coeffs wrong in some corner case, maybe, but I doubt it.

E.

thome 2009-06-03 10:52

[quote=fivemack;175686]
u64k says 'T0 : Check failed Aborted'

u64n also says this.
[/quote]

Now fixed. Thanks.

thome 2009-06-03 10:53

[quote=joral;175689]Computing trsp(x)*M^100
..........Warning: Doing many iterations with bad code
Warning: Doing many iterations with bad code
Warning: Doing many iterations with bad code
Warning: Doing many iterations with bad code
[/quote]

This warning no longer appears (yes, there's a new tarball).

E.

joral 2009-06-03 10:56

[QUOTE]The fact that it doesn't even fail deterministically suggests that perhaps your RAM is to blame, but I wouldn't conclude that too soon.[/QUOTE]

I'm going to run some more tests to be sure, but as I recall it is deterministic in this:

If I leave the seed parameter unchanged, it always fails at the same iteration.

It's about a 280 MB matrix file ungzipped, so I'll see what it compresses to and where I can put it.

fivemack 2009-06-03 11:12

The machine I'm doing the benchmarks on is a single-socket 2.66GHz Core i7 (256k 10-cycle L2 cache per core + 8192k 19-cycle L3 cache per four cores + 12G DDR3/8500); I am a little surprised that I don't have to give a load of cache parameters to bench: if it's running one thread blocking for the 256k cache rather than the 8192k one, then I could understand it being a bit slow.

Will try more sensible benchmarks (correct transpose parameters, trying 1x4 2x2 4x1 decompositions on four cores and 1x8 2x4 4x2 8x1 decompositions on eight-threads-on-four-cores) with new tarball tonight; I've left a make-matrix-from-relations job running today on a set of relations from a very large SNFS job, and will mention if that falls over in interesting ways. It's using an awful lot of memory (17G vsize, 10G rsize), but I have an awful lot of memory and a fast swap disc.

jasonp 2009-06-03 13:28

In case it becomes an issue: the latest GGNFS lattice sievers do not print all the factors of relations; they print each prime only once regardless of multiplicity, and skip printing factors smaller than 1000, so both of these have to be rediscovered by any relation-reading code.
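The rediscovery step can be sketched like this (the norm value and printed-prime list are made-up demo inputs, not GGNFS output): trial-divide below 1000 for the omitted small primes, then re-divide by each printed prime to restore multiplicities.

```shell
awk 'BEGIN {
    value = 2^3 * 5 * 1009^2      # demo norm: 40723240
    printed = "1009"              # deduplicated primes the siever printed
    out = ""
    for (p = 2; p < 1000; p++)    # small factors were never printed
        while (value % p == 0) { out = out " " p; value /= p }
    n = split(printed, pr, " ")   # restore multiplicity of printed primes
    for (i = 1; i <= n; i++)
        while (value % pr[i] == 0) { out = out " " pr[i]; value /= pr[i] }
    print (value == 1 ? "full factorization:" out : "bad relation")
}'
# prints: full factorization: 2 2 2 5 1009 1009
```

A leftover cofactor after both passes signals a bad or incomplete relation.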

fivemack 2009-06-03 17:13

Benchmark with -t fails entirely
 
I issue the command

[code]
nfsslave2@cow:/scratch/fib1039/with-cado$ /home/nfsslave2/cado/cado-nfs-20090603-r2189/build/cow/linalg/bwc/u128_bench -t --impl bucket snfs.small
[/code]

and it produces a lot of output at the 'large' level before failing with

[code]
Lsl 56 cols 3634827..3699734 w=778884, avg dj=7.2, max dj=34365, bucket hit=1/1834.7-> too sparse
Switching to huge slices. Lsl 56 to be redone
Flushing 56 large slices
Hsl 0 cols 3634827..5582056 (30*64908) ..............................
w=16383453, avg dj=0.3, max dj=29376, bucket block hit=1/10.2
u128_bench: /home/nfsslave2/cado/cado-nfs-20090603-r2189/linalg/bwc/matmul-bucket.cpp:610: void split_huge_slice_in_vblocks(builder*, huge_slice_t*, huge_slice_raw_t*, unsigned int): Assertion `(n+np)*2 == (size_t) (spc - sp0)' failed.
Aborted
[/code]

The enormous filtering run got terminated by something that kills SSH sessions that have produced no output for ages, will try that again.

thome 2009-06-03 22:17

[quote=fivemack;175756]I issue the command

[code]
nfsslave2@cow:/scratch/fib1039/with-cado$ /home/nfsslave2/cado/cado-nfs-20090603-r2189/build/cow/linalg/bwc/u128_bench -t --impl bucket snfs.small
[/code]and it produces a lot of output at the 'large' level
[/quote]

Normal behaviour (admittedly way too verbose).

[quote]before failing with

[code]
Lsl 56 cols 3634827..3699734 w=778884, avg dj=7.2, max dj=34365, bucket hit=1/1834.7-> too sparse
Switching to huge slices. Lsl 56 to be redone
Flushing 56 large slices
Hsl 0 cols 3634827..5582056 (30*64908) ..............................
w=16383453, avg dj=0.3, max dj=29376, bucket block hit=1/10.2
u128_bench: /home/nfsslave2/cado/cado-nfs-20090603-r2189/linalg/bwc/matmul-bucket.cpp:610: void split_huge_slice_in_vblocks(builder*, huge_slice_t*, huge_slice_raw_t*, unsigned int): Assertion `(n+np)*2 == (size_t) (spc - sp0)' failed.
Aborted
[/code][/quote]

If you could put your failing snfs.small file somewhere where I can grab it, it would be great.

[quote]The enormous filtering run got terminated by something that kills SSH sessions that have produced no output for ages, will try that again.[/quote]

You mean, the cado filtering programs got killed prematurely? That would tend to truncate the input to the bwc executables, but I doubt this is the cause, since the balancing program would have choked first.

Thanks for your patient investigations...

E.

fivemack 2009-06-03 22:55

[quote]
[quote]
The enormous filtering run got terminated by something that kills SSH sessions that have produced no output for ages, will try that again.
[/quote]
You mean, the cado filtering programs got killed prematurely? That would tend to truncate the input to the bwc executables, but I doubt this is the cause, since the balancing program would have choked first.
[/quote]

I wasn't using the script, just running

[code]
~/cado/cado-nfs-20090603-r2189/build/cow/merge/purge -poly snfs.poly -nrels "$( zcat snfs.nodup.gz | wc -l)" -out snfs.purged snfs.nodup.gz > purge.aus 2> purge.err &
[/code]

on a file with half a billion relations without using nohup, and the ssh connection from which I'd started it died.

I'm rerunning it, but the second pass is using 25G of vsize and the machine is swapping terribly, so I'm not expecting much progress.
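The launch that died can be made session-proof with nohup plus disown (bash); a minimal sketch, with `sleep 2` standing in for the long purge command line above:

```shell
# Hypothetical sketch (bash): detach a long-running job so it survives the
# SSH session dying. 'sleep 2' stands in for the purge command line above.
nohup sleep 2 > job.out 2> job.err &
pid=$!
disown "$pid"          # drop it from the job table: no SIGHUP on logout
kill -0 "$pid" && echo "still running after detach"
```

Running the whole thing inside screen or tmux achieves the same end and additionally lets one reattach to watch progress.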

fivemack 2009-06-03 23:00

[QUOTE=thome;175796]If you could put your failing snfs.small file somewhere where I can grab it, it would be great[/QUOTE]

Anonymous FTP to fivemack.dyndns.org and collect snfs.small.bz2 (710MB) and snfs.poly. My upload is quite slow, so it may take a little while. I don't know whether my FTP server supports resuming partial transfers; if it gets frustrating, tell me and I'll put the file somewhere more accessible.

fivemack 2009-06-03 23:14

Tiny command-line bug for transpose tool
 
Since 'balance' doesn't appear to have a --transpose command-line option

[code]
nfsslave2@cow:/scratch/fib1039/with-cado$ /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/balance --transpose --in snfs.small --out cabbage --nslices 1x4 --ramlimit 1G
Unknown option: snfs.small
Usage: ./bw-balance <options>
Typical options:
--in <path> input matrix filename
--out <path> output matrix filename
--nslices <n1>[x<n2>] optimize for <n1>x<n2> strips
--square pad matrix with zeroes to obtain square size
More advanced:
--remove-input remove the input file as soon as possible
--ram-limit <nnn>[kmgKMG] fix maximum memory usage
--keep-temps keep all temporary files
--subdir <d> chdir to <d> beforehand (mkdir if not found)
--legacy produce only one jumbo matrix file
[/code]

I ran

[code]
/home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/transpose --in snfs.small --out snfs.small.T
[/code]

I killed the job after two hours; it was stuck in the argument-parsing loop! Using only one minus sign before 'in' and 'out' made it work, though it's now too late to start the balancing and bench jobs this evening. More later.

fivemack 2009-06-03 23:57

what shape to use for decomposition?
 
I think 100 seconds is too short for statistically significant comparisons for matrices this big, but (with four threads at each size)

- 1x4 decomposition: 19 iterations in 104s, 5.47/1, 21.07 ns/coeff
- 2x2 decomposition: 19 iterations in 102s, 5.34/1, 20.59 ns/coeff
- 4x1 decomposition: 20 iterations in 104s, 5.18/1, 19.95 ns/coeff

20ns/coeff still feels a bit too long.


To my limited surprise, explicitly transposing the matrix and running

[code]
/home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench snfs.small.T -impl bucket
[/code]

gave exactly the same error as having u128_bench do the transposition.


However, if I run balance on the transposed matrix, then u128_bench works with a 4x1 decomposition; 2x2 fails with the same error message as before.

- 1x4, 2x2 decomposition: fails Assertion `(n+np)*2 == (size_t) (spc - sp0)'
- 4x1 decomposition: 21 iterations in 105s, 4.99/1, 19.22 ns/coeff


If I give too few parameters to a threaded call to u128_bench, it seems to read off the end of argv and into the environment:

[code]
nfsslave2@cow:/scratch/fib1039/with-cado$ taskset 0f /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench -impl bucket -nthreads 4 -- butterfly14T.h*
4 threads requested, but 1 files given on the command line.
Using implementation "bucket"
no cache file butterfly14T.h*-bucket.bin
T0 Building cache file for butterfly14T.h*
no cache file (null)-bucket.bin
T1 Building cache file for (null)
no cache file TERM=xterm-color-bucket.bin
T2 Building cache file for TERM=xterm-color
no cache file SHELL=/bin/bash-bucket.bin
fopen(butterfly14T.h*): No such file or directory
fopen((null)): Bad address
fopen(TERM=xterm-color): No such file or directory
[/code]

joral 2009-06-04 15:40

OK, apparently it is random. I've had it fail at iterations 100, 1000, 1900, and 19500 (out of 29300).

thome 2009-06-04 20:38

arg loop: fixed, thanks (this program is in fact unused -- it does not really belong to the set of distributed programs, yet it can be handy because it does a lot of work out of core).

running off the end of argv -- this has been fixed in one of the updated tarballs that I posted.

failing assert: the assert was wrong (sigh). It should have been (n+2*np)*2. A tiny example is the matrix which, once piped through ``uniq -c'', gives the following output (one must also set HUGE_MPLEX_MIN to zero in matmul-bucket.cpp):
[code]
1 5000 5000
4365 0
1 1 1353
634 0
[/code]

disappointing performance: I'm working on it.

Thanks,

E.

thome 2009-06-04 20:40

[quote=joral;175888]Ok. apparently it is random. I've had it fail at iteration 100, 1000, 1900, and 19500 (out of 29300).[/quote]

OK, perhaps you could give it a try on a different machine?

The good thing is that if you've got a DIMM at fault, then now you have a handy way to pinpoint the culprit ;-).

E.

frmky 2009-06-04 21:25

[QUOTE=thome;175728]This warning no longer appears (yes, there's a new tarball).[/QUOTE]

I tried compiling the new source in Linux x86_64 using pthreads, but it ends with the error

CMake Error in linalg/bwc/CMakeLists.txt:
Cannot find source file "matmul-sub-large-fbi.S".

Sure enough, this file is referenced in the CMakeLists.txt and a corresponding .h file is #include'd in matmul-bucket.cpp, but it's not in the directory.

joral 2009-06-05 00:30

I haven't been able to get it to build on my dual P3-700 yet, and I don't want to compare that to an Athlon64 X2 running at about 2GHz.

I may pull out my ubuntu cd later and run memtest against it to see what happens.

I do find it interesting that the examples run without incident.

joral 2009-06-05 12:37

OK... either I take my computer back to 1GB or I go buy a new memory stick. I ran memtest overnight, and it picked up about 808 bit errors right around the 1.5GB mark in 6 passes.

Good call, though bad for me...

henryzz 2009-06-05 17:21

[quote=joral;176069]Ok... Either I take my computer back to 1GB or I go buy a new memory stick. Ran memtest overnight, and it picked up about 808 bit errors right around the 1.5GB mark in 6 passes through.

Good call, though bad for me...[/quote]
worthwhile knowing though

joral 2009-06-05 17:40

Makes me wonder how long this has been going on.

Jason, maybe you can comment on this. Is there a good chance that a single-bit error of this type will lead to 'submatrix not invertible' errors when running the msieve block Lanczos code?

jasonp 2009-06-05 17:56

[QUOTE=joral;176108]
Jason, maybe you can comment on this. Is there a good chance that a single bit error of this type will lead to 'submatrix not invertible' errors trying to run the msieve block lanczos code?[/QUOTE]
Serge has run into bad memory causing these errors, but they only seem to appear for really big jobs. I don't know if a single bit getting flipped is enough to ruin the entire run (instead of only ruining one dependency), but my guess is that a big linear algebra run pushes the bus really hard and causes memory access with marginal timing to behave incorrectly. The more efficient the code the harder the bus gets pushed, so this says nice things about the level of optimization in the CADO code :)

joral 2009-06-05 18:24

Hrmm. I wonder if my motherboard is one that allows me to control RAM timing; I'll try stepping it down a touch.

It wasn't just a single bit, but the errors also weren't evenly spaced.

All of them were 'expected FBFFFFFFFFFFFFFF Got FFFFFFFFFFFFFFFF'

fivemack 2009-06-06 12:11

Wow, much better performance
 
I don't know what you did, but it's now twice as fast single-threaded, and faster multi-threaded: I'm now getting

[code]
/home/nfsslave2/cado/cado-nfs-20090605-r2202/build/cow/linalg/bwc/u128_bench -t -impl bucket snfs.small

40 iters in 102s, 2.55/1, 9.81 ns/c (last 10 : 2.48/1, 9.57 ns/c)
[/code]

and, running

[code]
for u in 1x2 2x1 1x4 2x2 4x1 1x8 2x4 4x2 8x1; do /home/nfsslave2/cado/cado-nfs-20090605-r2202/build/cow/linalg/balance --in snfs.small.T --out slice$u --nslices $u --ramlimit 8G; done

for u in 1x2 2x1; do taskset 03 /home/nfsslave2/cado/cado-nfs-20090605-r2202/build/cow/linalg/bwc/u128_bench --impl bucket -nthreads 2 -- slice$u.[hv]*; done

for u in 1x4 2x2 4x1; do taskset 0f /home/nfsslave2/cado/cado-nfs-20090605-r2202/build/cow/linalg/bwc/u128_bench --impl bucket -nthreads 4 -- slice$u.[hv]*; done

for u in 1x8 2x4 4x2 8x1; do taskset ff /home/nfsslave2/cado/cado-nfs-20090605-r2202/build/cow/linalg/bwc/u128_bench --impl bucket -nthreads 8 -- slice$u.[hv]* 2>&1 | tee $u.b; done

[/code]

the timings are (4 threads distributed over 4 physical cores for 4-thread totals, 8 threads over 4 physical cores for 8-thread totals)

[code]
1x2 38 iters in 103s, 2.70/1, 10.40 ns/c (last 10 : 2.63/1, 10.13 ns/c)
2x1 35 iters in 100s, 2.86/1, 11.03 ns/c (last 10 : 2.78/1, 10.72 ns/c)

1x4 23 iters in 102s, 4.45/1, 17.14 ns/c (last 10 : 4.26/1, 16.42 ns/c)
2x2 23 iters in 104s, 4.52/1, 17.41 ns/c (last 10 : 4.33/1, 16.68 ns/c)
4x1 21 iters in 104s, 4.96/1, 19.11 ns/c (last 10 : 4.73/1, 18.24 ns/c)

1x8 10 iters in 108s, 10.84/1, 41.77 ns/c (last 10 : 9.86/1, 37.98 ns/c)
2x4 10 iters in 105s, 10.47/1, 40.36 ns/c (last 10 : 9.52/1, 36.69 ns/c)
4x2 10 iters in 110s, 10.96/1, 42.23 ns/c (last 10 : 9.96/1, 38.39 ns/c)
8x1 8 iters in 102s, 12.69/1, 48.91 ns/c (last 10 : 50.32/1, 193.89 ns/c)
[/code]

joral 2009-06-06 20:52

So I am now constrained to 1GB of RAM until I can find a reasonable price for DDR DIMMs. But it now works. At least one of the 1GB DIMMs I had was unreliable.

joral 2009-06-06 22:25

Finally made it to the sqrt phase, but I'm getting this one now when running allsqrt.

Odd valuation! At rational prime 731261

I'm almost tempted to completely trash this run and restart it with known-good RAM. It's a 167-digit SNFS number from the near-repdigit lists. That should still fit in 1GB.

frmky 2009-06-08 03:36

Following Tom's lead, here are the benchmarks using Tom's matrix on the 8-CPU quad-core 2GHz Opteron Barcelona machine. There is another program running, so I only had 16 cores to work with.

[CODE]u64_bench:
16 iters in 106s, 6.61/1, 25.47 ns/c (last 10 : 6.24/1, 24.04 ns/c)

u128_bench:
7 iters in 101s, 14.47/1, 55.76 ns/c (last 10 : 54.58/1, 210.32 ns/c)

Using u64:

1x2: 10 iters in 106s, 10.63/1, 40.97 ns/c (last 10 : 9.65/1, 37.20 ns/c)
2x1: 10 iters in 101s, 10.12/1, 38.98 ns/c (last 10 : 9.21/1, 35.50 ns/c)

1x4: 6 iters in 101s, 16.89/1, 65.08 ns/c (last 10 : 61.01/1, 235.09 ns/c)
2x2: 7 iters in 112s, 16.02/1, 61.72 ns/c (last 10 : 60.34/1, 232.50 ns/c)
4x1: 7 iters in 104s, 14.90/1, 57.41 ns/c (last 10 : 57.48/1, 221.47 ns/c)

1x8: 7 iters in 103s, 14.78/1, 56.95 ns/c (last 10 : 60.20/1, 231.98 ns/c)
2x4: 6 iters in 106s, 17.61/1, 67.87 ns/c (last 10 : 58.28/1, 224.57 ns/c)
4x2: 6 iters in 105s, 17.44/1, 67.21 ns/c (last 10 : 59.67/1, 229.94 ns/c)
8x1: 6 iters in 106s, 17.69/1, 68.16 ns/c (last 10 : 74.09/1, 285.51 ns/c)

2x8: 3 iters in 111s, 36.84/1, 141.97 ns/c (last 10 : 80.81/1, 311.40 ns/c)
4x4: 3 iters in 103s, 34.24/1, 131.95 ns/c (last 10 : 75.74/1, 291.83 ns/c)
8x2: 4 iters in 115s, 28.82/1, 111.03 ns/c (last 10 : 85.86/1, 330.84 ns/c)


Using u128:

1x2: 4 iters in 107s, 26.70/1, 102.90 ns/c (last 10 : 67.94/1, 261.80 ns/c)
2x1: 4 iters in 113s, 28.20/1, 108.68 ns/c (last 10 : 67.73/1, 260.98 ns/c)

1x4: 3 iters in 110s, 36.54/1, 140.79 ns/c (last 10 : 78.34/1, 301.85 ns/c)
2x2: 3 iters in 116s, 38.65/1, 148.93 ns/c (last 10 : 76.51/1, 294.82 ns/c)
4x1: 4 iters in 113s, 28.29/1, 109.02 ns/c (last 10 : 75.04/1, 289.15 ns/c)

1x8: 3 iters in 125s, 41.63/1, 160.41 ns/c (last 10 : 88.00/1, 339.08 ns/c)
2x4: 3 iters in 118s, 39.45/1, 152.00 ns/c (last 10 : 77.86/1, 300.03 ns/c)
4x2: 3 iters in 118s, 39.18/1, 150.96 ns/c (last 10 : 77.40/1, 298.25 ns/c)
8x1: 3 iters in 109s, 36.48/1, 140.56 ns/c (last 10 : 84.80/1, 326.74 ns/c)

2x8: 2 iters in 123s, 61.31/1, 236.26 ns/c (last 10 : 111.50/1, 429.66 ns/c)
4x4: 2 iters in 118s, 59.12/1, 227.79 ns/c (last 10 : 99.00/1, 381.47 ns/c)
8x2: 2 iters in 118s, 58.88/1, 226.90 ns/c (last 10 : 102.75/1, 395.91 ns/c)

[/CODE]

As you can see, the times that I am seeing are significantly slower than those on Tom's i7. Also, I'm getting decent scaling up to 8 threads. The jump from 8 to 16 threads, however, doesn't scale as well.

In other news, before today I hadn't been able to get a successful factorization using relations from the GGNFS sievers. Today, I rewrote verify.c to output all factors of the norms, including those below 1000 and with full multiplicity, and even sorted them for good measure. This worked! I'm now going to see what I can remove (starting with the sorting) and still get a valid factorization.

frmky 2009-06-08 06:48

Encouraged by the successful run today, I thought I might try it on the cluster using MPI. But, no. This is an example that ran fine with MPI off.

[CODE]
#############################################################################
../bin/linalg/bwc/./lingen nullspace=left wdir=bwc mm_impl=bucket thr=1x1 interval=100 mpi=4x4 seed=1 mn=64 interleaving=0 --lingen-threshold 64
# (exported) ../bin/linalg/bwc/./lingen nullspace=left wdir=bwc mm_impl=bucket thr=1x1 interval=100 mpi=4x4 seed=1 mn=64 interleaving=0 --lingen-threshold 64
# Compiled with gcc 4.3.2
# Compilation flags -O3 -funroll-loops -DNDEBUG -std=c99 -g -W -Wall
Reading scalar data in polynomial ``a''
Using A(X) div X in order to consider Y as starting point
../bin/linalg/bwc/./lingen: died with signal 11, without coredump
[/CODE]

thome 2009-06-08 21:57

[quote=frmky;176489]Encouraged by the successful run today, I thought I might try it on the cluster using MPI. But, no. This is an example that ran fine with MPI off.

[code]
[...]
../bin/linalg/bwc/./lingen: died with signal 11, without coredump
[/code][/quote]

Strange.

Could you please try to recompile with -O0 -g and provide a backtrace? To do so, you need to create/modify a file named local.sh at the root of the cado tree, to contain:
CFLAGS="-O0 -g"
CXXFLAGS="-O0 -g"

(the -O0 is there because some MPI versions have a tendency to add -O2 no matter what).

Then do ``make cmake'', then ``make -j8''

Then just gdb --args <complete lingen command line>, and wait until it catches the signal. Type ``bt'' at the gdb prompt.
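The steps above can be sketched as a small script, run from the root of the cado tree (the make and gdb lines are left as comments, since the lingen command line itself is deliberately elided):

```shell
# Sketch of the debug rebuild described above, from the cado tree root.
cat > local.sh <<'EOF'
CFLAGS="-O0 -g"
CXXFLAGS="-O0 -g"
EOF
grep -c '=' local.sh        # both flag lines are in place
# make cmake && make -j8
# gdb --args <complete lingen command line>   # then type "bt" at the prompt
```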

I'm also interested by the ls -l output of your directory. You can send it by e-mail if you prefer.

E.

frmky 2009-06-10 02:54

Perhaps somewhat annoyingly, it got further when compiled with "-O0 -g":

[CODE]../bin/linalg/bwc/./lingen nullspace=left wdir=bwc mm_impl=bucket thr=1x1 interval=100 mpi=4x4 seed=1 mn=64 interleaving=0 --lingen-threshold 64
# (exported) ../bin/linalg/bwc/./lingen nullspace=left wdir=bwc mm_impl=bucket thr=1x1 interval=100 mpi=4x4 seed=1 mn=64 interleaving=0 --lingen-threshold 64
# Compiled with gcc 4.3.2
# Compilation flags -g -O0 -std=c99 -g -W -Wall
Reading scalar data in polynomial ``a''
Using A(X) div X in order to consider Y as starting point
Computing t0
[X^0] A, col 63 increases rank to 64 (head row 62)
Found satisfying init data for t0=1
written F_INIT_QUICK to disk
t0 = 1
Computing value of E(X)=A(X)F(X) (degree 7098) [ +O(X^7099) ]
Throwing out a(X)
E: 7098 coeffs, t=1
57 [7]: 5.8/748.6 [7+]: 5.8/748.6 (1%)
112 [7]: 11.5/737.6 [7+]: 11.5/737.6 (2%)
112 [6,6+]: 0.7/42.2,779.8 [6+]: 12.2/779.8 (2%)
168 [7]: 17.4/740.7 [6+]: 18.0/783.0 (2%)
223 [7]: 23.0/737.1 [6+]: 23.7/779.3 (3%)
223 [6,6+]: 1.3/42.2,779.3 [6+]: 24.4/779.3 (3%)
223 [5,5+]: 1.4/43.7,823.0 [5+]: 25.7/823.0 (3%)
279 [7]: 28.9/739.5 [5+]: 31.6/825.4 (4%)

... editing out a bunch of lines ...

6323 [7]: 656.4/737.0 [1+]: 905.7/1065.0 (85%)
6323 [6,6+]: 37.8/42.5,779.5 [1+]: 906.4/1065.0 (85%)
6379 [7]: 662.2/737.1 [1+]: 912.2/1065.1 (86%)
6434 [7]: 667.9/737.0 [1+]: 917.9/1065.0 (86%)
6434 [6,6+]: 38.5/42.5,779.5 [1+]: 918.5/1065.0 (86%)
6434 [5,5+]: 39.2/43.2,822.7 [1+]: 919.9/1065.0 (86%)
6490 [7]: 673.7/737.0 [1+]: 925.7/1065.0 (87%)
6545 [7]: 679.4/736.9 [1+]: 931.4/1064.9 (87%)
6545 [6,6+]: 39.2/42.5,779.4 [1+]: 932.0/1064.9 (88%)
6601 [7]: 685.2/737.0 [1+]: 937.9/1065.0 (88%)
6656 [7]: 690.9/736.9 [1+]: 943.5/1064.9 (89%)
6656 [6,6+]: 39.8/42.5,779.4 [1+]: 944.2/1064.9 (89%)
6656 [5,5+]: 40.5/43.2,822.6 [1+]: 945.5/1064.9 (89%)
6656 [4,4+]: 29.7/31.7,854.3 [1+]: 947.5/1064.9 (89%)
6712 [7]: 696.7/737.0 [1+]: 953.4/1065.0 (90%)
6767 [7]: 702.4/736.9 [1+]: 959.0/1064.9 (90%)
6767 [6,6+]: 40.5/42.5,779.4 [1+]: 959.7/1064.9 (90%)
6823 [7]: 708.2/737.0 [1+]: 965.5/1065.0 (91%)
6878 [7]: 713.9/736.9 [1+]: 971.2/1064.9 (91%)
6878 [6,6+]: 41.2/42.5,779.4 [1+]: 971.9/1064.9 (91%)
6878 [5,5+]: 41.9/43.2,822.6 [1+]: 973.2/1064.9 (91%)
6934 [7]: 719.7/737.0 [1+]: 979.0/1065.0 (92%)
6985 6cols=0: [0..5]
6986 63cols=0: [0..62] [0..5]*2
6987 64cols=0: [0..63] [0..62]*2 [0..5]*3
6988 64cols=0: [0..63]*2 [0..62]*3 [0..5]*4
6989 [7]: 725.4/736.9 [1+]: 984.7/1064.8 (92%)
6989 [6,6+]: 41.8/42.5,779.3 [1+]: 985.3/1064.8 (93%)
6989 64cols=0: [0..63]*3 [0..62]*4 [0..5]*5
6990 64cols=0: [0..63]*4 [0..62]*5 [0..5]*6
6991 64cols=0: [0..63]*5 [0..62]*6 [0..5]*7
6992 64cols=0: [0..63]*6 [0..62]*7 [0..5]*8
6993 64cols=0: [0..63]*7 [0..62]*8 [0..5]*9
6994 64cols=0: [0..63]*8 [0..62]*9 [0..5]*10
6995 64cols=0: [0..63]*9 [0..62]*10 [0..5]*11
6996 64cols=0: [0..63]*10 [0..62]*11 [0..5]*12
6997 64cols=0: [0..63]*11 [0..62]*12 [0..5]*13
6998 64cols=0: [0..63]*12 [0..62]*13 [0..5]*14
6999 64cols=0: [0..63]*13 [0..62]*14 [0..5]*15
7000 64cols=0: [0..63]*14 [0..62]*15 [0..5]*16
7001 64cols=0: [0..63]*15 [0..62]*16 [0..5]*17
7002 64cols=0: [0..63]*16 [0..62]*17 [0..5]*18
7003 64cols=0: [0..63]*17 [0..62]*18 [0..5]*19
7004 64cols=0: [0..63]*18 [0..62]*19 [0..5]*20
7005 64cols=0: [0..63]*19 [0..62]*20 [0..5]*21
7006 64cols=0: [0..63]*20 [0..62]*21 [0..5]*22
7007 64cols=0: [0..63]*21 [0..62]*22 [0..5]*23
7008 64cols=0: [0..63]*22 [0..62]*23 [0..5]*24
7009 64cols=0: [0..63]*23 [0..62]*24 [0..5]*25
7010 64cols=0: [0..63]*24 [0..62]*25 [0..5]*26
7011 64cols=0: [0..63]*25 [0..62]*26 [0..5]*27
7012 64cols=0: [0..63]*26 [0..62]*27 [0..5]*28
7013 64cols=0: [0..63]*27 [0..62]*28 [0..5]*29
7014 64cols=0: [0..63]*28 [0..62]*29 [0..5]*30
7015 64cols=0: [0..63]*29 [0..62]*30 [0..5]*31
7016 64cols=0: [0..63]*30 [0..62]*31 [0..5]*32
7017 64cols=0: [0..63]*31 [0..62]*32 [0..5]*33
7018 64cols=0: [0..63]*32 [0..62]*33 [0..5]*34
7019 64cols=0: [0..63]*33 [0..62]*34 [0..5]*35
7020 64cols=0: [0..63]*34 [0..62]*35 [0..5]*36
7021 64cols=0: [0..63]*35 [0..62]*36 [0..5]*37
7022 64cols=0: [0..63]*36 [0..62]*37 [0..5]*38
7023 64cols=0: [0..63]*37 [0..62]*38 [0..5]*39
7024 64cols=0: [0..63]*38 [0..62]*39 [0..5]*40
7025 64cols=0: [0..63]*39 [0..62]*40 [0..5]*41
../bin/linalg/bwc/./lingen: died with signal 11, without coredump[/CODE]

I'm running it with gdb now...

frmky 2009-06-13 06:16

After the successful small GNFS run, I tried a larger SNFS run with the same binaries, but no luck:

[CODE]Multiply ker and character matrix
64 rows done
Computing tiny kernel
dim of ker = 63
Sorry 0-th vector is 0
Sorry 1-th vector is 0
Sorry 2-th vector is 0
Sorry 3-th vector is 0
Sorry 4-th vector is 0
Sorry 5-th vector is 0
Sorry 6-th vector is 0
Sorry 7-th vector is 0
Sorry 8-th vector is 0
Sorry 9-th vector is 0
Sorry 10-th vector is 0
Sorry 11-th vector is 0
Sorry 12-th vector is 0
Sorry 13-th vector is 0
Sorry 14-th vector is 0
Sorry 15-th vector is 0
Sorry 16-th vector is 0
Sorry 17-th vector is 0
Sorry 18-th vector is 0
Sorry 19-th vector is 0
Sorry 20-th vector is 0
Sorry 21-th vector is 0
Sorry 22-th vector is 0
Sorry 23-th vector is 0
Sorry 24-th vector is 0
Sorry 25-th vector is 0
Sorry 26-th vector is 0
Sorry 27-th vector is 0
Sorry 28-th vector is 0
Sorry 29-th vector is 0
Sorry 30-th vector is 0
Sorry 31-th vector is 0
Sorry 32-th vector is 0
Sorry 33-th vector is 0
Sorry 34-th vector is 0
Sorry 35-th vector is 0
Sorry 36-th vector is 0
Sorry 37-th vector is 0
Sorry 38-th vector is 0
Sorry 39-th vector is 0
Sorry 40-th vector is 0
Sorry 41-th vector is 0
Sorry 42-th vector is 0
Sorry 43-th vector is 0
Sorry 44-th vector is 0
Sorry 45-th vector is 0
Sorry 46-th vector is 0
Sorry 47-th vector is 0
Sorry 48-th vector is 0
Sorry 49-th vector is 0
Sorry 50-th vector is 0
Sorry 51-th vector is 0
Sorry 52-th vector is 0
Sorry 53-th vector is 0
Sorry 54-th vector is 0
Sorry 55-th vector is 0
Sorry 56-th vector is 0
Sorry 57-th vector is 0
Sorry 58-th vector is 0
Sorry 59-th vector is 0
Sorry 60-th vector is 0
Sorry 61-th vector is 0
Sorry 62-th vector is 0
[/CODE]

alex_148 2009-07-11 22:33

Hello! I have some errors while compiling under Cygwin; it looks like this:

cantor/mpfq_2_128.h.32:791: error: invalid operands of types 'long long int _vector_' and 'long long int _vector_' to binary 'operator^'.

jasonp 2009-07-27 19:15

Congratulations to the CADO group for making [url="http://gforge.inria.fr/plugins/scmsvn/viewcvs.php/trunk/?root=cado-nfs"]their repository[/url] available!

henryzz 2010-07-17 11:15

[code]:~/Desktop/cado-nfs/cado-nfs/trunk$ ./run_example.sh
Testing factorization as given by ./params/params.c59 in /tmp/cado.xaXZdQgYzS
./cadofactor.pl cadodir=/home/david/Desktop/cado-nfs/cado-nfs/trunk/build/Jimmy-Ubuntu /tmp/cado.xaXZdQgYzS/param machines=/tmp/cado.xaXZdQgYzS/mach_desc wdir=/tmp/cado.xaXZdQgYzS delay=60 sievenice=0 selectnice=0 logfile=/tmp/cado.xaXZdQgYzS/out
Info:--------------------------------------------------------------------------
Info:Initialization
Info:--------------------------------------------------------------------------
Info:Reading the parameters...
Info:Reading the machine description file...
Info:Initializing the working directory...
Info:--------------------------------------------------------------------------
Info:Polynomial selection
Info:--------------------------------------------------------------------------
Info:No job status file found. Creating empty one.
Info:Starting new jobs...
Info:Sending `c59.n' to `localhost'...
Info:Starting job: c59 localhost 1 100001
Info:Starting job: c59 localhost 100001 2e5
Info:Total interval coverage: 0 %.
Info:Waiting for 60 seconds before checking again...
Info:Checking all running jobs...
Info:Checking job: c59 localhost 1 100001
Info:Running...
Info:Checking job: c59 localhost 100001 2e5
Info:Running...
Info:Starting new jobs...
Info:Total interval coverage: 0 %.
Info:Waiting for 60 seconds before checking again...
Info:Checking all running jobs...
Info:Checking job: c59 localhost 1 100001
Info:Finished!
Info:Checking job: c59 localhost 100001 2e5
Info:Finished!
Info:Retrieving job data...
Info:Retrieving `c59.kjout.1-100001' from `localhost'...
Info:Retrieving `c59.kjout.100001-2e5' from `localhost'...
Info:Starting new jobs...
Info:Total interval coverage: 100 %.
Info:Cleaning up...
Info:All done!
Info:The best polynomial is from `c59.kjout.1-100001' (E = 17.32).
Info:Generating factor base...
Info:Computing free relations...
Info:--------------------------------------------------------------------------
Info:Sieve
Info:--------------------------------------------------------------------------
Info:Checking previous files...
Info:Imported 750 relations from `c59.freerels.gz'.
Info:No job status file found. Creating empty one.
Info:Starting new jobs...
Info:Sending `c59.poly' to `localhost'...
Info:Sending `c59.roots' to `localhost'...
Info:Starting job: c59 localhost 400000 405000
Info:Running total: 750 relations.
Info:Waiting for 60 seconds before checking again...
Info:Checking all running jobs...
Info:Checking job: c59 localhost 400000 405000
Info:Running...
Info:Starting new jobs...
Info:Running total: 750 relations.
Info:Waiting for 60 seconds before checking again...
Info:Checking all running jobs...
Info:Checking job: c59 localhost 400000 405000
Info:Finished!
Info:Retrieving job data...
Info:Retrieving `c59.rels.400000-405000.gz' from `localhost'...
Info:Imported 287703 relations from `c59.rels.400000-405000.gz'.
Info:Starting new jobs...
Info:Starting job: c59 localhost 405000 410000
Info:Running total: 288453 relations.
Info:--------------------------------------------------------------------------
Info:Duplicate and singleton removal
Info:--------------------------------------------------------------------------
Info:Removing duplicates...
Info:split new files in 4 slices...
Info:removing duplicates on slice 0...
Info:removing duplicates on slice 1...
Info:removing duplicates on slice 2...
Info:removing duplicates on slice 3...
Info:Number of relations left: 288207.
Info:Removing singletons...
Info:Not enough relations! Continuing sieving...
Info:Waiting for 60 seconds before checking again...
Info:Checking all running jobs...
Info:Checking job: c59 localhost 405000 410000
Info:Finished!
Info:Retrieving job data...
Info:Retrieving `c59.rels.405000-410000.gz' from `localhost'...
Info:Imported 283213 relations from `c59.rels.405000-410000.gz'.
Info:Starting new jobs...
Info:Starting job: c59 localhost 410000 415000
Info:Running total: 571666 relations.
Info:--------------------------------------------------------------------------
Info:Duplicate and singleton removal
Info:--------------------------------------------------------------------------
Info:Removing duplicates...
Info:split new files in 4 slices...
Info:removing duplicates on slice 0...
Info:removing duplicates on slice 1...
Info:removing duplicates on slice 2...
Info:removing duplicates on slice 3...
Info:Number of relations left: 570679.
Info:Removing singletons...
Info:Nrows: 28740; Ncols: 28580; Excess: 160.
Info:Join all no duplicates files into one file...
Info:clean directory nodup...
Info:Cleaning up...
Info:Killing job: c59 localhost 410000 415000
Info:Truncating `c59.rels.410000-415000' to range 410000-412860...
Info:Imported 176451 relations from `c59.rels.410000-412860.gz'.
Info:All done!
Info:--------------------------------------------------------------------------
Info:Merge
Info:--------------------------------------------------------------------------
Info:Merging relations...
Info:Minimal bwcost: 6262176018.
Info:Replaying merge history...
Info:Nrows: 9875; Ncols: 9715; Weight: 969672.
Info:--------------------------------------------------------------------------
Info:Linear algebra
Info:--------------------------------------------------------------------------
Warning:Parameter 'skip' currently unhandled by bwc code
Info:Calling Block-Wiedemann (new code)...
Error:Command `/home/david/Desktop/cado-nfs/cado-nfs/trunk/build/Jimmy-Ubuntu/linalg/bwc/bwc.pl :complete seed=1 thr=2x2 mpi=1x1 matrix=/tmp/cado.xaXZdQgYzS/c59.small nullspace=left mm_impl=bucket interleaving=0 interval=100 mode=u64 mn=64 splits=0,64 ys=0..64 wdir=/tmp/cado.xaXZdQgYzS/c59.bwc bwc_bindir=/home/david/Desktop/cado-nfs/cado-nfs/trunk/build/Jimmy-Ubuntu/linalg/bwc >> /tmp/cado.xaXZdQgYzS/c59.bwc.stderr 2>&1' terminated unexpectedly with exit status 1.
FAILED ; data left in /tmp/cado.xaXZdQgYzS
[/code]Anyone know how to fix this error?

BWetter246 2012-06-03 21:13

When I try to compile cado-nfs from the repository, I get an error at ./utils/cachesize_cpuid.c:47:3: inconsistent operand constraints in an 'asm'. Does anyone know what this error means or how to fix it? I am using gcc 4.6.3 (Ubuntu/Linaro 4.6.3).

BWetter246 2012-06-03 23:17

OK, so I pretty much changed the cpuid function in cachesize_cpuid.c and it compiled.

[code]
#include <cpuid.h>

void cpuid(uint32_t res[4], uint32_t op) {
    __get_cpuid(op, &res[0], &res[1], &res[2], &res[3]);
}
[/code]

jasonp 2012-06-04 02:04

Looking at the original, it appears that eax is being referenced by name as both an input and an output to the asm, which older versions of gcc's inline asm syntax allowed but current versions do not. I'll forward the report to Paul's group; it's strange they haven't seen it already.

BWetter246 2012-06-04 16:17

How about this?

[code]
void cpuid(uint32_t res[4], uint32_t op) {
#ifdef __GNUC__
    __asm__ volatile( "pushl %%ebx    \n\t"
                      "cpuid          \n\t"
                      "movl %%ebx, %1 \n\t"
                      "popl %%ebx     \n\t"
                      : "=a"(res[0]), "=r"(res[1]), "=c"(res[2]), "=d"(res[3])
                      : "a"(op)
                      : "cc" );
#else
#error "Please teach your compiler how to call cpuid"
#endif
}
[/code]

jasonp 2012-06-05 11:09

Paul writes:
[quote]
since not all CADO-NFS developers are on mersenneforum, in general it is better to advise people to report problems on [url="https://gforge.inria.fr/tracker/?atid=7442&group_id=2065&func=browse"]the CADO-NFS bug tracker[/url]
[/quote]
Regarding your CPUID substitute, the problem I see is that you have a "=a" in the output list and an "a" in the input list. While the syntax does let you reuse a specific register, I didn't think mentioning it by name twice was allowed. Msieve does something similar:
[code]
#define CPUID(code, a, b, c, d)              \
    ASM_G volatile(                          \
        "movl %%ebx, %%esi \n\t"             \
        "cpuid             \n\t"             \
        "movl %%ebx, %1    \n\t"             \
        "movl %%esi, %%ebx \n\t"             \
        : "=a"(a), "=m"(b), "=c"(c), "=d"(d) \
        : "0"(code) : "%esi")
[/code]

bai 2012-06-05 14:10

[QUOTE=BWetter246;301169]When i try to compile cado-nfs from the repository, i get an error at ./utils/cachesize_cpuid.c:47:3. Inconsistent operand constraints in an 'asm'. Does anyone know what this error means or how to fix this? I am using gcc 4.6.3 (Ubuntu/Linaro 4.6.3).[/QUOTE]

BWetter & jason, thanks for pointing out the bug and the fix. As in jason's post, does it work if we change the input line from
[QUOTE]: "a" (op)[/QUOTE]
to
[QUOTE]: "0" (op)[/QUOTE]

Btw, I just tested it (without the above change) on an Ubuntu 12.04 machine (Ubuntu/Linaro 4.6.3) and it compiles fine. What flags do you use?

BWetter246 2012-06-05 20:43

I used make in bash and it generated -g -W -Wall -O2.

bai 2012-06-06 01:39

[URL="http://www.mersenneforum.org/member.php?u=1959"]BWetter246[/URL], I can't re-trigger the error.

Here is the configuration (output by make),

[CODE]Configuring gf2x with options --disable-shared --disable-dependency-tracking CFLAGS=-std=c99 -g -W -Wall -O2 -I/usr/include CXXFLAGS=-g -W -Wall -O2 -I/usr/include
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking target system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking how to print strings... printf
checking for style of include used by make... GNU
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... none
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 3458764513820540925
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... mt
checking if mt is a manifest tool... no
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... no
checking whether to build static libraries... yes
checking for gcc... (cached) gcc
checking whether we are using the GNU C compiler... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking for gcc option to accept ISO C89... (cached) none needed
checking dependency style of gcc... (cached) none
checking for g++... no
checking for c++... no
checking for gpp... no
checking for aCC... no
checking for CC... no
checking for cxx... no
checking for cc++... no
checking for cl.exe... no
checking for FCC... no
checking for KCC... no
checking for RCC... no
checking for xlC_r... no
checking for xlC... no
checking whether we are using the GNU C++ compiler... no
checking whether g++ accepts -g... no
checking dependency style of g++... none
checking warning verbosity option... for C++ -Wall -W for C
checking for gcc option to accept ISO C99... none needed
checking build system compiler gcc... yes
checking for build system executable suffix...
checking whether gcc and cc understand -c and -o together... yes
checking size of unsigned long... 8
checking whether gcc can compile sse-2 code... yes
checking whether gcc can compile pclmulqdq and if it is supported by the hardware... no
checking the number of bits in an unsigned long... 64
configure: using ABI="default"
configure: CC="gcc"
configure: CFLAGS="-Wall -W -std=c99 -g -W -Wall -O2 -I/usr/include"
configure: CPPFLAGS=""
configure: hwdir="x86_64"
checking whether already_tuned/x86_64/ is right assuming 64-bits unsigned longs... yes
configure: creating ./config.status
[/CODE]and gcc -v

[CODE]gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.3-1ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
[/CODE]

Are there any obvious differences from yours?

BWetter246 2012-06-11 06:07

I managed to compile the software package, but it seems to hang when I try to use the example in the readme. './factor.sh 90377629292003121684002147101760858109247336549001090677693'

[code]
Info:--------------------------------------------------------------------------
Info:Initialization
Info:--------------------------------------------------------------------------
Info:Reading the parameters...
Info:Initializing the working directory...
Info:--------------------------------------------------------------------------
Info:Polynomial selection
Info:--------------------------------------------------------------------------
Info:Total interval coverage: 0 %.
Info:Starting job: 0 5000
[/code]

BWetter246 2012-06-11 06:09

1 Attachment(s)
gcc version
[code]
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/4.6/lto-wrapper
Target: i686-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.3-1ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --enable-targets=all --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu --target=i686-linux-gnu
Thread model: posix
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
[/code]

jasonp 2012-06-11 11:46

CADO-NFS has several mailing lists; the developers all read them, so you would get help more quickly if you used them: [url="https://gforge.inria.fr/mail/?group_id=2065"]link[/url]. The project pages also have a bug tracker you can use.

bai 2012-06-13 00:21

[QUOTE=BWetter246;301997]I managed to compile the software package, but it seems to hang when I try to use the example in the readme. './factor.sh 90377629292003121684002147101760858109247336549001090677693'
[/QUOTE]

Could you please try a revision at or after 0753f50 in git?

Btw, just curious: what's the CPU? (considering "i686-pc-linux-gnu" and "size of unsigned long... 4")

BWetter246 2012-06-13 03:47

The processor is an Intel P4. I believe the script hangs because the number is too small. When I try the RSA100 number, polyselect2l crashes.

[code]
Info:--------------------------------------------------------------------------
Info:Initialization
Info:--------------------------------------------------------------------------
Info:Reading the parameters...
Info:Initializing the working directory...
Info:--------------------------------------------------------------------------
Info:Polynomial selection
Info:--------------------------------------------------------------------------
Info:Total interval coverage: 0 %.
Info:Starting job: 0 10000
Error:Command `env nice -0 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/polyselect2l -q -lq 3 -nq 1000 -incr 210 -admin 0 -admax 10000 -degree 4 -maxnorm 35.0 -t 1 50000 < /tmp/cado.X2RbEedTH0/c100.n > /tmp/cado.X2RbEedTH0/c100.polsel_out.0-10000 2>&1' terminated unexpectedly with exit status 134.
FAILED ; data left in /tmp/cado.X2RbEedTH0
[/code]

c100.polsel.out
[code]
# /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/polyselect2l -q -lq 3 -nq 1000 -incr 210 -admin 0 -admax 10000 -degree 4 -maxnorm 35.0 -t 1 50000
*** glibc detected *** /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/polyselect2l: free(): invalid pointer: 0xb6e090a0 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x73e42)[0x233e42]
/usr/lib/i386-linux-gnu/libgmp.so.10(__gmp_default_free+0x1b)[0x894d9b]
/usr/lib/i386-linux-gnu/libgmp.so.10(__gmpz_clear+0x2a)[0x89f70a]
/home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/polyselect2l(comp_sq_roots+0x1ae)[0x805002e]
/home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/polyselect2l[0x804d2ff]
/home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/polyselect2l(one_thread+0x40a)[0x804daba]
/lib/i386-linux-gnu/libpthread.so.0(+0x6d4c)[0x631d4c]
/lib/i386-linux-gnu/libc.so.6(clone+0x5e)[0x2aaace]
======= Memory map: ========
00110000-0012c000 r-xp 00000000 08:01 916660 /lib/i386-linux-gnu/libgcc_s.so.1
0012c000-0012d000 r--p 0001b000 08:01 916660 /lib/i386-linux-gnu/libgcc_s.so.1
0012d000-0012e000 rw-p 0001c000 08:01 916660 /lib/i386-linux-gnu/libgcc_s.so.1
001bf000-001c0000 r-xp 00000000 00:00 0 [vdso]
001c0000-0035f000 r-xp 00000000 08:01 916639 /lib/i386-linux-gnu/libc-2.15.so
0035f000-00361000 r--p 0019f000 08:01 916639 /lib/i386-linux-gnu/libc-2.15.so
00361000-00362000 rw-p 001a1000 08:01 916639 /lib/i386-linux-gnu/libc-2.15.so
00362000-00365000 rw-p 00000000 00:00 0
0062b000-00642000 r-xp 00000000 08:01 916719 /lib/i386-linux-gnu/libpthread-2.15.so
00642000-00643000 r--p 00016000 08:01 916719 /lib/i386-linux-gnu/libpthread-2.15.so
00643000-00644000 rw-p 00017000 08:01 916719 /lib/i386-linux-gnu/libpthread-2.15.so
00644000-00646000 rw-p 00000000 00:00 0
0082f000-00859000 r-xp 00000000 08:01 916671 /lib/i386-linux-gnu/libm-2.15.so
00859000-0085a000 r--p 00029000 08:01 916671 /lib/i386-linux-gnu/libm-2.15.so
0085a000-0085b000 rw-p 0002a000 08:01 916671 /lib/i386-linux-gnu/libm-2.15.so
0088c000-00903000 r-xp 00000000 08:01 2097835 /usr/lib/i386-linux-gnu/libgmp.so.10.0.2
00903000-00904000 r--p 00076000 08:01 2097835 /usr/lib/i386-linux-gnu/libgmp.so.10.0.2
00904000-0090b000 rw-p 00077000 08:01 2097835 /usr/lib/i386-linux-gnu/libgmp.so.10.0.2
00934000-00942000 r-xp 00000000 08:01 1837184 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/libpolyselect_common.so
00942000-00943000 r--p 0000d000 08:01 1837184 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/libpolyselect_common.so
00943000-00944000 rw-p 0000e000 08:01 1837184 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/libpolyselect_common.so
00d96000-00db6000 r-xp 00000000 08:01 916619 /lib/i386-linux-gnu/ld-2.15.so
00db6000-00db7000 r--p 0001f000 08:01 916619 /lib/i386-linux-gnu/ld-2.15.so
00db7000-00db8000 rw-p 00020000 08:01 916619 /lib/i386-linux-gnu/ld-2.15.so
00ecf000-00efb000 r-xp 00000000 08:01 1837119 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/utils/libutils.so
00efb000-00efc000 r--p 0002b000 08:01 1837119 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/utils/libutils.so
00efc000-00efd000 rw-p 0002c000 08:01 1837119 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/utils/libutils.so
08048000-08064000 r-xp 00000000 08:01 1837232 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/polyselect2l
08064000-08065000 r--p 0001c000 08:01 1837232 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/polyselect2l
08065000-08066000 rw-p 0001d000 08:01 1837232 /home/bwetter246/Documents/cado-nfs/build/bwetter246-Linux/polyselect/polyselect2l
084ca000-084eb000 rw-p 00000000 00:00 0 [heap]
b6e00000-b6e28000 rw-p 00000000 00:00 0
b6e28000-b6f00000 ---p 00000000 00:00 0
b6f1b000-b6f1c000 ---p 00000000 00:00 0
b6f1c000-b771e000 rw-p 00000000 00:00 0
b772b000-b7731000 rw-p 00000000 00:00 0
bfdd5000-bfdf6000 rw-p 00000000 00:00 0 [stack]
# Info: initializing 4459 P primes took 0ms, seed=1339558903, rawonly=0, nq=1000, target_time=10000
# Info: estimated peak memory=0.00MB (1 thread(s), batch 10 inversions on SQ)
Aborted (core dumped)
[/code]

bai 2012-06-18 07:03

[QUOTE=BWetter246;302147]the processor is an Intel P4. The reason why the script is hanging is because I believe the number is too small. When I try the RSA100 number, polyselect2l crashes.
[/QUOTE]

I think it is not the number size, but 32-bit portability issues in polyselect/*. I've just updated the code (git rev. dcabf90). It works on a 32-bit x86 Atom running gentoo-3.3.0. Hopefully it also works on a P4.

BWetter246 2012-06-18 17:59

Thanks bai, the new repository revision works.

skan 2012-12-08 14:05

Hello

Did anybody manage to compile it for Windows?

Dubslow 2016-03-16 05:45

Seems that CADO has switched from Perl to Python without anyone here noticing.

xilman 2016-03-16 07:24

[QUOTE=Dubslow;429296]Seems that CADO has switched from Perl to Python without anyone here noticing.[/QUOTE]I noticed long since. In fact I had a discussion with the dev team about the switch.

Silence may be golden but it's not necessarily an indication of ignorance.

Dubslow 2016-03-16 07:33

[QUOTE=xilman;429305]I noticed long since. In fact I had a discussion with the dev team about the switch.

Silence may be golden but it's not necessarily an indication of ignorance.[/QUOTE]

May be :smile:

What was the discussion about? A link, perhaps?

xilman 2016-03-16 14:25

[QUOTE=Dubslow;429306]May be :smile:

What was the discussion about? A link, perhaps?[/QUOTE]I had to dig through the mail archives. There seem to have been several discussions.

One from November 2013 concerned putting system-constant headers (machine whitelists, thread numbers, etc.) into a system.py and the factorization-dependent material into a separate file, both to be included by ecmfactor.py.

Another is best summarized by
[quote]
On Sat, 2013-11-23 at 01:13 +0100, Alexander Kruppa wrote:

>
> The parameter files in the param/ directory are for the old Perl script, which we kept around as a legacy in 2.0. The parameter files for the Python script are in params_py/. By using the params.c131 file from that directory, it should work. This fact clearly needs to be documented better, and the error message when required parameters are missing in the parameter file could be a lot more descriptive.

Doh! I feel quite stupid now. Clearly I'm getting hard of thinking in
my old age.

Thanks Alex.
[/quote]

And a third, from a month earlier, concerned a bug in cadotask.py:
[quote]
(I .cc PZ, he may be interested in this)

Oh. This is a limitation severe enough that it qualifies as a bug. I generate workunit names by concatenating the project name, the name of the task (such as "sieving") and the range (such as "1000000-1010000"), where the three pieces are separated by the underscore. I need to be able to parse the workunit name again later, which means splitting it into three pieces again. To ensure that this is possible uniquely, no underscore may appear in the project name. However I realize now that this was a poor choice, as underscore is used way too commonly to be restricted like this.

I think I will change it so that the split happens only at the *last* two underscores in the workunit name; this way any underscores in the project name are ignored, and the other two parts (task name and range string) are generated by cadofactor.py itself, so it is easy to guarantee that no underscore occurs among them.
[/quote]

EdH 2018-02-01 15:23

I know this thread is old, but it is specifically titled!

I am just now taking an interest in this package. Is this a good place for questions or would it be better to work via email at the site?

If interested, some background:

I have a set of scripts that allow me to distribute sieving across several machines, and I have set up MPI in the past, but even with Gigabit switches, once I moved past Pentium 4s, the LA couldn't be distributed efficiently. Additionally, I had scripts that distributed poly searches, although their effectiveness was questionable.

I would like to try to set up CADO-NFS across several LAN connected machines, all of which currently communicate among themselves via SSH.

I see that CADO-NFS uses HTTP, so I "assume" Gigabit isn't required, and as soon as I can figure out the little pieces, I expect I can work across the LAN.

However, at the moment I have CADO-NFS running successfully (as far as factoring goes) on an i7 with 4 cores and 2 threads/core, but I have not been able to get more than 2 cores/threads involved. I am using a parameter file (a modified copy of parameters) and have set tasks.threads to 8, 4, and 2 with no success, or even any change. I've also tried -t 8 on the command line.

I have read through all the READMEs, this thread and several months of archives at the site, but the understanding part is still quite limited.

My current call, which does succeed, but with only 2 cores/threads:
[code]
./cado-nfs.py mathparameters
[/code]Via top, the machine "appears" to be running 2 cores with 2 threads per core. I see this in my terminal:
[code]
Info:root: No database exists yet
Info:Database: Opened connection to database /tmp/work//test_run5.db
Info:root: tasks.polyselect.threads = 2
Info:root: tasks.sieve.las.threads = 2
Info:root: Command line parameters: ./cado-nfs.py mathparameters
[/code]Thanks for any assistance...

VBCurtis 2018-02-01 17:32

Ed-
I should be able to help you via this thread. Are you using 2.3.0, or the current git repository? Most of my experience is with 2.3.0, though I have 3.0.0-beta via git on at least one machine also.

On my setup, the main machine ("server", and place where the postprocessing will run) is invoked with:

[code]./cado-nfs.py b.c132 --client-threads=2 --server-threads=8[/code]

The param file b.c132 includes these lines:
[code]server.whitelist = 169.254.0.0/16, 138.23.[my office IP]/16, 66.215.[my home IP]/16
server.ssl = no
server.port = [I choose the same 5-digit number for every job]
slaves.hostnames = localhost
slaves.nrclients = 4[/code]

My clients (three LAN-connected machines, my office machine, and my home machine) can all contribute to the job via this command:

[code]./cado-nfs-client.py --server=http://[servername].local:[the 5-digit number you chose above for server port][/code]

.local is for the LAN machines.

You should be able to run the client command from your i7 to add threads, if for some reason the initial job refuses to use more than 4. This is nice for varying dynamically the number of threads a CADO job uses; start the main process with 4, and run a client or two in other terminal windows to use 6 or 8 as desired.

EdH 2018-02-01 18:04

Thanks Curtis,

I actually considered PM'ing you, but chose this thread.

I am using 2.3.0. I thought I'd get that running and a little experience before heading for the latest.

I'm tied up (figuratively) for a few hours, but hopefully I can play this evening.

A few verifications/questions for now:

slaves.nrclients - should this be the maximum I'll allow or the expected number I'll run for the current instance?

client-threads - If I have 4-core, single-threaded machines as clients, would I use 2 or 4? And if 2, do I run two instances on the client?

If I do have clients running, do they have to be started and stopped for each different run of the server, or can they simply sit idle between jobs, waiting for the next server project?

Ed

VBCurtis 2018-02-01 22:17

Ed-
Slaves.nrclients is the number of child processes the CADO server runs on the machine localhost. Those aren't visible or controlled by you in any way once started, so consider it a minimum number to be running for the entire job. This setting has nothing to do with the number of client processes that may connect, on localhost or via your LAN (or, for that matter, the open internet if so whitelisted).

CADO docs indicate all processes are best run two-threaded at minimum, so a 4-core non-hyperthreaded machine (or a two-core HT i3) would run two clients of two threads each. I believe client-threads = 2 is the default, but I happen to have it in my invocation because at times I run client-threads = 3. I believe you can leave it off and everything will run two-threaded. I would run two instances of cado-nfs-client.py on each 4-core contributing machine.

Clients will idle between jobs; the default is to look for work every 10 seconds, but I'm sure that can be altered. However, when a job finishes sieving the server tells the client it is no longer needed, and the client exits out to terminal. The idle behavior is useful when you kill a server (for instance, to change parameters) and fire it back up within a few minutes; the clients will just idle and connect to the new job when it's available.

VBCurtis 2018-02-01 22:19

[QUOTE=VBCurtis;478996]You should be able to run the client command from your i7 to add threads, if for some reason the initial job refuses to use more than 4. This is nice for varying dynamically the number of threads a CADO job uses; start the main process with 4, and run a client or two in other terminal windows to use 6 or 8 as desired.[/QUOTE]

I misstated this part: when I said "start the main process with 4", I should have said "set slaves.nrclients to 2 in order to use 4 threads minimum, then fire up cado-nfs-client.py instances to add more pairs of threads to use 6 or 8 as desired."

EdH 2018-02-02 04:13

Thanks Curtis,

I seem to have the server running properly, but am still working on getting a client up and running. Actually, I just got a client online and tried running it with the client invocation you gave. It couldn't communicate using the hostname, so I tried the static IP. That time it "almost" worked, but the two machines got into an argument and the server "threw up its arms" and quit!
[code]
Error:Lattice Sieving: Program run on math24.97d94eca failed with exit code 1
Error:Lattice Sieving: Stderr output follows (stored in file /tmp/work/test_run9.upload/test_run9_sieving_3894000-3896000#2.vbo2b3pv.stderr0):
b"download/las: /usr/lib/x86_64-linux-gnu/libgomp.so.1: version `GOMP_4.0' not found (required by download/las)\ndownload/las: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by download/las)\ndownload/las: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by download/las)\ndownload/las: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.14' not found (required by download/las)\n"
Error:Lattice Sieving: Exceeded maximum number of failed workunits, maxfailed=100
[/code]The client went to idle.

I'll probably have to leave it for tonight and take a fresh look tomorrow. Have I failed to set some directories or install something? Am I missing the sievers?

Possibly of note, the client OS is Debian (on which I had to install Python3), and the server is Ubuntu. Tomorrow, if able, I will try to install the program on another Ubuntu i7 and see if it makes a difference.

On a different note:

Am I correct in thinking that polyselect is distributed as well as sieving? Is the LA distributed also? If not, how large a number should I be able to handle with 12GB of RAM? (I might be able to increase that to 16GB, if necessary.)

Ed

Dubslow 2018-02-02 04:16

[url]https://askubuntu.com/questions/421642/libc-so-6-version-glibc-2-14-not-found[/url]

How old is that Debian client?

