mersenneforum.org  

Old 2014-04-26, 16:27   #12
mickfrancis
 
Apr 2014
Marlow, UK

56₁₀ Posts
Default

Glad it was of some use. Thanks, by the way, Jason, for the great software and for making it PD - very generous!
Old 2014-04-28, 15:20   #13
bsquared
 
"Ben"
Feb 2007

3·1,171 Posts
Default

Agreed, a nice update, although in some brief tests I saw essentially no change in speed.

I had forgotten that this core routine always did a prefetch. Modern processors tend to do a much better job with cache management, and have much larger caches, than older processors did... so I tried simply commenting out the prefetch and saw an 8% speedup (tested on a c75, c80 and c82, on a Xeon E5-2687W).

This result will be very CPU dependent of course, but I think for most systems today the prefetch would not help much.
Old 2014-04-28, 16:08   #14
mickfrancis
 
Apr 2014
Marlow, UK

56₁₀ Posts
Default

Nice one! I get an 8% speed-up on a C80.

There is also a PREFETCH without an #ifdef MANUAL_PREFETCH in sieve_line.c. I haven't had a chance to see if this has any effect yet.
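
To compare timings with and without the hint, a prefetch can be put behind a compile-time switch instead of being commented out by hand. The following is only a minimal sketch of that idea, not msieve's actual macro or hashtable code: the PREFETCH definition, the scatter_add() function and its arguments are all assumptions made for illustration, and the GCC builtin __builtin_prefetch is used as the hint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical switch modelled on the MANUAL_PREFETCH idea: build with
   -DMANUAL_PREFETCH to turn the hint on, omit it to compile the hint away. */
#if defined(MANUAL_PREFETCH) && defined(__GNUC__)
/* second argument 1 = expect a write, third argument 0 = low temporal locality */
#define PREFETCH(addr) __builtin_prefetch((addr), 1, 0)
#else
#define PREFETCH(addr) ((void)0)
#endif

/* Toy stand-in for a bucket fill: scatter one-byte updates across a table.
   The prefetch only hints at the location of the update after next; it never
   changes the result, so both builds must produce identical tables. */
static void scatter_add(uint8_t *table, const uint32_t *offsets,
                        size_t count, uint8_t logp) {
    assert(table != NULL && offsets != NULL);
    for (size_t i = 0; i < count; i++) {
        if (i + 2 < count)
            PREFETCH(&table[offsets[i + 2]]);
        table[offsets[i]] += logp;
    }
}
```

Building the same file once with and once without -DMANUAL_PREFETCH gives the two variants to time against each other; as noted above, which one wins is likely to depend on the CPU.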
Old 2014-04-28, 16:22   #15
bsquared
 
"Ben"
Feb 2007

3·1,171 Posts
Default

Quote:
Originally Posted by mickfrancis
Nice one! I get 8% speed-up on a C80.

There is also a PREFETCH without an #ifdef MANUAL_PREFETCH in sieve_line.c. I haven't had a chance to see if this has any effect yet.

sieve_line.c is msieve's NFS line siever, so it won't have any effect on QS timings. It may help improve the line siever, but on the other hand I doubt if anyone still uses it.
Old 2014-04-28, 19:52   #16
henryzz
Just call me Henry
 
"David"
Sep 2007
Cambridge (GMT/BST)

2³·3·5·7² Posts
Default

Quote:
Originally Posted by bsquared
sieve_line.c is msieve's NFS line siever, so it won't have any effect on QS timings. It may help improve the line siever, but on the other hand I doubt if anyone still uses it.

I don't think it works currently.
Old 2014-04-28, 21:17   #17
mickfrancis
 
Apr 2014
Marlow, UK

2³·7 Posts
Default

Just got a 7% speed-up on a C100 by removing the PREFETCH, so it looks like it scales, which is great. This is on an Intel i7-3770.
Old 2014-04-30, 11:54   #18
Nooby
 
Apr 2014

3×13 Posts
Default

On Windows, stat() fails on large files with EOVERFLOW.

I see some workarounds in the current code, but mine actually work.

Code:
Index: savefile.c
===================================================================
--- savefile.c (revision 962)
+++ savefile.c (working copy)
@@ -95,8 +95,12 @@
  char *open_string;
 #ifndef NO_ZLIB
  char name_gz[256];
+#if defined(WIN32) || defined(_WIN64)
+ struct _stat64 dummy;
+#else
  struct stat dummy;
 #endif
+#endif
 
  if (flags & SAVEFILE_APPEND)
   open_string = "a";
@@ -111,8 +115,13 @@
 
 #ifndef NO_ZLIB
  sprintf(name_gz, "%s.gz", s->name);
+#if defined(WIN32) || defined(_WIN64)
+ if (_stat64(name_gz, &dummy) == 0) {
+  if (_stat64(s->name, &dummy) == 0) {
+#else
  if (stat(name_gz, &dummy) == 0) {
   if (stat(s->name, &dummy) == 0) {
+#endif
    printf("error: both '%s' and '%s' exist. "
           "Remove the wrong one and restart\n",
     s->name, name_gz);
@@ -189,8 +198,8 @@
 uint32 savefile_exists(savefile_t *s) {
  
 #if defined(WIN32) || defined(_WIN64)
- struct _stat dummy;
- return (_stat(s->name, &dummy) == 0);
+ struct _stat64 dummy;
+ return (_stat64(s->name, &dummy) == 0);
 #else
  struct stat dummy;
  return (stat(s->name, &dummy) == 0);
Code:
Index: util.c
===================================================================
--- util.c (revision 962)
+++ util.c (working copy)
@@ -477,20 +477,17 @@
 uint64 get_file_size(char *name) {
 
 #if defined(WIN32) || defined(_WIN64)
- WIN32_FILE_ATTRIBUTE_DATA tmp;
+ struct _stat64 tmp;
 
- if (GetFileAttributesEx((LPCTSTR)name, 
-   GetFileExInfoStandard, &tmp) == 0) {
+ if (_stat64(name, &tmp) != 0) {
   char name_gz[256];
   sprintf(name_gz, "%s.gz", name);
-  if (GetFileAttributesEx((LPCTSTR)name_gz,
-   GetFileExInfoStandard, &tmp) == 0)
+  if (_stat64(name_gz, &tmp) != 0)
    return 0;
-
-  return ((uint64)tmp.nFileSizeHigh << 32 | tmp.nFileSizeLow) << 1;
+  return (tmp.st_size / 11) * 20;
  }
 
- return (uint64)tmp.nFileSizeHigh << 32 | tmp.nFileSizeLow;
+ return tmp.st_size;
 
 #else
  struct stat tmp;
Old 2014-05-03, 02:30   #19
jasonp
Tribal Bullet
 
Oct 2004

3,541 Posts
Default

Thanks for the fixes. MinGW doesn't have the _stat64 function (discussion here); it would be preferable to use _stati64 instead.
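
As a rough illustration of that suggestion, a file-size helper could call _stati64 on Windows and plain stat() elsewhere. This is a sketch under stated assumptions, not the code that went into msieve: the name file_size_64() and its "return 0 on failure" convention are invented for the example, and only the WIN32 / _WIN64 test is taken from the patch above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper, not msieve's get_file_size(): returns the size of
   `name` in bytes, or 0 if the file cannot be examined. */
static uint64_t file_size_64(const char *name) {
    assert(name != NULL);
#if defined(WIN32) || defined(_WIN64)
    /* _stati64 fills in a 64-bit st_size, so large files do not overflow */
    struct _stati64 info;
    if (_stati64(name, &info) != 0)
        return 0;
#else
    /* on 32-bit POSIX systems, building with -D_FILE_OFFSET_BITS=64 is
       typically needed to get a 64-bit st_size */
    struct stat info;
    if (stat(name, &info) != 0)
        return 0;
#endif
    return (uint64_t)info.st_size;
}
```

Returning 0 for a missing file mirrors what the patched get_file_size() does when neither the plain nor the .gz file is found, but it also means an empty file and a missing file look the same to the caller.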

Last fiddled with by jasonp on 2014-05-03 at 02:48
Old 2014-05-03, 15:10   #20
debrouxl
 
Sep 2009

977 Posts
Default

Several weeks ago, I posted about a modest (~2%), but seemingly consistent optimization to filtering:
Code:
Index: gnfs/relation.c
===================================================================
--- gnfs/relation.c	(revision 961)
+++ gnfs/relation.c	(working copy)
@@ -16,7 +16,11 @@
 #include "gnfs.h"
 
 /*--------------------------------------------------------------------*/
-static uint32 divide_factor_out(mpz_t polyval, uint64 p, 
+static inline
+#ifdef __GNUC__
+__attribute__((always_inline))
+#endif
+uint32 divide_factor_out(mpz_t polyval, uint64 p, 
 				uint8 *factors, uint32 *array_size_in,
 				uint32 *num_factors, uint32 compress,
 				mpz_t tmp1, mpz_t tmp2, mpz_t tmp3) {
See http://www.mersenneforum.org/showpos...7&postcount=14 for numbers and details, and a question
Old 2014-05-03, 15:31   #21
jasonp
Tribal Bullet
 
Oct 2004

3,541 Posts
Default

Sorry Lionel, I read your original post but forgot to fold it in. Will do so soon.
Old 2014-05-15, 09:22   #22
mickfrancis
 
Apr 2014
Marlow, UK

2³·7 Posts
Default 6% improvement in SIQS time

I appreciate that people are more concerned with GNFS than SIQS, but I just thought I'd mention that I've got a speedup of around 6% (for C80, C90 and C100) by making the following change in sieve.c (conditional on __MDF__):

Code:
	/* choose the largest factor base prime that does not
	   use hashtables or reciprocals. We could make this
	   range empty, but it's better to start using hashtables
	   with primes somewhat larger than the block size.
	   This greatly reduces the amount of memory used by
	   the hashtables, and trial factoring by primes in this
	   range is extremely cheap */

	for (; i < fb_size && conf.factor_base[i].prime <
#if defined(__MDF__)
            sieve_block_size; i++) {
#else
            3 * sieve_block_size; i++) {
#endif
		/* nothing */
	}
I also tried using sieve_block_size/2, but that degrades performance. This is on an Intel i7-3770. Testing was after removing the PREFETCH in add_to_hashtable() in sieve_core.c, which may be important as that function will be more heavily loaded by this change if I'm not mistaken.

Cheers,

Mick.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.