mersenneforum.org  

Old 2016-07-30, 13:55   #2608
mattmill30
 
Aug 2015

2×23 Posts
Default

FYI, I have completed the lost TF work and the checkpoint reads:
Code:
M332347303 77 81 4620 0.21: 1808 0 5FCDA1FC
Old 2016-08-02, 03:44   #2609
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

9265₁₀ Posts
Default

OK, sorry I didn't have time to revisit this topic. In fact, I no longer considered it a priority, because I saw you had redone the work anyhow. Just to tie up the loose ends, here is the code that computes the checksum for version 0.21. It is also copied from Oliver's code, which is available on the web; I only replaced scanf/open/etc. with their "safe" versions to keep VC++ from making a big fuss about it...

Version 0.21 added the "M" in front to distinguish Mersenne checkpoints from the "W" ones written when mfaktc is used for Wagstaff numbers; hence the difference in the file. This code generates checksums as you expect (I tested it, and the output matches what you posted here). To generate checksums for Wagstaff numbers, you have to modify the define (or define WAGSTAFF).

Code:
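#include "stdafx.h"
#include <stdio.h>   /* printf, fprintf, fopen_s, sprintf_s, scanf_s */
#include <string.h>  /* strlen */
#include <conio.h>   /* _getch */
#include <tchar.h>   /* _tmain, _TCHAR */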

#define NUM_CLASSES                4620
#define MFAKTC_VERSION            "0.21"

#ifdef WAGSTAFF
    #define NAME_NUMBERS        "W"
#else /* Mersennes */
    #define NAME_NUMBERS        "M"
#endif 

unsigned int checkpoint_checksum(char *string, int chars)
/* generates a CRC-32 like checksum of the string */
{
    unsigned int chksum = 0;
    int i, j;

    for (i = 0; i<chars; i++)
    {
        for (j = 7; j >= 0; j--)
        {
            if ((chksum >> 31) == (((unsigned int)(string[i] >> j)) & 1))
            {
                chksum <<= 1;
            }
            else
            {
                chksum = (chksum << 1) ^ 0x04C11DB7;
            }
        }
    }
    return chksum;
}

// writes the checkpoint file
void checkpoint_write(unsigned int exp, int bit_min, int bit_max, int cur_class, int num_factors)
{
    FILE *f;
    char buffer[100], filename[20];
    unsigned int i;

    sprintf_s(filename, "%s%u.ckp", NAME_NUMBERS, exp);

    fopen_s(&f, filename, "w");
    if (f == NULL)
    {
        printf("WARNING, could not write checkpoint file \"%s\"\n", filename);
    }
    else
    {
        sprintf_s(buffer, "%s%u %d %d %d %s: %d %d", NAME_NUMBERS, exp, bit_min, bit_max, NUM_CLASSES, MFAKTC_VERSION, cur_class, num_factors);
        i = checkpoint_checksum(buffer, strlen(buffer));
        fprintf(f, "%s%u %d %d %d %s: %d %d %08X", NAME_NUMBERS, exp, bit_min, bit_max, NUM_CLASSES, MFAKTC_VERSION, cur_class, num_factors, i);
        fclose(f);
    }
}

//=======================================================
int _tmain(int argc, _TCHAR* argv[])
{
    unsigned int exp;
    int bmin, bmax, cls;
    char ch;

    printf("Exponent      : "); scanf_s("%u", &exp);
    printf("From bitlevel : "); scanf_s("%d", &bmin);
    printf("To bitlevel   : "); scanf_s("%d", &bmax);
    printf("Current class : "); scanf_s("%d", &cls);
    checkpoint_write(exp, bmin, bmax, cls, 0);  //assume no factors were found by former runs
    printf("\nDone. Use it at your own risk...\nPress a key to exit.");
    ch=_getch();
    return 0;
}
Code:
M332347303 77 81 4620 0.21: 1808 0 5FCDA1FC
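For anyone who wants to check an existing checkpoint rather than write a new one, here is a minimal portable sketch in plain C (no MSVC-specific calls). It reuses the bitwise routine from the code above and compares its result with the trailing eight hex digits of the line. The function name `checkpoint_line_is_valid` is hypothetical and not part of mfaktc; the layout (payload, one space, checksum) is taken from the format strings above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same bitwise CRC-32-like routine as in the post above, with a fixed-width type. */
uint32_t checkpoint_checksum(const char *s, size_t len)
{
    uint32_t chksum = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        for (int j = 7; j >= 0; j--) {
            if ((chksum >> 31) == ((uint32_t)(c >> j) & 1u))
                chksum <<= 1;
            else
                chksum = (chksum << 1) ^ 0x04C11DB7u;
        }
    }
    return chksum;
}

/* Hypothetical helper: returns 1 if the last 8 hex digits of the line equal
   the checksum of everything before the separating space, otherwise 0. */
int checkpoint_line_is_valid(const char *line)
{
    assert(line != NULL);
    size_t len = strlen(line);

    /* Ignore a trailing newline or carriage return, if any. */
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;

    /* Need at least one payload character, a space, and 8 hex digits. */
    if (len < 10 || line[len - 9] != ' ')
        return 0;

    char hex[9];
    memcpy(hex, line + len - 8, 8);
    hex[8] = '\0';

    char *end = NULL;
    unsigned long stored = strtoul(hex, &end, 16);
    if (end != hex + 8)
        return 0; /* not eight valid hex digits */

    return checkpoint_checksum(line, len - 9) == (uint32_t)stored;
}
```

Since the routine is unchanged, it should accept the line posted above if that line really came from version 0.21, and it should reject the line once any character of the payload is altered.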
Old 2016-08-13, 22:11   #2610
mattmill30
 
Aug 2015

2×23 Posts
Feature request: -tf extension, resume bit-range from a particular class

Feature request:
Expansion of -tf switch to include support for beginning from a particular class.

This feature has at least two real-world applications:
  1. Resuming from the last checked class following a checksum write error
  2. Resuming from a particular class following the successful discovery of a factor, in order to complete the bit-range

Additionally, if it is trivial to implement, I would like the ability to resume from the bit-range and class in which a known factor lies. I'm not sure how this would work alongside compound factors. An example use would be completing any remaining factorisation of an exponent such as M9100919, where no bit-ranges were included with the factor submissions.
Old 2016-08-13, 22:23   #2611
TheJudger
 
 
"Oliver"
Mar 2005
Germany

11×101 Posts
Default

Hi,

Quote:
Originally Posted by mattmill30 View Post
Resuming from a particular class following the successful discovery of a factor, in order to complete the bit-range
Short: not possible!
Long: not possible, because we don't know which application reported the factor, which settings were used, etc. Prime95 splits the search space into residue classes mod 96(?) over the factor candidates (FCs), while mfaktc can use residue classes mod 420 or 4620 over the k in FC = 2kp+1.

Oliver

Last fiddled with by TheJudger on 2016-08-13 at 22:24
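To make the "classes over k" idea concrete, here is a small sketch in C. Given an exponent p and a factor f = 2kp+1, it recovers k and reduces it mod 4620. The second function is an assumption about how classes get eliminated: it relies only on the standard facts that a factor of a Mersenne number must be ±1 mod 8 and cannot be divisible by a small prime such as 3, 5, 7 or 11. All function names are hypothetical and not taken from mfaktc, and 64-bit arithmetic is enough only for small factors (real candidates near 2^81 need wider integers).

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CLASSES 4620u /* 4 * 3 * 5 * 7 * 11; the smaller variant is 420 */

/* For a factor f = 2*k*p + 1, return k mod NUM_CLASSES,
   or -1 if f does not have that form. */
int factor_class(uint64_t p, uint64_t f)
{
    assert(p > 0);
    if (f < 3 || (f - 1) % (2 * p) != 0)
        return -1;
    uint64_t k = (f - 1) / (2 * p);
    return (int)(k % NUM_CLASSES);
}

/* Assumed filter: a class is worth testing only if its candidates are
   +-1 mod 8 and not divisible by 3, 5, 7 or 11. The smallest k of the
   class is representative, because these residues depend only on
   k mod 4, 3, 5, 7 and 11. */
int class_needs_testing(uint64_t p, unsigned cls)
{
    uint64_t f = 2 * (uint64_t)cls * p + 1;
    unsigned r8 = (unsigned)(f % 8);
    if (r8 != 1 && r8 != 7)
        return 0;
    return f % 3 != 0 && f % 5 != 0 && f % 7 != 0 && f % 11 != 0;
}

/* Count the classes that survive the filter for a given exponent. */
unsigned count_classes_to_test(uint64_t p)
{
    unsigned n = 0;
    for (unsigned c = 0; c < NUM_CLASSES; c++)
        n += (unsigned)class_needs_testing(p, c);
    return n;
}
```

As a toy example, M11 = 2047 = 23 × 89, so with p = 11 the factor 23 has k = 1 and the factor 89 has k = 4. Knowing a factor therefore tells you its class under mfaktc's scheme, but not which classes another program had already covered when it reported the factor, which is the problem described above.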
Old 2016-08-13, 22:43   #2612
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

1110010110110₂ Posts
Default

Quote:
Originally Posted by TheJudger View Post
Prime95 splits the search space in residue classes mod 96(?)
mod 120
Old 2016-08-14, 12:10   #2613
TheJudger
 
 
"Oliver"
Mar 2005
Germany

10001010111₂ Posts
Default

Thank you for the correction. It was too late at night yesterday. I know the numbers for mfaktc, and I know Prime95 uses somewhat fewer residue classes, but I had the wrong number in my mind.
Old 2016-09-09, 06:57   #2614
ji2my
 
Sep 2016

3 Posts
ERROR: cudaGetLastError() returned 8: invalid device function

Hi,

I've encountered an error. Can anyone help me solve it?

Thanks!


D:\mfaktc>mfaktc-win-64.exe
mfaktc v0.21 (64bit built)

Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled

Runtime options
SievePrimes 25000
SievePrimesAdjust 1
SievePrimesMin 5000
SievePrimesMax 100000
NumStreams 3
CPUStreams 3
GridSize 3
GPU Sieving enabled
GPUSievePrimes 82486
GPUSieveSize 64Mi bits
GPUSieveProcessSize 16Ki bits
Checkpoints enabled
CheckpointDelay 30s
WorkFileAddDelay 600s
Stages enabled
StopAfterFactor bitlevel
PrintMode full
V5UserID (none)
ComputerID (none)
AllowSleep no
TimeStampInResults no

CUDA version info
binary compiled for CUDA 6.50
CUDA runtime version 6.50
CUDA driver version 8.0

CUDA device info
name GeForce GTX 1070
compute capability 6.1
max threads per block 1024
max shared memory per MP 98304 byte
number of multiprocessors 15
clock rate (CUDA cores) 1708MHz
memory clock rate: 4004MHz
memory bus width: 256 bit

Automatic parameters
threads per grid 983040
GPUSievePrimes (adjusted) 82486
GPUsieve minimum exponent 1055144

running a simple selftest...
ERROR: cudaGetLastError() returned 8: invalid device function

D:\mfaktc>
Old 2016-09-09, 18:21   #2615
KaptainBlaZzed
 
Sep 2007

22 Posts
Default

I tried this on my 1080 and I get the error
"ERROR: cudaGetLastError() returned 8: invalid device function"

Can you upgrade the program to work with Pascal and the new CUDA architecture?
Old 2016-09-09, 19:27   #2616
airsquirrels
 
 
"David"
Jul 2015
Ohio

11·47 Posts
Default

You must compile for the specific compute capability and CUDA version of the card you are using; in this case, CUDA 8.0 RC and compute capability 6.1. Each generation of GPUs requires a separate build.

Last fiddled with by airsquirrels on 2016-09-09 at 19:27
Old 2016-09-09, 19:43   #2617
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

2×2,909 Posts
Default

Quote:
Originally Posted by airsquirrels View Post
You must compile for the specific compute capability and CUDA version of the card you are using; in this case, CUDA 8.0 RC and compute capability 6.1. Each generation of GPUs requires a separate build.
???
I have used old binaries with my 750Ti
Old 2016-09-09, 20:59   #2618
airsquirrels
 
 
"David"
Jul 2015
Ohio

11×47 Posts
Default

Quote:
Originally Posted by henryzz View Post
???
I have used old binaries with my 750Ti
Some of the older cards/CUDA versions supported multiple compute versions and architectures, but Maxwell and Pascal both seem to require specific builds.
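One possible way to reconcile "old binaries work on my 750 Ti" with "invalid device function on a 1070/1080" is NVIDIA's documented compatibility rule: device-specific code (SASS) built for compute capability X.y runs only on devices X.z with z >= y, while embedded PTX can be JIT-compiled for any device of equal or higher capability. The sketch below encodes just that rule in plain C. The `CodeImage` struct and both function names are hypothetical, and the list of targets inside any particular mfaktc binary is an assumption, since the thread does not state it.

```c
#include <assert.h>
#include <stddef.h>

/* One code image embedded in a CUDA binary (hypothetical description). */
typedef struct {
    int major;  /* compute capability, major part */
    int minor;  /* compute capability, minor part */
    int is_ptx; /* 1 = PTX (JIT-compilable), 0 = device-specific SASS */
} CodeImage;

/* Returns 1 if this image can be used on a device of the given capability. */
int image_runs_on(const CodeImage *img, int dev_major, int dev_minor)
{
    assert(img != NULL);
    if (img->is_ptx) {
        /* PTX: any device with equal or higher compute capability. */
        return img->major < dev_major ||
               (img->major == dev_major && img->minor <= dev_minor);
    }
    /* SASS: same major version, device minor not lower than the image's. */
    return img->major == dev_major && img->minor <= dev_minor;
}

/* Returns 1 if at least one embedded image is usable on the device. */
int binary_runs_on(const CodeImage *imgs, size_t n, int dev_major, int dev_minor)
{
    for (size_t i = 0; i < n; i++)
        if (image_runs_on(&imgs[i], dev_major, dev_minor))
            return 1;
    return 0;
}
```

Under this rule, a build that (hypothetically) carries only SASS for 2.0, 3.0, 3.5 and 5.0 is usable on a compute 5.0 card such as the 750 Ti but has nothing to offer a compute 6.1 Pascal card. A build that also embeds PTX, or one compiled with a toolkit that knows about 6.1, would not have that gap.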

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.