mersenneforum.org  

2012-01-13, 11:45   #133
AG5BPilot
Dec 2011
New York, U.S.A.
97₁₀ Posts

Quote:
Originally Posted by msft
Linux needs the #127 patch for the 100% CPU time issue.
Windows does not need the #127 patch.

Shoichiro,

I don't understand why your fix would work. Is the only change you made to add an extra DeviceToHost copy of maxErr?

But I did see that the init_device() routine makes four CUDA device calls, and unlike the rest of the software, it does not use cutilSafeCall() to check for errors.

In particular, I think this call may be important:

Code:
cudaSetDeviceFlags(cudaDeviceBlockingSync);
If that were to fail for any reason, CUDA would act as if you were in cudaDeviceScheduleSpin mode, and that would cause the 100% CPU utilization that you're seeing.

However, I do not understand why that call would be failing.

I would suggest adding cutilSafeCall() to the API calls inside init_device() function. Seems like a good thing to do regardless of whether this is related to the 100% CPU core problem. I'm adding that to my version of the code.
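
The suggestion above amounts to checking the status returned by every call in init_device(). A minimal sketch of that pattern in plain C follows. So that it compiles without the CUDA toolkit it uses stand-ins: `demo_status`, `demo_set_device_flags()`, `demo_check()`, `DEMO_CHECK` and `init_device_demo()` are hypothetical names, not part of CUDA or of the genefer source. In real code the wrapped call would be `cudaSetDeviceFlags(cudaDeviceBlockingSync)` and the status type would be `cudaError_t`.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for cudaError_t; hypothetical, for illustration only. */
typedef enum { DEMO_SUCCESS = 0, DEMO_ERROR = 1 } demo_status;

/* Stand-in for a device call such as cudaSetDeviceFlags().
   The argument lets the caller simulate a failing call. */
demo_status demo_set_device_flags(int simulate_failure)
{
    return simulate_failure ? DEMO_ERROR : DEMO_SUCCESS;
}

/* Report a failed call together with its source location.
   Returns 1 if the call succeeded, 0 if it failed. */
int demo_check(demo_status status, const char *expr, const char *file, int line)
{
    assert(expr != NULL && file != NULL);
    if (status == DEMO_SUCCESS)
        return 1;
    fprintf(stderr, "%s:%d: %s failed with status %d\n",
            file, line, expr, (int)status);
    return 0;
}

/* Wrap a call so that its text and location are captured automatically. */
#define DEMO_CHECK(call) demo_check((call), #call, __FILE__, __LINE__)

/* Sketch of an init routine that stops at the first failed call.
   Returns 0 on success, -1 on failure. */
int init_device_demo(int simulate_failure)
{
    if (!DEMO_CHECK(demo_set_device_flags(simulate_failure)))
        return -1;
    return 0;
}
```

Calling `init_device_demo(1)` prints the failing expression with its file and line and returns -1, so a silently ignored failure in the init path would at least become visible.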

Mike

2012-01-13, 12:35   #134
msft
Jul 2009
Tokyo
262₁₆ Posts

Hi, AG5BPilot

Quote:
Originally Posted by AG5BPilot
I don't understand why your fix would work. Is the only change you made to add an extra DeviceToHost copy of maxErr?

Yes.
I think this code will trigger it.

Quote:
Originally Posted by AG5BPilot
But I did see that the init_device() routine makes four CUDA device calls, and unlike the rest of the software, it does not use cutilSafeCall() to check for errors.
In particular, I think this call may be important:
Code:
cudaSetDeviceFlags(cudaDeviceBlockingSync);
If that were to fail for any reason, CUDA would act as if you were in cudaDeviceScheduleSpin mode, and that would cause the 100% CPU utilization that you're seeing.
However, I do not understand why that call would be failing.

I think that if that call were failing, the "extra DeviceToHost copy" should not work either.

Quote:
Originally Posted by AG5BPilot
I would suggest adding cutilSafeCall() to the API calls inside init_device() function. Seems like a good thing to do regardless of whether this is related to the 100% CPU core problem. I'm adding that to my version of the code.

The results did not change.

Anyway, it is a strange phenomenon.

2012-01-24, 11:28   #135
msft
Jul 2009
Tokyo
262₁₆ Posts

Hi,
Versions 1.06, 1.061, and 1.062 have a fatal error on 32-bit OSes.
I introduced a bug in the GMP code.
Do not use them on a 32-bit OS.
Thank you,

2012-01-25, 11:28   #136
rroonnaalldd
Dec 2011
2·7 Posts

Quote:
Originally Posted by msft
Hi,
Versions 1.06, 1.061, and 1.062 have a fatal error on 32-bit OSes.
I introduced a bug in the GMP code.
Do not use them on a 32-bit OS.
Thank you,

Which bug?
I am getting no errors here or in llrCUDA. My build platform is Lubuntu 11.10 with kernel 3.0.0-15.26.

2012-01-25, 12:38   #137
msft
Jul 2009
Tokyo
2·5·61 Posts

Hi,
llrCUDA was OK.
Code:
                m_b = b;
                mpz_init_set_ui(m_Na,m_b);
                /* square once per halving of m; gives b^m when m is a power of two */
                for(j = m; j != 1; j/=2)
                        mpz_mul(m_Na,m_Na,m_Na);
                /* the buffer is sized from GMP's limb count */
                m_Na_size = mpz_size(m_Na);
                m_Na_size_byte = m_Na_size*sizeof(long int);
                m_a = (long int*) malloc(m_Na_size*sizeof(long int));
                /* export the whole value as a single word of m_Na_size_byte bytes */
                mpz_export(m_a,NULL,0,m_Na_size_byte,0,0,m_Na);
Sometimes m_a[m_Na_size-1]==0 on linux32.
Code:
                if(m_a_32[m_a_32_len-1]== 0) m_a_32_len--;
This code is not enough.
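
The post does not say why the single `if` is insufficient. One possible reading, and it is only an assumption, is that more than one high-order word can be zero after the export. Under that assumption a loop would trim all of them. `trim_high_zero_words` is a hypothetical helper written for this sketch, not taken from the genefer source:

```c
#include <assert.h>
#include <stddef.h>

/* Drop zero words from the most significant end of a little-endian
   word array. Always keeps at least one word, so zero stays length 1. */
size_t trim_high_zero_words(const unsigned int *words, size_t len)
{
    assert(words != NULL && len > 0);
    while (len > 1 && words[len - 1] == 0)
        len--;
    return len;
}
```

This only repairs the length after the fact; it does not address a buffer that was sized in the wrong units to begin with.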

2012-01-26, 03:44   #138
msft
Jul 2009
Tokyo
610₁₀ Posts

Hi,
Here is a mini test program.
Code:
#include <stdio.h>
#include <gmp.h>
int main()
{
        mpz_t m;
        int i,j;
        unsigned long long e_m[10];
        mpz_init_set_ui(m,2);
        for(j = 0; j != 8; j++)
        {
                mpz_mul(m,m,m);
                for(i=0;i<10;i++)e_m[i]=0;
                mpz_export(e_m,NULL,0,80,0,0,m);
                printf(" mpz_size(m)=%d ",(int) mpz_size(m));
                for(i=9;i>=0;i--) printf(" %llx",e_m[i]);
                printf("\n");
        }
        mpz_clear(m);
}
linux32:
Code:
$ ./a.out
 mpz_size(m)=1  0 0 0 0 0 0 0 0 0 4
 mpz_size(m)=1  0 0 0 0 0 0 0 0 0 10
 mpz_size(m)=1  0 0 0 0 0 0 0 0 0 100
 mpz_size(m)=1  0 0 0 0 0 0 0 0 0 10000
 mpz_size(m)=2  0 0 0 0 0 0 0 0 0 100000000
 mpz_size(m)=3  0 0 0 0 0 0 0 0 1 0
 mpz_size(m)=5  0 0 0 0 0 0 0 1 0 0
 mpz_size(m)=9  0 0 0 0 0 1 0 0 0 0
linux64:
Code:
$ ./a.out
 mpz_size(m)=1  0 0 0 0 0 0 0 0 0 4
 mpz_size(m)=1  0 0 0 0 0 0 0 0 0 10
 mpz_size(m)=1  0 0 0 0 0 0 0 0 0 100
 mpz_size(m)=1  0 0 0 0 0 0 0 0 0 10000
 mpz_size(m)=1  0 0 0 0 0 0 0 0 0 100000000
 mpz_size(m)=2  0 0 0 0 0 0 0 0 1 0
 mpz_size(m)=3  0 0 0 0 0 0 0 1 0 0
 mpz_size(m)=5  0 0 0 0 0 1 0 0 0 0
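
The differing mpz_size(m) columns in the two listings follow from the limb width: mpz_size() counts limbs, and the numbers above are consistent with 32-bit limbs on linux32 and 64-bit limbs on linux64. A small sketch in plain C (no GMP needed) reproduces the counts for the powers of two printed by the test program. `limbs_for_pow2` is a hypothetical helper, and the two limb widths are assumptions about the two builds:

```c
#include <assert.h>
#include <stddef.h>

/* Number of limbs needed to hold 2^k, which has k + 1 significant bits. */
size_t limbs_for_pow2(size_t k, size_t limb_bits)
{
    size_t bits = k + 1;
    assert(limb_bits > 0);
    return (bits + limb_bits - 1) / limb_bits;   /* ceiling division */
}
```

For k = 2, 4, 8, 16, 32, 64, 128, 256 this gives 1, 1, 1, 1, 2, 3, 5, 9 with 32-bit limbs and 1, 1, 1, 1, 1, 2, 3, 5 with 64-bit limbs, matching the linux32 and linux64 output.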

2012-01-28, 02:12   #139
rroonnaalldd
Dec 2011
2·7 Posts

Hmm, I could not get it to compile:
Quote:
boinc@Lubuntu32:~/Cuda$ gcc a
a: In function `main':
a.c:(.text.startup+0x25): undefined reference to `__gmpz_init_set_ui'
a.c:(.text.startup+0x3c): undefined reference to `__gmpz_mul'
a.c:(.text.startup+0x110): undefined reference to `__gmpz_export'
a.c:(.text.startup+0x2ba): undefined reference to `__gmpz_clear'
collect2: ld returned 1 exit status

2012-01-28, 02:23   #140
rogue
"Mark"
Apr 2003
Between here and the
14324₈ Posts

Quote:
Originally Posted by rroonnaalldd
Hmm, I could not get it to compile:

Add -lgmp

2012-01-28, 03:18   #141
msft
Jul 2009
Tokyo
1142₈ Posts

Code:
mpz_mul(m_Na,m_Na,m_Na);
m_Na_size = (mpz_sizeinbase(m_Na,2)+sizeof(long long int)*8-1)/(sizeof(long long int)*8);
m_Na_size_byte = m_Na_size*sizeof(long long int);
This is the only answer.
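
The sizing formula above rounds the bit length up to whole `long long` words, so the result no longer depends on GMP's limb width. A plain-C sketch of the same arithmetic follows; the helper names `words64_for_bits` and `bytes_for_limbs` are hypothetical, and 4-byte limbs are assumed for the 32-bit case:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Words of type unsigned long long needed for a value with `bits`
   significant bits, mirroring
   (mpz_sizeinbase(n, 2) + word_bits - 1) / word_bits. */
size_t words64_for_bits(size_t bits)
{
    const size_t word_bits = sizeof(unsigned long long) * CHAR_BIT;
    assert(bits > 0);
    return (bits + word_bits - 1) / word_bits;   /* ceiling division */
}

/* Buffer size in bytes when sized from the limb count instead,
   as in the earlier code. */
size_t bytes_for_limbs(size_t limbs, size_t limb_bytes)
{
    return limbs * limb_bytes;
}
```

For 2^64 (65 bits), `words64_for_bits(65)` is 2, i.e. 16 bytes with 8-byte words, whereas three 4-byte limbs give only 12 bytes, which is not a whole number of 64-bit words. That may be the kind of mismatch behind the 32-bit failure, though the thread does not spell out the exact mechanism.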

2012-01-28, 10:04   #142
rroonnaalldd
Dec 2011
2×7 Posts

Quote:
Originally Posted by rogue
Add -lgmp

Lubuntu11.10-32
Quote:
boinc@Lubuntu32:~/Cuda$ ./a32.out
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 4
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 10
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 100
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 10000
mpz_size(m)=2 0 0 0 0 0 0 0 0 0 100000000
mpz_size(m)=3 0 0 0 0 0 0 0 0 1 0
mpz_size(m)=5 0 0 0 0 0 0 0 1 0 0
mpz_size(m)=9 0 0 0 0 0 1 0 0 0 0
DotschUX-64 based on Ubuntu8.10
Quote:
boinc@vmware2k-3:~/Cuda$ ./a32.out
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 4
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 10
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 100
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 10000
mpz_size(m)=2 0 0 0 0 0 0 0 0 0 100000000
mpz_size(m)=3 0 0 0 0 0 0 0 0 1 0
mpz_size(m)=5 0 0 0 0 0 0 0 1 0 0
mpz_size(m)=9 0 0 0 0 0 1 0 0 0 0
Quote:
boinc@vmware2k-3:~/Cuda$ ./a64.out
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 4
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 10
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 100
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 10000
mpz_size(m)=1 0 0 0 0 0 0 0 0 0 100000000
mpz_size(m)=2 0 0 0 0 0 0 0 0 1 0
mpz_size(m)=3 0 0 0 0 0 0 0 1 0 0
mpz_size(m)=5 0 0 0 0 0 1 0 0 0 0

Last fiddled with by rroonnaalldd on 2012-01-28 at 10:17 Reason: a64-output added

2012-07-23, 08:21   #143
axn
Jun 2003
5085₁₀ Posts

Does anyone have any idea of the relative time spent in the "fft square" routine vs. the "next step" kernels in genefer?
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.