mersenneforum.org  

Old 2010-07-13, 16:03   #320
Aillas
Oct 2002
France

2×3×23 Posts

Quote:
Originally Posted by TheJudger View Post
Hi Aillas,


no, not really.
Did you try the examples from the CUDA SDK?
Oliver

No, I just took the mfaktc sources and compiled them. Maybe Quadro cards are a bit different. I will test the next version.

I will try the SDK samples later...

Thanks.

Old 2010-07-27, 07:40   #321
TheJudger
"Oliver"
Mar 2005
Germany

11×101 Posts

Hello everybody,

find attached mfaktc 0.10.

Highlights of this version:
- two new runtime options: Stages and StopAfterFactor (see mfaktc.ini for details)
- modified the stream scheduling (suggested by Ethan). Older versions assumed that the streams are processed in the order they were issued. The new scheme improves performance a little in some cases (e.g. multiple instances of mfaktc) and narrows the gap between Windows and Linux (but doesn't solve the Windows / CUDA 3.1 / 25x.xx driver bug?!)
- the number of threads per grid is now determined at runtime, based on the number of multiprocessors of the GPU. This was necessary since Nvidia releases more and more GPUs with a non-power-of-two number of multiprocessors...
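
A minimal sketch in C of how a grid size could be derived from the multiprocessor count. This is not the code from mfaktc: the block size, the cap and the function name threads_per_grid are illustrative assumptions. In a real CUDA program the count would come from the multiProcessorCount field filled in by cudaGetDeviceProperties().

```c
#include <assert.h>

/* Hypothetical values for illustration; not copied from mfaktc's params.h. */
#define THREADS_PER_BLOCK    256u
#define THREADS_PER_GRID_MAX (1u << 20)

/* Returns the largest thread count <= THREADS_PER_GRID_MAX that is a whole
   multiple of (multiprocessors * THREADS_PER_BLOCK), so that every
   multiprocessor receives the same number of blocks. */
unsigned int threads_per_grid(unsigned int multiprocessors)
{
    unsigned int step;

    assert(multiprocessors > 0);
    step = multiprocessors * THREADS_PER_BLOCK;
    assert(step <= THREADS_PER_GRID_MAX);
    return (THREADS_PER_GRID_MAX / step) * step;
}
```

With 16 multiprocessors the full 1<<20 threads fit evenly. With a non-power-of-two count such as 14 or 15 the result is rounded down slightly, so no multiprocessor is left with a partial share of the work.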

Oliver
Attached Files
File Type: gz mfaktc-0.10.tar.gz (86.9 KB, 111 views)

Old 2010-07-27, 08:15   #322
Karl M Johnson
Mar 2010

110011011₂ Posts

Looking forward to an x64 binary for the sm_11 arch.

Last fiddled with by Karl M Johnson on 2010-07-27 at 08:18

Old 2010-07-27, 08:46   #323
ET_
Banned
"Luigi"
Aug 2002
Team Italia

61×79 Posts

Quote:
Originally Posted by TheJudger View Post
Hello everybody,

find attached mfaktc 0.10.

Highlights of this version:
- two new runtime options: Stages and StopAfterFactor (see mfaktc.ini for details)
- modified the stream scheduling (suggested by Ethan). Older versions assumed that the streams are processed in the order they were issued. The new scheme improves performance a little in some cases (e.g. multiple instances of mfaktc) and narrows the gap between Windows and Linux (but doesn't solve the Windows / CUDA 3.1 / 25x.xx driver bug?!)
- the number of threads per grid is now determined at runtime, based on the number of multiprocessors of the GPU. This was necessary since Nvidia releases more and more GPUs with a non-power-of-two number of multiprocessors...

Oliver
I assume that now I can finish my exponents after having switched to 0.10... and have a little boost?

Luigi

Old 2010-07-27, 09:25   #324
TheJudger
"Oliver"
Mar 2005
Germany

1111₁₀ Posts

Hello Luigi,

Quote:
Originally Posted by ET_ View Post
I assume that now I can finish my exponents after having switched to 0.10... and have a little boost?
mfaktc only accepts checkpoint files that were written by the same version. So if you started your test with 0.09, you have to finish it with 0.09 or restart from scratch with 0.10.
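
A rough sketch of such a version check in C. The checkpoint layout used here (version string as the first whitespace-separated field) and the names PROGRAM_VERSION and checkpoint_usable are assumptions for illustration only; the real mfaktc checkpoint format may look different.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Version of the running program (hypothetical constant name). */
#define PROGRAM_VERSION "0.10"

/* Returns 1 if the checkpoint header was written by the same program
   version, 0 otherwise. Assumed header layout: "<version> <other fields>". */
int checkpoint_usable(const char *header_line)
{
    char version[16];

    assert(header_line != NULL);
    if (sscanf(header_line, "%15s", version) != 1)
        return 0;   /* empty or unreadable header */
    return strcmp(version, PROGRAM_VERSION) == 0;
}
```

A header starting with "0.09" is rejected by a 0.10 build under this scheme, which matches the behavior described above: finish with the old version or restart from scratch.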

About the little performance boost... you're talking about your GTX 275 running Linux? I would guess that this is not a configuration where you'll see the small improvement.

Spoiler alert:
But don't be too sad, you'll see an improvement with the next version! 0.11 has a faster sieve. I've tested SievePrimes=20000, 30000 and 40000; in all cases the new sieve delivers ~25% more throughput (good for users with GTX 4xx).
On the other hand, this could be used to increase SievePrimes, which removes more candidates during sieving. On my system the sieve of 0.10 with SievePrimes=20000 is capable of generating ~89M candidates/s, and the new 0.11 reaches the same speed with SievePrimes=40000 (which removes 3-4% more candidates ==> 3-4% overall speed increase).

Now the bad news:
- 0.11 needs more testing
- you have to wait a little bit

Oliver

Old 2010-07-28, 11:30   #325
Aillas
Oct 2002
France

212₈ Posts

Hi,

the new version is working on my config:
Config:
Ubuntu 10.04
nvidia driver 256.35
CUDA 3.1
GPU: NVIDIA Quadro 140M

So for now I'm trying it on 3321931967,76,77 (I need to reserve it).

I have a question. When I run mfaktc 0.10, my computer is unusable. It seems mfaktc is using all the GPU power, so I can't use my computer (to log in or to open a window).
I also notice that it uses one of the cores at 100%. Is this normal?

Thanks

Last fiddled with by Aillas on 2010-07-28 at 11:31

Old 2010-07-28, 12:52   #326
ET_
Banned
"Luigi"
Aug 2002
Team Italia

61×79 Posts

Quote:
Originally Posted by Aillas View Post
Hi,

the new version is working on my config:
Config:
Ubuntu 10.04
nvidia driver 256.35
CUDA 3.1
GPU: NVIDIA Quadro 140M

So for now I'm trying it on 3321931967,76,77 (I need to reserve it).

I have a question. When I run mfaktc 0.10, my computer is unusable. It seems mfaktc is using all the GPU power, so I can't use my computer (to log in or to open a window).
I also notice that it uses one of the cores at 100%. Is this normal?

Thanks
Hi Aillas, I presume that you are running Ubuntu 64 bits.

I experienced the same behavior when using mfaktc with Ubuntu_64 9.10 and an nVidia GTX 275: the graphical interface is nearly unusable. It doesn't affect my life, thanks to the resume file: when I need to access my desktop, I turn mfaktc off...

As for the CPU usage, the program has two parts: one runs on the GPU, the other (the siever, IIRC) runs on the CPU and tries to keep up with the GPU by preparing presieved intervals of factor candidates.
So in short, yes, it is quite normal that one core is kept busy while it runs.
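
The two parts can be illustrated with a small single-threaded sketch in C. It is a toy model, not mfaktc's implementation: the function names, the tiny list of sieve primes and the 32-bit limit on q are assumptions made to keep it short. It relies only on the known facts that any factor q of 2^p - 1 (p an odd prime) has the form 2kp+1 and is congruent to 1 or 7 modulo 8.

```c
#include <assert.h>
#include <stdint.h>

/* "Sieve" part (the CPU side in the description above): cheaply reject
   candidates that cannot be the smallest factor of 2^p - 1. */
static int survives_sieve(uint64_t q)
{
    static const uint64_t small_primes[] = {3, 5, 7, 11, 13, 17, 19, 23};
    const unsigned int count = sizeof small_primes / sizeof small_primes[0];
    unsigned int i;

    if (q % 8 != 1 && q % 8 != 7)
        return 0;                       /* wrong residue modulo 8 */
    for (i = 0; i < count; i++) {
        if (q > small_primes[i] && q % small_primes[i] == 0)
            return 0;                   /* composite: small prime divisor */
    }
    return 1;
}

/* "Test" part (the GPU side in the description above): 2^p mod q by
   square-and-multiply. q divides 2^p - 1 exactly when the result is 1.
   Limited to q < 2^32 so the 64-bit products cannot overflow. */
static uint64_t pow2_mod(uint64_t p, uint64_t q)
{
    uint64_t result = 1, base = 2 % q;

    assert(q > 1 && q < (1ULL << 32));
    while (p > 0) {
        if (p & 1)
            result = (result * base) % q;
        base = (base * base) % q;
        p >>= 1;
    }
    return result;
}

/* Returns the smallest factor q = 2kp+1 of 2^p - 1 with 1 <= k <= k_max,
   or 0 if there is none in that range. */
uint64_t find_factor(uint64_t p, uint64_t k_max)
{
    uint64_t k;

    for (k = 1; k <= k_max; k++) {
        uint64_t q = 2 * k * p + 1;
        if (!survives_sieve(q))
            continue;                   /* removed by the sieve */
        if (pow2_mod(p, q) == 1)
            return q;                   /* q divides 2^p - 1 */
    }
    return 0;
}
```

For example, find_factor(11, 10) returns 23, since 2^11 - 1 = 2047 = 23 × 89. In the real program the sieve has to produce surviving candidates fast enough to keep the GPU fed, which is why it occupies a full core.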

Now check out your exponent here, and remember to post your results here. Good luck!

Luigi

Last fiddled with by ET_ on 2010-07-28 at 12:57

Old 2010-07-28, 12:58   #327
Aillas
Oct 2002
France

2×3×23 Posts

Quote:
Originally Posted by ET_ View Post
Hi Aillas, I presume that you are running Ubuntu 64 bits.

I experienced the same behavior when using mfaktc with Ubuntu_64 9.10 and an nVidia GTX 275: the graphical interface is nearly unusable. It doesn't affect my life, thanks to the resume file: when I need to access my desktop, I turn mfaktc off...

As for the CPU usage, the program has two parts: one runs on the GPU, the other (the siever, IIRC) runs on the CPU and tries to keep up with the GPU by preparing presieved intervals of factor candidates.
So in short, yes, it is quite normal that one core is kept busy while it runs.

Now check out your exponent here, and remember to post your results here. Good luck!

Luigi
Thanks for the information. OK, I will let it run like that.
PS: It's the 32-bit version of Ubuntu. Is that a problem?

Old 2010-07-28, 13:01   #328
TheJudger
"Oliver"
Mar 2005
Germany

11·101 Posts

Hello Aillas,

Quote:
Originally Posted by Aillas View Post
I have a question. When I run mfaktc 0.10, my computer is unusable. It seems mfaktc is using all the GPU power, so I can't use my computer (to log in or to open a window).
I also notice that it uses one of the cores at 100%. Is this normal?
100% CPU usage is normal. This is a bit ugly on "slow GPUs" because it wastes CPU cycles but on "fast GPUs" this is OK because you can't have enough CPU power for the sieve on "fast GPUs".

About the unusability... this is normal, too. You can try to
- lower THREADS_PER_GRID_MAX to e.g. 1<<16 (params.h, requires a recompile)
- run only one stream (NumStreams=1 in mfaktc.ini)

This seems to depend on the GPU, too. "Slow GPUs" need more time for a single kernel launch, and the GPU can only process one thing at a time... so there are no GUI updates while the kernel runs. Faster GPU = lower runtime per kernel => more GUI updates per second.
It seems that the GeForce 4xx series is much better than its predecessors in this situation.
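
To put rough numbers on the kernel runtime argument, here is a back-of-the-envelope sketch in C. It assumes that each thread tests one factor candidate and that the GPU sustains a fixed candidate rate; both assumptions and the function name kernel_runtime_ms are illustrative and not taken from the mfaktc source.

```c
#include <assert.h>

/* Estimated time in milliseconds that the GPU is busy with one kernel
   launch, assuming one candidate per thread and a constant throughput. */
double kernel_runtime_ms(unsigned int threads_per_grid,
                         double candidates_per_second)
{
    assert(candidates_per_second > 0.0);
    return 1000.0 * (double)threads_per_grid / candidates_per_second;
}
```

With a made-up rate of 1048576 candidates/s, a grid of 1<<20 threads keeps the GPU busy for about one second per launch, while 1<<16 threads take about 62.5 ms. The display gets a chance to update between launches, so the smaller grid should feel far more responsive, at the cost of more launch overhead.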

I know, this doesn't help you, sorry.

Oliver

Last fiddled with by TheJudger on 2010-07-28 at 13:01

Old 2010-07-28, 13:04   #329
TheJudger
"Oliver"
Mar 2005
Germany

457₁₆ Posts

Quote:
Originally Posted by Aillas View Post
PS: It's the 32-bit version of Ubuntu. Is that a problem?
It should work on 32-bit, too. I tried some 32-bit builds on my 64-bit Linux and they seem to work as expected. The sieve is ~33% slower on 32-bit, but this won't hurt you. I think you're hitting SievePrimes=100000 easily, right?

Oliver

Last fiddled with by TheJudger on 2010-07-28 at 13:04

Old 2010-07-28, 13:09   #330
Aillas
Oct 2002
France

2×3×23 Posts

This is the standard behavior, so it's OK for me. I didn't want to run the program for many days for nothing.

Now I'm curious how many days it will take to sieve 3321931967 from 76 to 77 bits on a Quadro 140M.

Maybe next time I should try exponents in the new lower range...

Thanks for your support.

Ludovic

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.