mersenneforum.org  

Old 2013-01-10, 00:58   #2069
Uncwilly
6809 > 6502
 
 
"""""""""""""""""""
Aug 2003
101×103 Posts

2×4,909 Posts

Quote:
Originally Posted by LaurV View Post
Time to make Uncwilly happy...
I noticed some significant effort mystically showing up on some exponents in the last day or so.
[Attached image: you_are_being_monitored_office_humor_postcard-p239589917277232493baanr_400.jpg]
Old 2013-01-10, 01:05   #2070
kracker
 
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Quote:
Originally Posted by Uncwilly View Post
I noticed some significant effort mystically showing up on some exponents in the last day or so.
Old 2013-01-10, 11:10   #2071
lycorn
 
 
"GIMFS"
Sep 2002
Oeiras, Portugal

3·491 Posts

Quote:
Originally Posted by James Heinrich View Post
GPU sieving is not enabled below 2⁶⁴. Not only that, but it uses older, less-optimized kernels that are inherently slower.
OK, that explains part of the problem (most of it, actually). Also, I was running the 32-bit version, as I was expecting the sieve to be run on the GPU. The SievePrimesMin parameter was at the default 5000.
Running the 64-bit app and setting SievePrimes to 2000 provided the same throughput as 0.19, as expected. The GHz-d/day figure was roughly half of what is obtained when testing mainstream exponents.
That said, I don't think I'll be testing small exponents anymore (at least until some new version pops up).
Old 2013-01-10, 17:27   #2072
ixfd64
Bemusing Prompter
 
 
"Danny"
Dec 2002
California

5·479 Posts

I have two more suggestions for the documentation:

1. For people who aren't familiar with console applications, it would be useful for them to know that pressing Ctrl-C terminates the program smoothly.
2. I'm surprised there is no mention of GPU to 72.
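
As an illustration of what "terminates smoothly" can mean for a console program, here is a minimal Python sketch of the general pattern: catch Ctrl-C (SIGINT), finish the unit of work in progress, save progress, then exit. This is not mfaktc's actual code (mfaktc is a CUDA program); the names `StopRequest`, `run_until_stopped`, `process` and `save_checkpoint` are hypothetical and exist only for this example.

```python
import signal


class StopRequest:
    """Remembers that Ctrl-C was pressed instead of aborting immediately."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # Signal handler: only set a flag; the main loop decides when to stop.
        self.requested = True

    def install(self):
        # Must be called from the main thread of the program.
        signal.signal(signal.SIGINT, self.handle)


def run_until_stopped(units, stopper, process, save_checkpoint):
    """Process work units in order; stop cleanly once a stop was requested."""
    finished = []
    for unit in units:
        if stopper.requested:
            break  # Ctrl-C seen: do not start another unit
        process(unit)
        finished.append(unit)
    save_checkpoint(finished)  # always persist progress before returning
    return finished
```

In a real program one would call `stopper.install()` once at startup and then run the main loop; the unit that is in progress when Ctrl-C arrives is completed and saved rather than lost.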

Last fiddled with by ixfd64 on 2013-01-10 at 17:27
Old 2013-01-10, 18:01   #2073
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by ixfd64 View Post
2. I'm surprised there is no mention of GPU to 72.
Oliver asked me to provide some language. Unfortunately, something came up that took my mind off the deliverable, so it was not ready in time. Next release.
Old 2013-01-11, 04:42   #2074
swl551
 
 
Aug 2012
New Hampshire

2³×101 Posts
0.20 unstable at GPU/OC levels that were fine with 0.19

With a GTX 570 and 0.19 I could run 4 instances on one card clocked at 1000 mV, 900 MHz core. Average combined throughput was 480 GHz-d/day. It never crashed...

0.20 has forced me to drop down to 988 mV (default) and 845 MHz core to stay reliable, reducing throughput to only 420 GHz-d/day. Confirmed on 3 different 570s in different PCs. The oddest thing is that after mfaktc crashes, the GPU core clock will NOT go over 405 MHz regardless of what I do with Afterburner. I have to reboot to allow the card to return to factory clock speed. This is a condition I have never seen before (below factory clocks).
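
For scale, the two drops quoted above can be put side by side with a little arithmetic. The Python sketch below uses only the figures from this post (480 to 420 in throughput, 900 to 845 MHz core clock); the helper name `percent_drop` is made up for the example.

```python
def percent_drop(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Figures quoted in the post above.
throughput_drop = percent_drop(480.0, 420.0)  # combined throughput
clock_drop = percent_drop(900.0, 845.0)       # core clock in MHz

print(f"throughput: -{throughput_drop:.1f}%  core clock: -{clock_drop:.1f}%")
```

The throughput loss works out to roughly twice the clock reduction in relative terms, though the two setups differ in more than clock speed (version and voltage), so the comparison is only indicative.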

I recognize all the benefits of 0.20 so this is not a 0.20 vs 0.19. The question is specifically why is 0.20 showing instability where 0.19 did not.

thanks

Scott
Old 2013-01-11, 04:48   #2075
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts

Quote:
Originally Posted by swl551 View Post
The question is specifically why is 0.20 showing instability where 0.19 did not.
Possibly simply because the GPU is under more stress now. Depending on the CPU behind those 4 instances, that might not have been enough to truly saturate the card, where now 0.20 can do that thanks to the GPU sieving. What's the Eq. GHz with 0.20 at factory clock, vs. the Eq. GHz with 0.19 at factory clock/4 instances?
Old 2013-01-11, 05:32   #2076
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

7²·197 Posts

Plus one for what Dubslow says. It is the same story as P95 SSE versus P95 AVX: the older version could stand tremendous overclocks (over 4.5 GHz on an i7-2600K), but the latest AVX versions use the CPU better and squeeze it harder, producing a lot more heat, so even with my water-cooling racks I had to reduce the clock to get stable results.

Last fiddled with by LaurV on 2013-01-11 at 05:35
Old 2013-01-11, 05:33   #2077
ixfd64
Bemusing Prompter
 
 
"Danny"
Dec 2002
California

5·479 Posts

As I mentioned earlier, mfaktc 0.20 is about three times as fast as 0.19 on my GTX 555. Jobs that previously took 100 minutes to complete now finish in just a little over half an hour. But even more surprising is that the average rate skyrocketed from 100M/s to around 933M/s.

I know the number of candidates per second doesn't matter, but the figures I'm getting are quite... shocking. Is this normal?
Old 2013-01-11, 05:38   #2078
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

9653₁₀ Posts

Quote:
Originally Posted by ixfd64 View Post
Is this normal?
Definitely yes. But take the enthusiasm with a grain of salt, see my post #2057.
Old 2013-01-11, 10:29   #2079
TheJudger
 
 
"Oliver"
Mar 2005
Germany

11·101 Posts

Hi,

Quote:
Originally Posted by ixfd64 View Post
As I mentioned earlier, mfaktc 0.20 is about three times as fast as 0.19 on my GTX 555. Jobs that previously took 100 minutes to complete now finish in just a little over half an hour. But even more surprising is that the average rate skyrocketed from 100M/s to around 933M/s.

I know the number of candidates per second doesn't matter, but the figures I'm getting are quite... shocking. Is this normal?
So you managed to edit mfaktc.ini, but did you read it?
Code:
# Keep in mind that "number of candidates (M/G)" and "rate (M/s)" are NOT
# compareable between CPU- and GPU-sieving. When sieving is done on GPU
# those number count all factor candidates prior to sieving while CPU
# sieving counts the numbers after the sieving process.
#
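
To make the point of that comment concrete, here is a rough back-of-the-envelope sketch in Python using the figures quoted above (about three times faster in run time, reported rate going from 100M/s to 933M/s). The function names and the assumed speedup of exactly 3.0 are illustrative only; nothing here comes from mfaktc itself.

```python
def reported_rate_ratio(old_rate, new_rate):
    """How much the displayed rate (M/s) increased."""
    return new_rate / old_rate


def counting_inflation(rate_ratio, runtime_speedup):
    """Part of the rate jump not explained by the real run-time speedup.

    With GPU sieving the rate counts candidates before sieving, while CPU
    sieving counts them after sieving, so this factor is a counting effect.
    """
    return rate_ratio / runtime_speedup


# Figures from the quoted post; 3.0 is the poster's "about three times".
ASSUMED_RUNTIME_SPEEDUP = 3.0
ratio = reported_rate_ratio(100.0, 933.0)
inflation = counting_inflation(ratio, ASSUMED_RUNTIME_SPEEDUP)

print(f"rate ratio: {ratio:.2f}x, of which counting effect: {inflation:.2f}x")
```

Under these assumptions, about a factor of 3 of the jump is real speedup, and the remaining factor of roughly 3.1 comes purely from counting candidates before sieving rather than after.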

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.