mersenneforum.org  


Old 2021-06-18, 15:27   #12
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3·1,933 Posts

Or as I do, get a block of ~30 manual assignments for ~18 hours of a Radeon VII GPU, check the GPU72 bounds at mersenne.ca for the highest exponent in the block, and use those bounds on all of them. PrimeNet will accept those as sufficient and not reissue the P-1 work. They're a bit of overkill but not too bad, not as much as gpuowl default. Checking the highest guards against the slight chance that the recommended or required bounds change mid-block and the higher exponents might not get sufficient bounds applied to retire the P-1 task when reported.
Occasionally (~annually?) the analysis of what are optimal bounds gets revised and so do the posted bounds there.
Quote:
Originally Posted by themoon123
I got these numbers on an ASUS TUF OC 6800 XT with factory overclocking. The exponent I'm testing is 111482183.

So ideally I would get the bounds I should use from here, right? https://www.mersenne.ca/exponent/111482183 under GPU72 bounds?
That's what I would do. It's not only the GPU's time that needs optimizing; ours does too. Something to remember is that most optima are found where the partial derivatives go through zero, and near there the slopes are small, so getting the exact optimal independent variables (B1, B2) versus a little more or less has little influence on the total computing cost we're trying to minimize.

So, looking up in gpuowl-v7.2-53's help output, that would be a 6M fft.
Run time scales as ~p^2.1, so iteration time as ~p^1.1, but it's "stairstepped", with step changes at FFT length changes.
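As a rough illustration of the scaling rule above, the sketch below extrapolates a measured run time from one exponent to another. The function name and the 20-hour reference timing are hypothetical, and the smooth power law deliberately ignores the stair-steps at FFT length changes, so the result is only an estimate.

```python
def scale_runtime(t_ref_hours, p_ref, p_new, exponent=2.1):
    """Extrapolate total run time, assuming run time ~ p**2.1.

    Assumption: a smooth power law; real timings jump at FFT length changes.
    """
    return t_ref_hours * (p_new / p_ref) ** exponent


# Hypothetical reference point: 20 hours measured at p = 100,000,000.
estimate = scale_runtime(20.0, 100_000_000, 111_482_183)
print(f"estimated run time: {estimate:.1f} hours")
```

Passing exponent=1.1 gives the matching estimate for per-iteration time.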

Which subversions of gpuowl v7.2 and v6.11 did you time on your RX6800XT? There are about 70 commits of v7.2, and several posted Windows builds in https://mersenneforum.org/showthread.php?t=25624; 380 commits of v6.11 and ~37 posted Windows builds, and the timing differences can be considerable on the same hardware and inputs.

Last fiddled with by kriesel on 2021-06-18 at 16:00
Old 2021-06-19, 10:26   #13
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3×1,933 Posts

Also, please run and submit benchmarks for your GPU as described near the top of https://www.mersenne.ca/cudalucas.php which is particularly in need of RX 6xxx benchmarks.

Last fiddled with by kriesel on 2021-06-19 at 10:27
Old 2021-06-26, 10:47   #14
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

677₈ Posts

Somewhat related, although I am using mprime, is it possible, and more to the point sensible, to do stage 2 of P-1, without doing stage 1, given someone else has done stage 1?

I have been assigned M103770473, which has been trial-factored to 2^76, then P-1 done with both B1 and B2 = 710 000, which I believe means stage 1 has been completed, but not stage 2. The entry in worktodo.txt indicates no work will be saved doing P-1 factoring, but I'm wondering if that is really true, given only the first stage has been done.

I just looked at the time to do P-1 on a similar sized exponent on my computer and see stage 1 takes about 30 minutes and stage 2 around an hour, so stage 2 is the more time-consuming stage.

Given the particular circumstances of this exponent and my computer

* Linux running mprime.
* The exponent is trial-factored to 2^76
* Stage 1 of P-1 done to 710 000 by someone else.
* Stage 2 not done.
* My machine takes about twice as long to do stage 2 as stage 1

what's the best thing to do?

I should be commencing this exponent in less than 2 hours, but I might move it further down the queue to give more people a chance to answer, as I'm aware many Americans will be asleep at this moment in time.

Last fiddled with by drkirkby on 2021-06-26 at 11:00
Old 2021-06-26, 11:38   #15
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

1BF₁₆ Posts

Okay, I found a partial answer from undoc.txt. This will skip stage 1 in mprime
Code:
Stage1GCD=0
I'm still not sure what is the most sensible course of action though. Maybe set B1=710 000, B2 around 30 million, and skip stage 1? I think this would achieve that. Please correct me if I am wrong.
Code:
PRP=AID,2,103770473,-1,710000,30000000,76,1
I have moved the exponent down the queue, and will wait for any answers before deciding what to do.

According to
https://www.mersenne.ca/prob.php?exp...00&b2=30000000
those values for B1 and B2 would have a 4.13% chance of finding a factor, and use 13.3 GHz days. Given I don't need to do P-1, I suspect the time would be about two thirds of that, or 8.9 GHz days. I don't suppose that those values would be optimal for finding a prime, but they might be reasonable.

Dave

Last fiddled with by drkirkby on 2021-06-26 at 12:19
Old 2021-06-26, 12:26   #16
axn
 
 
Jun 2003

19·271 Posts

You cannot continue the P-1 from the previous stage 1, since you don't have the result of the stage 1. And without the result of stage 1, you can't meaningfully do stage 2. So there's no "skipping" Stage 1. You have to do both from scratch.
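To make the dependency concrete, here is a minimal textbook-style sketch of P-1 on a toy Mersenne number, M29 = 233 × 1103 × 2089. It is not how prime95, mprime or gpuowl implement either stage; the function names, the base 3 and the tiny bounds (B1 = 3, B2 = 20) are assumptions chosen for illustration. The point is that stage 2 takes the stage 1 residue x as an input, so without the other user's save file there is nothing to continue from.

```python
from math import gcd


def small_primes(limit):
    """Primes <= limit by trial division (fine for toy bounds)."""
    return [q for q in range(2, limit + 1)
            if all(q % d for d in range(2, int(q ** 0.5) + 1))]


def stage1(p, b1, base=3):
    """Return (residue x, gcd(x - 1, M_p)) for x = base**E mod M_p.

    E = 2 * p * (product of the largest prime powers <= b1).
    """
    n = (1 << p) - 1
    x = pow(base, 2 * p, n)      # factors of M_p have the form 2*k*p + 1
    for q in small_primes(b1):
        qk = q
        while qk * q <= b1:      # largest power of q not exceeding b1
            qk *= q
        x = pow(x, qk, n)
    return x, gcd(x - 1, n)


def stage2(p, x, b1, b2):
    """Naive stage 2: requires the stage 1 residue x as input."""
    n = (1 << p) - 1
    acc = 1
    for q in small_primes(b2):
        if q > b1:               # one extra prime b1 < q <= b2 at a time
            acc = acc * (pow(x, q, n) - 1) % n
    return gcd(acc, n)


x, g1 = stage1(29, 3)            # toy case, M29 = 233 * 1103 * 2089
g2 = stage2(29, x, 3, 20)
print(g1, g2)
```

In this toy case stage 1 alone finds nothing (gcd 1) and stage 2 recovers the factor 1103, because 1103 - 1 = 2 × 19 × 29 needs the prime 19 that lies between B1 and B2.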

Is it worth it?

710k, 30m @ 76 bit has a probability of 4.1%

But since we already know that previous stage 1 didn't find any factor, the incremental probability is 4.1-1.5 = 2.6% (1.5% is the probability with just 710k @ 76).

Statistically, it is probably not worth it. But, OTOH, it is just 1.5 hours of your life -- if you can live with it, go for it.
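The arithmetic behind that judgment can be sketched as follows. The probabilities and the 1.5-hour cost are the figures quoted above; the simple model (a found factor saves exactly one PRP test) and the PRP durations passed in are assumptions for illustration only.

```python
p_both_stages = 0.041       # 710k, 30M @ 76 bits (from the post above)
p_stage1_only = 0.015       # 710k only @ 76 bits (already done, no factor)
p1_cost_hours = 1.5         # quoted cost of redoing both stages

# Only the part of the probability not already excluded by stage 1 counts.
incremental = p_both_stages - p_stage1_only

# P-1 pays off on average only if the PRP test it might save is longer than this.
break_even_hours = p1_cost_hours / incremental


def expected_net_saving(prp_hours):
    """Expected hours saved by running P-1 first (negative means a net loss)."""
    return incremental * prp_hours - p1_cost_hours


print(f"incremental probability: {incremental:.1%}")
print(f"break-even PRP duration: {break_even_hours:.1f} hours")
```

Under this model a PRP test has to take longer than the break-even duration before the extra P-1 run is expected to save time.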

BTW, Stage1GCD=0 will skip the gcd, but the gcd cost is just a few seconds, so you're not saving anything. And as explained above, I think you're working under the incorrect assumption that you could somehow skip stage 1; you can't, so this stage1gcd is a bit of a red herring.

BTW 2 - You don't necessarily have to use the previous B1. You could use a different value (lower or higher). So if you decide to go ahead with P-1, ask P95 to generate optimal bounds for saving 1 test and go with it.
Old 2021-06-26, 14:18   #17
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

3×149 Posts

Quote:
Originally Posted by axn
Is it worth it?

710k, 30m @ 76 bit has a probability of 4.1%

But since we already know that previous stage 1 didn't find any factor, the incremental probability is 4.1-1.5 = 2.6% (1.5% is the probability with just 710k @ 76).

Statistically, it is probably not worth it. But, OTOH, it is just 1.5 hours of your life -- if you can live with it, go for it.
Thank you axn. Reading
https://www.mersenne.org/various/math.php#p-1_factoring
it was not apparent to me that the results from stage 1 were needed for stage 2. However, given they are, and the probabilities you calculate, I will skip the P-1, and just go straight for the PRP test.
Old 2021-06-26, 15:08   #18
Uncwilly
6809 > 6502
 
 
"""""""""""""""""""
Aug 2003
101×103 Posts

17×19×31 Posts

Quote:
Originally Posted by drkirkby
I should be commencing this exponent in less than 2 hours, but I might move it further down the queue to give more people a chance to answer, as I'm aware many Americans will be asleep at this moment in time.
There are a number of highly knowledgeable forum members from: eastern Asia, "parts unknown" (but they tend to be up when it is day time in Bali), India (IIRC), various parts of mainland Europe, the UK, and even South America. Some of these are even your friendly moderators. And even a Canadian or 3. You could potentially get an answer at any hour.

Since this is a low Cat 0, we know that this was not a manual assignment. But you are deciding to mess with the defaults of how PrimeNet assigned it to you. If this was assigned to me and I decided to monkey around with the assignment, and I had the kit, I would start the PRP on 1 machine (moving it ahead of the other items in the queue) and do P-1 on a second machine with bounds set to have a higher chance of finding a factor. But generally I would just leave it alone to do its thing if it is not the last 1 or 2 before a milestone. One of my slower machines had a DC assignment go from a Cat 1 to a Cat 0. I just watched the progress via my workload page and let it do its thing. The machine is in a multiuser environment and the power goes out at the facility several times a year. That is why I watch it more closely than the average person would need to watch the average machine.
Old 2021-06-26, 16:18   #19
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

16A7₁₆ Posts

Quote:
Originally Posted by drkirkby
Thank you axn. Reading
https://www.mersenne.org/various/math.php#p-1_factoring
it was not apparent to me that the results from stage 1 were needed for stage 2.
Note how a^M is computed for stage 1 (mod M_p in our usage), and needed and used in stage 2. The unavailability to user B of user A's P-1 computation files is part of why bounds extension from previous runs is rarely done. See also https://www.mersenneforum.org/showpo...9&postcount=16
Please use the reference info (or in this case even a web search or Wikipedia!); read, learn, understand. It's a never-ending process.
Beware though of the Wikipedia suggestion to use a=2. For Mersenne numbers, that's very bad advice.
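A small sketch of why base 2 fails, using the toy case M11 = 2047 = 23 × 89. Since 2^p ≡ 1 (mod M_p), any power of 2 whose exponent is a multiple of p collapses to 1, and the gcd returns the whole number instead of a proper factor. The function name and the tiny bound (B1 = 3) are assumptions for illustration, not taken from any GIMPS program.

```python
from math import gcd


def p1_stage1_gcd(p, prime_powers, base):
    """Toy P-1 stage 1 on M_p = 2**p - 1; returns gcd(base**E - 1, M_p).

    E = 2 * p * product(prime_powers); the 2*p is included because
    factors of M_p have the form 2*k*p + 1.
    """
    n = (1 << p) - 1
    x = pow(base, 2 * p, n)
    for pk in prime_powers:
        x = pow(x, pk, n)
    return gcd(x - 1, n)


# B1 = 3 gives the prime powers 2 and 3.
with_base_3 = p1_stage1_gcd(11, [2, 3], base=3)
with_base_2 = p1_stage1_gcd(11, [2, 3], base=2)
print(with_base_3, with_base_2)
```

With base 3 the run isolates the factor 23; with base 2 the residue is 1, so the gcd is 2047 itself and no information is gained.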

There's no substantial obstacle to running P-1 on an exponent on one worker, and PRP/GEC/proof on the same exponent on another worker, even on the same system and the same prime95 or mprime instance, simultaneously. Running in parallel speeds overall completion, at the risk that a bit of computing time may be lost. (The PRP progress or completion is pointless if P-1 finds a factor.)

For kicks, I put PFactor=0,1,2,103770473,-1,76,2 on a gpuowl run on a Radeon VII GPU. It will have the default 1M, 30M bounds. Stage 1 will be done in 21 minutes (from 11:23:30 US CDT). Stage 2, if needed, I estimate at 21 minutes also.
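For readers unfamiliar with the work line above, here is a hedged sketch that splits it into fields. The layout assumed here (assignment ID, then k, b, n, c for the number k·b^n+c, then trial-factoring depth in bits, then tests saved) is my reading of the usual worktodo convention; the dictionary keys are my own labels, not names used by prime95 or gpuowl.

```python
def parse_pfactor(line):
    """Split a PFactor work line into labelled fields.

    Assumed layout: PFactor=AID,k,b,n,c,tf_bits,tests_saved
    (field labels are illustrative, not official names).
    """
    keyword, _, rest = line.strip().partition("=")
    if keyword.lower() != "pfactor":
        raise ValueError(f"not a PFactor line: {line!r}")
    aid, k, b, n, c, tf_bits, tests_saved = rest.split(",")
    return {
        "aid": aid,                      # assignment ID (0 here: none)
        "k": int(k),
        "b": int(b),
        "exponent": int(n),
        "c": int(c),
        "tf_bits": int(tf_bits),         # trial factored to 2**tf_bits
        "tests_saved": int(tests_saved), # primality tests saved by a factor
    }


work = parse_pfactor("PFactor=0,1,2,103770473,-1,76,2")
print(work["exponent"], work["tf_bits"], work["tests_saved"])
```

No bounds appear in the line, which is consistent with the program choosing its default B1 and B2.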


Please, everyone, adjust your prime95 P-1 stage 2 memory limits upward from default if practical. The default 0.3GB prevents stage 2, and stage1-only is inefficient. Using adequate bounds the first time is the most efficient.

Last fiddled with by kriesel on 2021-06-26 at 16:39
Old 2021-06-26, 19:34   #20
ewmayer
2ω=0
 
 
Sep 2002
República de California

2²×5×11×53 Posts

Quote:
Originally Posted by themoon123
Ah.. Thank you! The assignment I got has a 2 at the end, what does that mean?
It means you don't need to do a thing: nonzero means P-1 will get done automatically before PRP.
Old 2021-06-26, 20:50   #21
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

3·149 Posts

Quote:
Originally Posted by Uncwilly
There are a number of highly knowledgeable forum members from: eastern Asia, "parts unknown" (but they tend to be up when it is day time in Bali), India (IIRC), various parts of mainland Europe, the UK, and even South America. Some of these are even your friendly moderators. And even a Canadian or 3. You could potentially get an answer at any hour.
Yes, but Canada, the USA and South America are on similar(ish) timezones. On mailing lists I have used, and on a chess server where I play chess, I tend to find activity is lower when those timezones are asleep.
Quote:
Originally Posted by Uncwilly
Since this is a low Cat 0, we know that this was not a manual assignment. But, you are deciding to mess with the defaults of how PrimeNet assigned it to you. If this was assigned to me and I decided to monkey around with the assignment, and I had the kit, I would start the PRP on 1 machine (moving it ahead of the other items in the queue) and do P-1 on a second machine with bounds set to have a higher chance of finding a factor. But, generally I would just leave it alone to do its thing if it is not the last 1 or 2 before a milestone.
I've only got one decent(ish) machine. The next best is a quad-core 3.33 GHz Xeon, but the CPU is about 15 years old, so I don't think it is worth doing much with. It has a fair amount of RAM (24 GB), so I guess P-1 would have been practical, but kriesel has kindly done that.

The position in the queue only got swapped for another category 0 exponent assigned the same day.

BTW, I completed a triple check of one of the LL results you listed in another thread - M57906071. No fewer than 7 people got assigned LL tests, with 3 results, but at least two of them now agree.
Old 2021-06-26, 21:43   #22
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

3×149 Posts

Quote:
Originally Posted by kriesel
For kicks, I put PFactor=0,1,2,103770473,-1,76,2 on a gpuowl run on a Radeon VII GPU. It will have the default 1M, 30M bounds. Stage 1 will be done in 21 minutes (from 11:23:30 US CDT). Stage 2, if needed, I estimate at 21 minutes also.
Thank you - I feel a bit happier I'm not going to waste a couple of days now. A few sources

https://www.neowin.net/news/gamers-r...tting-cheaper/
https://www.fudzilla.com/news/graphi...e-winding-down

are now indicating the prices of GPUs have peaked and are falling, so I might look for a Radeon VII GPU soon. Although one site I looked at indicated the top-end cards have fallen 10-15%, whereas low/mid-end cards have fallen 50%. So maybe I will wait a bit longer. I could probably get comparable performance by fitting the most expensive CPUs my workstation will take, but that would cost one hell of a lot more than a Radeon VII or two, even at today's prices.
Quote:
Originally Posted by kriesel
Please, everyone, adjust your prime95 P-1 stage 2 memory limits upward from default if practical. The default 0.3GB prevents stage 2, and stage1-only is inefficient. Using adequate bounds the first time is the most efficient.
I guess it is a tough call, as George does not want to release software that dramatically slows people's machines. But I do wonder if the default RAM should be increased if the computer has a reasonable amount of RAM - say 8 GB or more. I could do P-1 on an Amazon instance with 1 GB RAM, so maybe the default could be raised a bit, especially if the machine has lots of RAM. Many people will no doubt leave the program running with the default settings.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.