mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU to 72 (https://www.mersenneforum.org/forumdisplay.php?f=95)
-   -   GPU to 72 status... (https://www.mersenneforum.org/showthread.php?t=16263)

kracker 2014-05-19 16:19

[QUOTE=LaurV;373807]In an "order by fire power", I see a "[URL="http://www.gpu72.com/reports/worker/c87ea45fa5e7920bc6d4a431af1c3698/"]NickOfTime[/URL]" coming very strong from the back. With about 2THzD/D, that sounds like 4 Titans or a lot of AMD fire power. With ~29G/factor, it means he is doing the last bit, for which the 39 factors from 46 expected sounds statistically good. So, he is not cheating, he [U]has[/U] a lot of fire power.

So.. Who is he? Same "Nick" from the forum, or new meat? :razz:
Congrats anyhow, and kotgw![/QUOTE]

I noticed yesterday... Hmm.
[url]http://www.gpu72.com/reports/workers/week/[/url]

EDIT: Titan's do 500 GHz?

LaurV 2014-05-19 18:02

[QUOTE=kracker;373808]EDIT: Titan's do 500 GHz?[/QUOTE]
The unlocked Titan (the one which you can enable/disable the DP from the Nvidia Control Panel) can do ~520GD/D with DP disabled and a bit of OC (only a bit, didn't try more, I bought the one from Xyzzy, which is still air cooled, I am planning to buy the water block, but no budget/time yet).

chalsall 2014-05-19 18:08

[QUOTE=LaurV;373807]So.. Who is he? Same "Nick" from the forum, or new meat? :razz:
Congrats anyhow, and kotgw![/QUOTE]

I can't speak to that. Well, actually I can, but I won't.

I thank him for his contributions. Seems very honest.

kracker 2014-05-19 18:55

Well... statistically we seem to be [URL="https://www.gpu72.com/graphs/ghzdays/halfyear/"]doing well[/URL] :smile:

manfred4 2014-05-19 19:32

[QUOTE=chalsall;373774]Good catch! Thanks.

I've commented out the "... and Exponent<70000000" constraint in the SQL query. (GPU72 has been running for a lot longer than I ever imagined -- it was originally envisioned to be a short-term sub-project.)[/QUOTE]

Did you implement that one yet? If you did, it seems to be not working as intended now, the 70M+ assignments are not showing up yet (on [URL="http://en.gpu72.com/reports/factoring_cost/"]this[/URL] one either).

chalsall 2014-05-19 19:51

Amazon EC2...
 
You are now connected to Mark from Amazon.com
Me:I would like to "chat" with a human...
Mark:Hi. Thanks for contacting Amazon
My name is Mark.. and I am a human!
Tell me, how can I help you today?
Me:Thanks Mark. OK, I've received several emails saying that my AWS service is about to be suspended because of an unpaid bill of $1.28.
Mark:A member of our AWS team will need to help you with this; however, they don't have chat support. Would you like me to send them an e-mail so they can get back to you within 1-2 business days?
Or you can call us by phone and request to be transferred to AWS support
Me:What, exactly, is the phone number I should call? You guys appear to make sure no customer has the phone number.
Mark:+1 866 216 1072
and tell the agent that you want to be transferred to Amazon Web Service
Me:Thanks Mark. However, dialing that number from Barbados results in "That number cannot be connected to".
Do you have a non-toll-free number I can use?
Mark:Oh, ok.. let me get a different one
+1 206 266 2992
Did that one work?
Me:OK, thanks Mark. I'm currently interacting with a human who can barely speak English... I've asked to speak to her supervisor, and am now on hold (and listening to classical music...).
Mark:Ok
Is there anything else I can help you with?
Thanks for contacting Amazon. Have a good day!
Me:Actually, there is. Why can I not access the AWS people who sent me the "scary" emails?
Mark:Humm I am not sure.
They are a completely different department
Me:OK. I appreciate this.
Mark from Amazon.com has left the conversation.

chalsall 2014-05-19 20:32

[QUOTE=manfred4;373820]Did you implement that one yet? If you did, it seems to be not working as intended now, the 70M+ assignments are not showing up yet (on [URL="http://en.gpu72.com/reports/factoring_cost/"]this[/URL] one either).[/QUOTE]

Unlike some, I very much appreciate being challenged.

Please let me know if what you expect to see is not what you see.

NickOfTime 2014-05-20 16:51

[QUOTE=LaurV;373807]In an "order by fire power", I see a "[URL="http://www.gpu72.com/reports/worker/c87ea45fa5e7920bc6d4a431af1c3698/"]NickOfTime[/URL]" coming very strong from the back. With about 2THzD/D, that sounds like 4 Titans or a lot of AMD fire power. With ~29G/factor, it means he is doing the last bit, for which the 39 factors from 46 expected sounds statistically good. So, he is not cheating, he [U]has[/U] a lot of fire power.

So.. Who is he? Same "Nick" from the forum, or new meat? :razz:
Congrats anyhow, and kotgw![/QUOTE]

Just a few gpu's :-) 2 AMD 290x, amd 260x, gtx 690, gtx 660ti, gtx 560ti

kracker 2014-05-20 17:34

[QUOTE=NickOfTime;373874]Just a few gpu's :-) 2 AMD 290x, amd 260x, gtx 690, gtx 660ti, gtx 560ti[/QUOTE]

Wow, very nice. Mind telling me how much the 260X cranks out? :smile:

NickOfTime 2014-05-20 17:59

[QUOTE=kracker;373879]Wow, very nice. Mind telling me how much the 260X cranks out? :smile:[/QUOTE]

160ghz/d at bit level 73-74 about 200ghz/d below that..

James Heinrich 2014-05-20 18:19

[QUOTE=NickOfTime;373881]160ghz/d at bit level 73-74[/QUOTE]That seems unexpectedly lower than the 212 GHz-d/d [url=http://www.mersenne.ca/mfaktc.php]my chart[/url] predicts. Would you mind filling out the [url=http://www.mersenne.ca/mfaktc.php#benchmark]mfakto benchmark form[/url] at the top of the page with data from your two AMD GPUs? I have pretty good data for NVIDIA but not so much from AMD, especially the R[i]x[/i] series.

edit: since you edited your post to clarify, would you mind sending me benchmarks in both ranges?

kracker 2014-05-20 18:25

[QUOTE=NickOfTime;373881]160ghz/d at bit level 73-74 about 200ghz/d below that..[/QUOTE]

Hmm, my 7770 at 73-74 does 130 GHz/d. 7770 has 640 SP, 260X has 896. Also the 260X has a core clock of 1100 MHz, while 7770 has 1000.
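
The comparison above amounts to a naive scaling estimate: assume throughput grows linearly with shader count and core clock. A rough sketch of that arithmetic, using only the figures quoted in this post (the linear model itself is an assumption, not something mfakto guarantees):

```python
def scale_throughput(base_rate, base_shaders, base_mhz, shaders, mhz):
    """Naive linear estimate: throughput ~ shader count x core clock."""
    return base_rate * (shaders / base_shaders) * (mhz / base_mhz)

# HD 7770: 640 SP at 1000 MHz, measured 130 GHz-d/day at 73-74.
# 260X: 896 SP at 1100 MHz.
estimate_260x = scale_throughput(130, 640, 1000, 896, 1100)
print(f"naive 260X estimate at 73-74: {estimate_260x:.1f} GHz-d/day")
```

On these assumptions the 260X would land at roughly 200 GHz-d/day at 73-74, so the reported 160 looks low by this yardstick.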

kracker 2014-05-20 18:29

@James Heinrich: I think we should have a standard bitrange-exponent for testing, as speeds vary a lot, e.g. the DC range is "faster" than the LL range.

chalsall 2014-05-20 18:41

[QUOTE=kracker;373885]@James Heinrich: I think we should have a standard bitrange-exponent for testing, as speeds vary a lot, e.g. the DC range is "faster" than the LL range.[/QUOTE]

Further to this... I understand that AMD (and possibly nVidia) cards have "sweet spots". As in, for example, going to 73 might be more efficient for certain cards than going to 74.

If this can be determined and defined, then perhaps GPU72 should expand "What makes sense" to also include "What makes sense for ${Class} card" options.

Thoughts?

James Heinrich 2014-05-20 18:59

[QUOTE=kracker;373885]@James Heinrich: I think we should have a standard bitrange-exponent for testing, as speeds vary a lot, e.g. the DC range is "faster" than the LL range.[/QUOTE]That would give you good data on that particular exponent, but randomized input actually gives a better overall picture of these speed variations. What would be brilliant would be if Oliver/Bertram could include a broad benchmark that runs a few classes for a range of exponents (every 1M, 5M, etc across the range specified [e.g. 30M-80M]) and for each test at various bit ranges and give throughput performance at that exponent+bitlevel. That would provide consistent data to map the 3D performance variance for the various GPUs. More data than I currently want to analyze, but could be interesting.
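
A grid like the one described could be prepared outside the client by generating worktodo entries. A minimal sketch, assuming the `Factor=exponent,bit_min,bit_max` line format that appears later in this thread; the function names, the 5M step and the 68-74 bit range are illustrative choices, not anything mfaktc or mfakto provides:

```python
def is_prime(n):
    """Trial division; adequate for exponents in the PrimeNet range."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def next_prime(n):
    """Smallest prime >= n."""
    while not is_prime(n):
        n += 1
    return n

def benchmark_grid(exp_lo, exp_hi, exp_step, bit_lo, bit_hi):
    """One Factor= line per (prime exponent, single bit level) pair."""
    lines = []
    for start in range(exp_lo, exp_hi + 1, exp_step):
        p = next_prime(start)
        for bit_min in range(bit_lo, bit_hi):
            lines.append(f"Factor={p},{bit_min},{bit_min + 1}")
    return lines

grid = benchmark_grid(30_000_000, 80_000_000, 5_000_000, 68, 74)
print(len(grid), "entries, first:", grid[0])
```

Each entry would still describe a full bit level, so a real benchmark mode would need to stop after a few classes.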

kracker 2014-05-20 19:05

60M on a HD 7770 (GHz-d/day):

70-71: 153
71-72: 154
72-73: 154
73-74: 132

35M on same (GHz-d/day):

68-69: 188
69-70: 178
70-71: 160
71-72: 160

I'm curious if mfaktc is more "smooth".

chalsall 2014-05-20 19:25

[QUOTE=James Heinrich;373890]That would provide consistent data to map the 3D performance variance for the various GPUs. More data than I currently want to analyze, but could be interesting.[/QUOTE]

I would be very interested in having access to that kind of data -- to analyze in a 3 or 4 dimensional space.

As you know, I don't have privileged access to Primenet. But I understand that Primenet records (or, at least, is told) what client did what work.

If this knowledge could be exposed to those interested, it could be quite valuable.

NickOfTime 2014-05-20 19:28

[QUOTE=kracker;373891]60M on a HD 7770:

70-71: 153 GHz
71-72: 154 GHz
72-73: 154 GHz
73-74: 132 GHz

35M on same:

68-69: 188
69-70: 178
70-71: 160
71-72: 160

I'm curious if mfaktc is more "smooth".[/QUOTE]

Well, with mfakto, it switches from barrett15_73_gs to barrett15_82_gs where mfaktc is using barrett76_mul32

kracker 2014-05-20 19:40

[QUOTE=NickOfTime;373893]Well, with mfakto, it switches from barrett15_73_gs to barrett15_82_gs where mfaktc is using barrett76_mul32[/QUOTE]

[URL="https://github.com/Bdot42/mfakto/blob/master/src/mfakto.cpp"]If interested...[/URL]

NickOfTime 2014-05-20 20:03

[QUOTE=kracker;373894][URL="https://github.com/Bdot42/mfakto/blob/master/src/mfakto.cpp"]If interested...[/URL][/QUOTE]

Hmm, there is a BARRETT76_MUL32_GS. The only obvious difference is that it has stages 1 flag. Checked my ini and stages=1, maybe something about GCN is disabling it or some other bug....

Nope, 76 is MUL32 whereas 82 is MUL15 in find_fastest_kernel:
[CODE]/* GPU_GCN (7850@1050MHz, v=2) / (7770@1100MHz)*/
BARRETT69_MUL15, // "cl_barrett15_69" (393.88 M/s) / (259.96 M/s)
BARRETT70_MUL15, // "cl_barrett15_70" (393.47 M/s) / (259.69 M/s)
BARRETT71_MUL15, // "cl_barrett15_71" (365.89 M/s) / (241.50 M/s)
BARRETT73_MUL15, // "cl_barrett15_73" (322.45 M/s) / (212.96 M/s)
BARRETT82_MUL15, // "cl_barrett15_82" (285.47 M/s) / (188.74 M/s)
BARRETT76_MUL32, // "cl_barrett32_76" (282.95 M/s) / (186.72 M/s)
BARRETT77_MUL32, // "cl_barrett32_77" (274.09 M/s) / (180.93 M/s)[/CODE]

VictordeHolland 2014-05-20 22:47

My HD7950 @900MHz is also more 'efficient' in the DC TF range.

mfakto v0.14

35M
69-70 [cl_barrett15_71_gs_2] [B]420GHz-d[/B]
70-71 [cl_barrett15_73_gs_2] 380GHz-d

69M
71-72 [cl_barrett15_73_gs_2] 366GHz-d
72-73 [cl_barrett15_73_gs_2] 366GHz-d
73-74 [cl_barrett15_82_gs_2] 327GHz-d

chalsall 2014-05-21 00:29

[QUOTE=VictordeHolland;373902]My HD7950 @900MHz is also more 'efficient' in the DC TF range.[/QUOTE]

Then do everything you can in the DC range to 70.

Others will finish the exponents and release them for DCing.

LaurV 2014-05-21 03:30

[QUOTE=James Heinrich;373882]That seems unexpectedly lower than the 212 GHz-d/d [URL="http://www.mersenne.ca/mfaktc.php"]my chart[/URL] predicts. [/QUOTE]
Not really, as we commented/discussed before, mfakto (AMD/OpenCL (?!?)) is known for getting lazy at higher bit levels. See my former posts about the subject. Now I can prove that it comes from the (barrett? monty?) kernels, which take better advantage of the architecture at lower bit levels.

For example my 7970 crunches 630G at ~40M to 69, but it gets as low as 400G at ~65M to 74. The best use (optimum point) for this card is either TF to ~70/71 bits, or DC of a ~37M exponent (where a power of 2 FFT is used optimally).

Bdot 2014-05-21 09:30

There are 3 factors that influence mfakto (and mfaktc) performance:
[LIST]
[*]most important: the kernel being used (selected only by target bitlevel)
Different algorithms / data chunk sizes have different effects ... For mfaktc you can see the effect when going beyond 76 bits, then it will also switch kernels.
[*]measurable: size of the exponent in bits
For each bit, the exponentiation/modulo loop needs to be run once. The first ~6 bits are for free, and there is some one-time overhead, so the effect is not proportional, but that is why the same bit-level in the DC-range is faster than in the LL-range.
[*]negligible: the number of '1's vs. '0's in the exponent (in binary)
For every '1' a small step needs to be done in addition. I think this is only measurable if you have no other 'noise' impacting the speed.
[/LIST]
On AMD H/W, the 32-bit kernels carry quite a penalty because 32-bit muls are executed by the DP unit, so they have the same SP/DP performance ratio (1:16 on low and mid-level H/W, 1:4 on high end). In addition, the carry flag is not usable in OpenCL and has to be emulated at extra cost. Therefore, 15-bit kernels were my fastest implementation, utilizing fast 24-bit multiplications and having room for the carry flag.
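
The second and third factors can be seen in a plain square-and-multiply loop. The sketch below is ordinary Python, not the GPU kernel: it ignores the free leading bits and one-time overhead mentioned above and uses `%` for the reduction. `pow2_mod` is a hypothetical helper name, and the test factor is the one reported for M10515217 in the client log later in this thread:

```python
def pow2_mod(p, q):
    """Left-to-right binary exponentiation of 2^p mod q.

    Returns (result, squarings, doublings): one squaring per exponent bit
    after the leading one, plus one doubling for every further '1' bit.
    """
    result, squarings, doublings = 2, 0, 0   # leading bit of p is always 1
    for bit in bin(p)[3:]:                   # remaining bits, most significant first
        result = result * result % q
        squarings += 1
        if bit == "1":
            result = result * 2 % q          # the small extra step for a '1'
            doublings += 1
    return result, squarings, doublings

p, q = 10515217, 30604828569994966241
residue, squarings, doublings = pow2_mod(p, q)
# q divides 2^p - 1 exactly when 2^p mod q == 1
print(f"M{p}: residue {residue}, {squarings} squarings, {doublings} doublings")
```

A longer exponent adds squarings, while the count of '1' bits only adds the cheap doubling step, which fits the "measurable" versus "negligible" ranking above.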

kracker 2014-05-21 17:06

[QUOTE=LaurV;373916]Not really, as we commented/discussed before, mfakto (AMD/OpenCL (?!?)) is known for getting lazy at higher bit levels. See my former posts about the subject. Now I can prove that it comes from the (barrett? monty?) kernels, which take better advantage of the architecture at lower bit levels.

For example my 7970 crunches 630G at ~40M to 69, but it gets as low as 400G at ~65M to 74. The best use (optimum point) for this card is either TF to ~70/71 bits, or DC of a ~37M exponent (where a power of 2 FFT is used optimally).[/QUOTE]

I think anything below 73 bits is fine.

chalsall 2014-05-21 17:17

[QUOTE=kracker;373938]I think anything below 73 bits is fine.[/QUOTE]

OK. Can we think about and discuss this?

The whole point of GPU72 is to optimize the available GPU firepower.

I have been using James' analysis as to where the cross-over points should be (read: where TF'ing Makes More Sense than LL'ing or DC'ing).

I'm more than happy to add additional "WMS" options for different card types.

manfred4 2014-05-21 21:38

If you are collecting these tests now, I can participate: Just checked the stats for my cards on mfaktc 0.20:

GTX670@1176MHz:
[CODE]Exp toBit Ghzd/d
66M 74 275.8
66M 73 275.9
66M 72 276.2
66M 71 276.2

35M 71 284.8
35M 70 284.7
35M 69 284.6[/CODE]

GTX460M@675MHz

[CODE]Exp toBit Ghzd/d
66M 74 96.5
66M 73 96.4
66M 72 96.4
66M 71 96.4

35M 71 100.2
35M 70 100.2
35M 69 100.1[/CODE]

seems to be a lot smoother between the exponents and bitlevels.

James Heinrich 2014-05-21 21:54

[QUOTE=Bdot;373923]There are 3 factors that influence mfakto (and mfaktc) performance[/QUOTE][QUOTE=James Heinrich;373890]What would be brilliant would be if Oliver/Bertram could include a broad benchmark that runs a few classes for a range of exponents (every 1M, 5M, etc across the range specified [e.g. 30M-80M]) and for each test at various bit ranges and give throughput performance at that exponent+bitlevel. That would provide consistent data to map the 3D performance variance for the various GPUs.[/QUOTE]@Bdot: how hard would it be to implement a benchmark such as I suggest?

LaurV 2014-05-22 03:05

[QUOTE=manfred4;373967]seems to be a lot smoother between the exponents and bitlevels.[/QUOTE]
Yes, it is. As explained a few posts above, mfakt[B][U]c[/U][/B] uses (almost?) the same kernel (barrett76?) for all this stuff. The "big drop" in performance you will feel only for very short assignments, or only for bitlevels over 76, when a less [URL="http://weblogs.asp.net/jgalloway/archive/2007/05/10/performant-isn-t-a-word.aspx"]performant[/URL] kernel will be used. Also, as discussed, mfakt[B][U]o[/U][/B] has "lower bitlevel" kernels, optimized to fit the AMD/OpenCL architecture (see Bdot's posts).

NickOfTime 2014-05-22 06:31

Hmm, or how much work to create a BARRETT76_MUL15 kernel? :ermm:

kracker 2014-05-22 14:01

[QUOTE=NickOfTime;373985]Hmm, or how much work to create a BARRETT76_MUL15 kernel? :ermm:[/QUOTE]

Well, I think the question is: How fast or efficient would it be?

Bdot 2014-05-22 16:41

[QUOTE=kracker;373998]Well, I think the question is: How fast or efficient would it be?[/QUOTE]
:grin:
It would be exactly as fast as the 82_MUL15 kernel, because it would need to be implemented like it.

When using 32-bit chunks of data, all current kernels need 3 of them, giving 96 bits. Now, certain short-cuts are possible that reduce the available bits. It's basically using less exact intermediate values during the calculation that are cheaper to compute, like skipping evaluation of some carry flags. Different short-cuts have different costs, but adding them all in brings you down to 76 bits usable out of the 96.

Now, when using 15-bit chunks, you can use 5 of them for 75 "raw" bits, or 6 chunks for 90. Adding in all short-cuts results in 69 and 82 usable bits, respectively. 73 bits is the full implementation with 5 chunks (no short-cuts; there are always small rounding errors that eat one or two bits).

It might be worth checking again, if I can squeeze out 74 bits in 5 chunks - I currently don't remember why I did not succeed the last time ...
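
The chunk arithmetic for the 15-bit kernels can be tabulated. A small sketch using only the usable-bit figures given in this post; the table layout and the `chunks_needed` helper are illustrative and not taken from mfakto's source:

```python
CHUNK_BITS = 15

# Usable bits for the 15-bit variants described above:
# (number of chunks, all short-cuts applied) -> usable bits
USABLE = {
    (5, True): 69,    # 5 chunks, all short-cuts
    (5, False): 73,   # 5 chunks, full implementation
    (6, True): 82,    # 6 chunks, all short-cuts
}

def chunks_needed(bit_max):
    """Fewest 15-bit chunks whose listed usable bits cover factors up to 2^bit_max."""
    fits = [chunks for (chunks, _), usable in USABLE.items() if usable >= bit_max]
    return min(fits) if fits else None

for bit_max in (69, 73, 74, 76, 82):
    n = chunks_needed(bit_max)
    print(f"to 2^{bit_max}: {n} chunks, {n * CHUNK_BITS} raw bits")
```

On these figures anything from 74 to 82 bits needs the sixth chunk, which is why a 76-bit MUL15 kernel would cost the same as the 82-bit one, and why a 74-bit kernel in 5 chunks would be the interesting case.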

NickOfTime 2014-05-23 18:56

[QUOTE=Bdot;374005]:grin:
It would be exactly as fast as the 82_MUL15 kernel, because it would need to be implemented like it.

When using 32-bit chunks of data, all current kernels need 3 of them, giving 96 bits. Now, certain short-cuts are possible that reduce the available bits. It's basically using less exact intermediate values during the calculation that are cheaper to compute, like skipping evaluation of some carry flags. Different short-cuts have different costs, but adding them all in brings you down to 76 bits usable out of the 96.

Now, when using 15-bit chunks, you can use 5 of them for 75 "raw" bits, or 6 chunks for 90. Adding in all short-cuts results in 69 and 82 usable bits, respectively. 73 bits is the full implementation with 5 chunks (no short-cuts; there are always small rounding errors that eat one or two bits).

It might be worth checking again, if I can squeeze out 74 bits in 5 chunks - I currently don't remember why I did not succeed the last time ...[/QUOTE]

Hmm, I guess it depends on how many to 74 exponents are left and how long we will be processing them :-). I seem to be mostly processing 73-74's in the 65/66M range at the moment...

James Heinrich 2014-05-23 20:11

[QUOTE=NickOfTime;374106]Hmm, I guess it depends on how many to 74 exponents are left and how long we will be processing them :-)[/QUOTE]Many, and a long time.
As of 01-May-2014 "many" was approximately 21,176,383 exponents above 65M (for requiring 2[sup]74[/sup]) in PrimeNet range (below 1000M) that are currently TF'd to less than 2[sup]74[/sup] and will eventually need to be taken there. I didn't bother to calculate the THz-years required, but it'll be a bunch.

[SIZE="1"]Small trivia: if we continue TF limits in the current curve, TF for the range between 1000M-4294M will require 1.5 EHz-days (exahertz-days, as in thousand-million GHz-days. That means 1000 TitanBlack/780ti GPUs running continuously for 1000 years.)[/SIZE]

Bdot 2014-05-23 23:16

[QUOTE=James Heinrich;373970]@Bdot: how hard would it be to implement a benchmark such as I suggest?[/QUOTE]
I was extending the --perftest mode of mfakto over the last versions, but so far it is mainly testing the sieving performance, in order to find the best config values.

Doing the performance tests for each kernel is on the list ... Oliver and I discussed that a while ago, in order to have some comparable results. We need to revive that.


And my attempt for a 74_15 kernel comes in less than 1% ahead of the 82_15 kernel, but still misses some factors :bangheadonwall: I will need to use an even more accurate modulo function that will slow down the kernel even more ...

chalsall 2014-05-24 21:10

[QUOTE=Bdot;374132]Doing the performance tests for each kernel is on the list ... Oliver and I discussed that a while ago, in order to have some comparable results. We need to revive that.[/QUOTE]

That would be really cool. What would be even more cool is if such results could then be submitted to Primenet, and then made available to those interested. Perhaps James could help with that.

[QUOTE=Bdot;374132]And my attempt for a 74_15 kernel comes in less than 1% ahead of the 82_15 kernel, but still misses some factors :bangheadonwall: I will need to use an even more accurate modulo function that will slow down the kernel even more ...[/QUOTE]

Not meaning to blow inappropriate sunshine. But what you and Oliver (et al) do is (IMO) quite impressive.

James Heinrich 2014-05-24 21:59

[QUOTE=chalsall;374192]What would be even more cool is if such results could then be submitted to Primenet, and then made available to those interested. Perhaps James could help with that.[/QUOTE]I don't know about on PrimeNet since I'm not all that comfortable with database interactions there, but I'd be happy to make such data available in raw and aggregated form on mersenne.ca

chalsall 2014-05-24 22:07

[QUOTE=James Heinrich;374203]I don't know about on PrimeNet since I'm not all that comfortable with database interactions there, but I'd be happy to make such data available in raw and aggregated form on mersenne.ca[/QUOTE]

LOL... If you could "make it so", it would be appreciated and useful.

kracker 2014-05-24 22:34

[QUOTE=chalsall;374192]
Not meaning to blow inappropriate sunshine. But what you and Oliver (et al) do is (IMO) quite impressive.[/QUOTE]

+1 I agree.

[QUOTE=James Heinrich;374203]I don't know about on PrimeNet since I'm not all that comfortable with database interactions there, but I'd be happy to make such data available in raw and aggregated form on mersenne.ca[/QUOTE]

Maybe you could get a list of exponents with factors "close to the beginning" so to speak and make a list? Then you could run the list, which would take a little time, outputting the console to text? Just a (possibly stupid) idea.

James Heinrich 2014-05-24 23:14

[QUOTE=chalsall;374204]LOL... If you could "make it so", it would be appreciated and useful.[/QUOTE]I can and will, once there is data to play with. Well, I guess I can start working on it but it would be easiest if I had some idea what eventual form the proposed new benchmark mode in mfakt_ would take.

[QUOTE=kracker;374210]Maybe you could get a list of exponents with factors "close to the beginning" so to speak and make a list? Then you could run the list, which would take a little time, outputting the console to text? Just a (possibly stupid) idea.[/QUOTE]I'm not sure I follow? :unsure:
[i](thinks about it some more)[/i]
Oh, you mean a list of exponents with factors that would be found in the first (few) class(es) in mfakt_ and use that as my own benchmark profile. Hmm. I should be able to do that. Actually about a year ago I dredged up a list of a few thousand (or maybe it was million) exponents with easy-to-find factors for Bdot. I'll have to dig up that list, or better-yet, the code used to generate that list. For the purpose of analyzing throughput it would probably be best to [i]not[/i] have a factor in the first class, but maybe 2-3 classes in. In the "real" benchmark the presence of a factor wouldn't be required, the exponent could simply be aborted after running for a few classes (whatever is sufficient to get a stable throughput estimate).

kracker 2014-05-25 00:38

[QUOTE=James Heinrich;374214]
I'm not sure I follow? :unsure:
[i](thinks about it some more)[/i]
Oh, you mean a list of exponents with factors that would be found in the first (few) class(es) in mfakt_ and use that as my own benchmark profile. Hmm. I should be able to do that. Actually about a year ago I dredged up a list of a few thousand (or maybe it was million) exponents with easy-to-find factors for Bdot. I'll have to dig up that list, or better-yet, the code used to generate that list. For the purpose of analyzing throughput it would probably be best to [i]not[/i] have a factor in the first class, but maybe 2-3 classes in. In the "real" benchmark the presence of a factor wouldn't be required, the exponent could simply be aborted after running for a few classes (whatever is sufficient to get a stable throughput estimate).[/QUOTE]

Yes... that was exactly what I was thinking. :smile:

Bdot did add a lot of self-tests for -st2 in 0.12, see [URL="https://github.com/Bdot42/mfakto/commit/60f984af31dd1430a6cd640c50887d520f11ef87"]this[/URL] and [URL="https://raw.githubusercontent.com/Bdot42/mfakto/master/src/selftest-data.h"]this[/URL] :razz:

James Heinrich 2014-05-25 02:51

[QUOTE=kracker;374218]Bdot did add a lot of self-tests for -st2 in 0.12[/QUOTE]Ah, well, there you go then. He even blames me for them. :smile:
I guess it was closer to 2 years ago that I did that.

TheMawn 2014-05-25 17:54

I have returned. Grabbing some factoring jobs and back to crunching.

kracker 2014-05-25 18:19

[QUOTE=James Heinrich;374228]Ah, well, there you go then. He even blames me for them. :smile:
I guess it was closer to 2 years ago that I did that.[/QUOTE]

Well, mfaktc apparently added it as well. (0.18)
[code]
- new commandline option: "-st2" runs a even longer selftest with *new*
testcases
[/code]

James Heinrich 2014-05-30 13:04

[QUOTE=James Heinrich;374214]a list of exponents with factors that would be found in the first (few) class(es) in mfakt_[/QUOTE]I spent the last few days going through 100-million factors and tagging them with the mfaktc class (out of 4620) where the factor would be found. Actually calculating the class is trivial, took not more than 10 minutes. The other dozen hours was wrangling the database to make a new column and stuff the calculated value into 100+ million records. :smile:
The upside is that I can now easily pull together a list of factors that will be found in any particular class. Not entirely sure how this is useful, but I'll try making a list as [i]kracker[/i] suggested for benchmarking.
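
For reference, the class of a known factor can be computed directly. A minimal sketch, assuming the usual form q = 2kp + 1 for a factor q of M(p) and that the class is k mod 4620, as the "out of 4620" above suggests. `factor_class` is a hypothetical helper name, and the two example factors are taken from the client log posted later in this thread:

```python
NUM_CLASSES = 4620  # class count mentioned above

def factor_class(p, q, num_classes=NUM_CLASSES):
    """Class of factor q of M(p), taking q = 2*k*p + 1 and class = k mod num_classes."""
    k, remainder = divmod(q - 1, 2 * p)
    if remainder != 0:
        raise ValueError("q is not of the form 2*k*p + 1")
    return k % num_classes

print(factor_class(10515217, 30604828569994966241))
print(factor_class(10720453, 46314683458850504129))
```

These come out as 20 and 8, matching the last class shown before each factor is reported in that log.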

chalsall 2014-05-30 16:03

LLTF Category 4...
 
Just so everyone knows...

We are again going to have difficulty keeping the Category 4 "churners" fed with candidates all TF'ed to 74 -- at least for a while.

Thus, I have instructed "Spidy" to stop reclaiming candidates already TFed to 73.

Additionally, MISFIT users who use the "Let GPU72 Decide" option are now being given assignments to 73 instead of 74. These will be kept by GPU72 as part of its "rip-cord", and only released if anything below 73 is about to be released by Primenet for LL'ing or P-1'ing (or, of course, if someone then takes them up to 74).

I am hoping this will only be needed for a month or so. And, as I said above, while ~1,000 candidates are requested per day in the Cat 4 range, only ~72 per day are actually completed. Or, in other words, we should be able to reclaim most of them again in the future, and TF appropriately (possibly once they enter the Cat 3 range).

As always, thanks to everyone contributing fire-power. And for any who can bring additional fire-power to bear, it would be welcome, needed and appreciated. :smile:

(BTW, we're looking good in the DCTF Cat 4 range. Pretty much steady-state with the fire-power we have working there currently.)

kladner 2014-05-30 17:36

I will load up with LLTF, then. I've done a few days of DCTF, but those will clear by this evening some time.

TheMawn 2014-05-30 18:33

[QUOTE=chalsall;374603]We are again going to have difficulty keeping the Category 4 "churners" fed with candidates all TF'ed to 74 -- at least for a while.[/QUOTE]

What happened? Did we just fall behind?

kracker 2014-05-30 18:36

[QUOTE=TheMawn;374617]What happened? Did we just fall behind?[/QUOTE]

probably Oliver, we don't have his 20,000 GHz/days anymore :smile:

chalsall 2014-05-30 18:51

[QUOTE=TheMawn;374617]What happened? Did we just fall behind?[/QUOTE]

The issue is that many requests for Cat 4 never complete. Further, many of the requests are by way of "Manual Assignments", most of which are made by "Anonymous", and yet are valid for 180 days.

I earlier sent a PM to George suggesting he consider lowering the Cat 4 cut-off back down to 50,000 to deal with this. But the decision is his.

James Heinrich 2014-05-30 22:23

1 Attachment(s)
[QUOTE=James Heinrich;374581]I'll try making a list as [i]kracker[/i] suggested for benchmarking.[/QUOTE]Here's my first attempt. Contains the first-found factor (excluding factors found in class-0) for an exponent in (10M < exponent < 100M) with factor sizes between 2[sup]64[/sup] and 2[sup]80[/sup] (where available). There are 1289 entries. You can run the benchmark as[code]mfaktc-win-32.exe >> benchmark.txt[/code]Note with the double >> it will append to the output file so you can Ctrl-C to abort the benchmark and later resume it.
Benchmark should be run with no other GPU load present, obviously.
mfaktc.ini must have [i]PrintMode=0[/i] please.
Having mfaktc0.20-standard header config would be good for consistency:[code]# mfaktc 0.20 (default)
ProgressHeader=Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
ProgressFormat=%d %T | %C %p%% | %t %e | %g %s %W%%[/code]

As of yet I don't have anything useful to do with the data, but if anyone wants to run through the benchmark set and email me the results I'll see if I can massage the output into something resembling useful data.
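
One way to massage such output is to pull the GHz-d/day column out of the progress lines for each assignment. A rough sketch, assuming the header format above and the "Starting trial factoring M... from 2^a to 2^b" line that the client prints (as in the log posted later in this thread); the regular expressions, the `summarize` name and the choice of the median are assumptions:

```python
import re
import statistics

START = re.compile(r"Starting trial factoring M(\d+) from 2\^(\d+) to 2\^(\d+)")
# Columns: Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
PROGRESS = re.compile(r"^\w{3}\s+\d+\s+\d+:\d+\s*\|[^|]*\|[^|]*\|\s*([\d.]+)\s")

def summarize(lines):
    """Return {(exponent, bit_min, bit_max): median GHz-d/day} from client output."""
    rates, current = {}, None
    for line in lines:
        m = START.search(line)
        if m:
            current = tuple(int(g) for g in m.groups())
            rates[current] = []
            continue
        m = PROGRESS.match(line)
        if m and current is not None:
            rates[current].append(float(m.group(1)))
    return {key: statistics.median(vals) for key, vals in rates.items() if vals}

sample = """\
Starting trial factoring M10515217 from 2^64 to 2^65 (0.36GHz-days)
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 30 20:24 | 0 0.1% | 0.160 2m33s | 199.87 88629 0.00%
May 30 20:24 | 8 0.2% | 0.170 2m43s | 188.12 88629 0.00%
May 30 20:24 | 12 0.3% | 0.170 2m43s | 188.12 88629 0.00%
""".splitlines()
print(summarize(sample))
```

The median is used so that a single fast or slow first class does not skew the figure for a short run.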

TheMawn 2014-05-30 23:25

You have dozens of lines like

Factor=10409843,79,80

That benchmark will take [I]weeks[/I] to complete...

kracker 2014-05-30 23:45

[QUOTE=TheMawn;374646]You have dozens of lines like

Factor=10409843,79,80

That benchmark will take [I]weeks[/I] to complete...[/QUOTE]

Have you actually run it? :wink:

James Heinrich 2014-05-31 01:11

[QUOTE=TheMawn;374646]That benchmark will take [I]weeks[/I] to complete...[/QUOTE]I haven't actually run it through myself, but the idea is that each exponent listed [i]does[/i] have a factor, and where possible that factor will be found within the first few classes and so runtime per exponent should be mostly within the order of seconds-to-minutes. There are undoubtedly some worst-case instances where, due to having a poor selection of known factors to choose from, the factor won't be found until later in the run, with each iteration taking considerable time. I haven't (yet) checked for expected runtime for each case. Perhaps that is something I should build into my benchmark-generator.

James Heinrich 2014-05-31 02:33

1 Attachment(s)
[QUOTE=James Heinrich;374636]Here's my first attempt.[/QUOTE]Here's my second version. Same basic file, but omitting any assignment that would take more than 1 GHz-day. That takes the count down from 1289 to 868 entries.
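
A cut-off like this can be approximated from the worktodo lines alone. The sketch below estimates the GHz-days credit per bit level and drops entries above a limit. The formula and its constant are an assumption, checked here only against the two figures printed in the client log later in this thread (0.36 and 0.70 GHz-days), and `worth_running` is a hypothetical name. Whether the cut-off was applied to the full bit level or to the work up to the known factor is not stated; this sketch uses the full bit level.

```python
def tf_ghz_days(exponent, bit_max):
    """Approximate credit for one bit level, 2^(bit_max-1) to 2^bit_max.

    Assumed formula and constant; intended for bit_max >= 65 only.
    """
    return 0.016968 * 2 ** (bit_max - 48) * 1680 / exponent

def parse_factor_line(line):
    """'Factor=10409843,79,80' -> (10409843, 79, 80)"""
    exponent, bit_min, bit_max = (int(x) for x in line.split("=", 1)[1].split(","))
    return exponent, bit_min, bit_max

def worth_running(line, limit_ghz_days=1.0):
    """True if the whole assignment is estimated at or below the limit."""
    exponent, bit_min, bit_max = parse_factor_line(line)
    total = sum(tf_ghz_days(exponent, b) for b in range(bit_min + 1, bit_max + 1))
    return total <= limit_ghz_days

entries = ["Factor=10515217,64,65", "Factor=10720453,65,66", "Factor=10409843,79,80"]
print([e for e in entries if worth_running(e)])
```

Under this estimate the 79-80 entry quoted earlier is several thousand GHz-days if run to completion, so it is filtered out while the two small ones stay.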

LaurV 2014-05-31 03:17

Pure stupid question, I just woke up and didn't have my coffee yet, but I don't get it why the class-0 has to be excluded (?!) Forgive my brain fart if any.

kracker 2014-05-31 03:19

[QUOTE=LaurV;374660]Pure stupid question, I just woke up and didn't have my coffee yet, but I don't get it why the class-0 has to be excluded (?!) Forgive my brain fart if any.[/QUOTE]

[URL="http://mersenneforum.org/showpost.php?p=374214&postcount=2950"]see:[/URL]

At least for my GPU, it is fastest after ~10 sec(warmup?), not the very beginning. :smile:

EDIT: even with PrintMode=1...

[code]
mfakto 0.14-MGW (64bit build)


Runtime options
Inifile mfakto.ini
Verbosity 1
SieveOnGPU yes
MoreClasses yes
GPUSievePrimes 88629
GPUSieveProcessSize 24Ki bits
GPUSieveSize 96Mi bits
FlushInterval 8
WorkFile worktodo.txt
ResultsFile results.txt
Checkpoints enabled
CheckpointDelay 300s
Stages enabled
StopAfterFactor class
PrintMode compact
V5UserID none
ComputerID none
TimeStampInResults no
VectorSize 2
GPUType AUTO
SmallExp no
UseBinfile mfakto_Kernels.elf
Compiletime options
Select device - Get device info - Loading binary kernel file mfakto_Kernels.elf
Compiling kernels.

OpenCL device info
name Capeverde (Advanced Micro Devices, Inc.)
device (driver) version OpenCL 1.2 AMD-APP (1526.3) (1526.3 (VM))
maximum threads per block 256
maximum threads per grid 16777216
number of multiprocessors 10 (640 compute elements)
clock rate 1020MHz

Automatic parameters
threads per grid 256
optimizing kernels for GCN

Started a simple selftest ...
######### testcase 1/17 (M60003157[58-59]) #########
######### testcase 2/17 (M60008017[58-59]) #########
######### testcase 3/17 (M60009827[58-59]) #########
######### testcase 4/17 (M50863909[69-70]) #########
######### testcase 5/17 (M51375383[69-70]) #########
######### testcase 6/17 (M51406301[70-71]) #########
######### testcase 7/17 (M47644171[70-71]) #########
######### testcase 8/17 (M51038681[71-72]) #########
######### testcase 9/17 (M49717271[71-72]) #########
######### testcase 10/17 (M50752613[72-73]) #########
######### testcase 11/17 (M50908933[72-73]) #########
######### testcase 12/17 (M53076719[73-74]) #########
######### testcase 13/17 (M53123843[74-75]) #########
######### testcase 14/17 (M60009109[34-35]) #########
######### testcase 15/17 (M60002273[34-35]) #########
######### testcase 16/17 (M60004333[63-64]) #########
######### testcase 17/17 (M3321928703[90-91]) #########
Selftest statistics
number of tests 108
successful tests 108

selftest PASSED!

got assignment: exp=10515217 bit_min=64 bit_max=65 (0.36 GHz-days)
Starting trial factoring M10515217 from 2^64 to 2^65 (0.36GHz-days)
Using GPU kernel "cl_barrett15_69_gs_2"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 30 20:24 | 0 0.1% | 0.160 2m33s | 199.87 88629 0.00%
May 30 20:24 | 8 0.2% | 0.170 2m43s | 188.12 88629 0.00%
May 30 20:24 | 12 0.3% | 0.170 2m43s | 188.12 88629 0.00%
May 30 20:24 | 15 0.4% | 0.170 2m43s | 188.12 88629 0.00%
May 30 20:24 | 20 0.5% | 0.170 2m42s | 188.12 88629 0.00%

M10515217 has a factor: 30604828569994966241

found 1 factor for M10515217 from 2^64 to 2^65 (partially tested) [mfakto 0.14-MGW cl_barrett15_69_gs_2]
tf(): total time spent: 0.840s (36548.21 GHz-days / day)

got assignment: exp=10720453 bit_min=65 bit_max=66 (0.70 GHz-days)
Starting trial factoring M10720453 from 2^65 to 2^66 (0.70GHz-days)
Using GPU kernel "cl_barrett15_69_gs_2"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 30 20:24 | 0 0.1% | 0.340 5m26s | 184.51 88629 0.00%
May 30 20:24 | 3 0.2% | 0.330 5m16s | 190.11 88629 0.00%
May 30 20:24 | 8 0.3% | 0.340 5m25s | 184.51 88629 0.00%

M10720453 has a factor: 46314683458850504129

found 1 factor for M10720453 from 2^65 to 2^66 (partially tested) [mfakto 0.14-MGW cl_barrett15_69_gs_2]
tf(): total time spent: 1.010s (59629.21 GHz-days / day)

got assignment: exp=10030583 bit_min=66 bit_max=67 (1.49 GHz-days)
Starting trial factoring M10030583 from 2^66 to 2^67 (1.49GHz-days)
Using GPU kernel "cl_barrett15_69_gs_2"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 30 20:24 | 0 0.1% | 0.710 11m21s | 188.87 88629 0.00%
May 30 20:24 | 12 0.2% | 0.710 11m20s | 188.87 88629 0.00%
May 30 20:24 | 13 0.3% | 0.710 11m19s | 188.87 88629 0.00%
May 30 20:24 | 16 0.4% | 0.710 11m19s | 188.87 88629 0.00%
May 30 20:24 | 21 0.5% | 0.710 11m18s | 188.87 88629 0.00%
May 30 20:24 | 25 0.6% | 0.700 11m08s | 191.57 88629 0.00%
May 30 20:24 | 28 0.7% | 0.720 11m26s | 186.25 88629 0.00%
May 30 20:24 | 33 0.8% | 0.710 11m16s | 188.87 88629 0.00%
May 30 20:24 | 37 0.9% | 0.700 11m06s | 191.57 88629 0.00%

M10030583 has a factor: 77685810496363030583

found 1 factor for M10030583 from 2^66 to 2^67 (partially tested) [mfakto 0.14-MGW cl_barrett15_69_gs_2]
tf(): total time spent: 6.390s (20146.36 GHz-days / day)

got assignment: exp=10715053 bit_min=67 bit_max=68 (2.79 GHz-days)
Starting trial factoring M10715053 from 2^67 to 2^68 (2.79GHz-days)
Using GPU kernel "cl_barrett15_69_gs_2"

Found a valid checkpoint file.
last finished class was: 15

Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 30 20:24 | 20 0.6% | 1.330 21m09s | 188.77 88629 0.00%
May 30 20:24 | 23 0.7% | 1.330 21m07s | 188.77 88629 0.00%
May 30 20:24 | 27 0.8% | 1.330 21m06s | 188.77 88629 0.00%
May 30 20:24 | 36 0.9% | 1.330 21m05s | 188.77 88629 0.00%
[/code]

LaurV 2014-05-31 03:45

The things are not related. I mean what you (kracker) say, and what I say. (I am still at the coffee :smile:)

The benchmark is good as it is.

But I just complained (read: nitpicking) about the fact that class-0 has no reason to be excluded. There are plenty of factors in class 0 too:

[CODE]
gp> d=4620; p=7; while(1, p=nextprime(p+1); forstep(k=d,10^10,4620,q=2*k*p+1; if([0,1,0,0,0,0,0,1][q%8+1],if(Mod(2,q)^p==1, print(p","q),
break))); printf("...%d...%c",p,13))
11621,107378041
294403,2720283721
1797589,16609722361
1995317,18436729081
2233547,20637974281
2639761,24391391641
3361639,31061544361
3391831,31340518441
3395123,31370936521
4022371,37166708041
4785769,44220505561
4797073,44324954521
4902077,45295191481
5114959,47262221161
5735981,53000464441
5817389,53752674361
5863271,54176624041
6082619,56203399561
7651261,70697651641
7907231,73062814441
8181751,75599379241
...8708663...
*** at top-level: d=4620;p=7;while(1,p=nextprime(p+1);f
*** ^--------------------
*** user interrupt after 45,481 ms.[/CODE]In the code above, "d" is the class and you can supply any number between 1 and 4619. Of course, for class-0, you have to supply 4620, otherwise all factors are 2*0*p+1=1, and you may get some error - maybe that was James' issue when he queried the DB ? :razz:

edit: Or does mfaktX test class 0 at the end, as in my example? (I don't remember, but I think it is tested first.)
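
For readers without PARI/GP, the divisibility test used in the session above can be sketched in Python: a candidate q = 2kp + 1 divides 2^p - 1 exactly when 2^p is 1 modulo q, and its class is k mod 4620. The function names are illustrative only and are not taken from mfaktX.

```python
NUM_CLASSES = 4620  # 4 * 3 * 5 * 7 * 11, the class count used in the session above

def divides_mersenne(p: int, q: int) -> bool:
    """True if q divides 2**p - 1, checked by modular exponentiation."""
    return pow(2, p, q) == 1

def k_class(p: int, q: int, num_classes: int = NUM_CLASSES) -> int:
    """Class of a candidate q = 2*k*p + 1, defined as k mod num_classes."""
    k, remainder = divmod(q - 1, 2 * p)
    if remainder != 0:
        raise ValueError("q is not of the form 2*k*p + 1")
    return k % num_classes

# Small textbook cases: 2**11 - 1 = 23 * 89, and 47 divides 2**23 - 1.
for p, q in [(11, 23), (11, 89), (23, 47)]:
    print(p, q, divides_mersenne(p, q), k_class(p, q))

# First pair printed by the PARI/GP session: k = 4620, which is class 0.
print(k_class(11621, 107378041))
```

Supplying d=4620 in the PARI/GP loop corresponds to `k_class(...) == 0` here, which is why that run only ever reports class-0 factors.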

NickOfTime 2014-05-31 04:30

hmm, It might be easier to just parse the results.txt and just have people supply the benchmark.txt for the startup variables/card specs :-)

hmm, mfakto showing 87792 GHz-days / day :)

[CODE]got assignment: exp=10515217 bit_min=64 bit_max=65 (0.36 GHz-days)
Starting trial factoring M10515217 from 2^64 to 2^65 (0.36GHz-days)
Using GPU kernel "cl_barrett15_69_gs_2"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 30 22:58 | 0 0.1% | 0.124 1m59s | 257.90 82485 0.00%
May 30 22:58 | 8 0.2% | 0.120 1m55s | 266.50 82485 0.00%
May 30 22:58 | 12 0.3% | 0.118 1m53s | 271.01 82485 0.00%
May 30 22:58 | 15 0.4% | 0.118 1m53s | 271.01 82485 0.00%
May 30 22:58 | 20 0.5% | 0.117 1m52s | 273.33 82485 0.00%
M10515217 has a factor: 30604828569994966241
found 1 factor for M10515217 from 2^64 to 2^65 (partially tested) [mfakto 0.14-Win cl_barrett15_69_gs_2]
tf(): total time spent: 0.689s (44558.05 GHz-days / day)

got assignment: exp=10720453 bit_min=65 bit_max=66 (0.70 GHz-days)
Starting trial factoring M10720453 from 2^65 to 2^66 (0.70GHz-days)
Using GPU kernel "cl_barrett15_69_gs_2"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 30 22:58 | 0 0.1% | 0.227 3m38s | 276.37 82485 0.00%
May 30 22:58 | 3 0.2% | 0.222 3m33s | 282.59 82485 0.00%
May 30 22:58 | 8 0.3% | 0.223 3m33s | 281.32 82485 0.00%
M10720453 has a factor: 46314683458850504129
found 1 factor for M10720453 from 2^65 to 2^66 (partially tested) [mfakto 0.14-Win cl_barrett15_69_gs_2]
tf(): total time spent: 0.686s (87792.28 GHz-days / day)[/CODE][CODE][Fri May 30 22:58:51 2014]
M10515217 has a factor: 30604828569994966241 [TF:64:65*:mfakto 0.14-Win cl_barrett15_69_gs_2]
found 1 factor for M10515217 from 2^64 to 2^65 (partially tested) [mfakto 0.14-Win cl_barrett15_69_gs_2]
M10720453 has a factor: 46314683458850504129 [TF:65:66*:mfakto 0.14-Win cl_barrett15_69_gs_2]
found 1 factor for M10720453 from 2^65 to 2^66 (partially tested) [mfakto 0.14-Win cl_barrett15_69_gs_2]
M10030583 has a factor: 77685810496363030583 [TF:66:67*:mfakto 0.14-Win cl_barrett15_69_gs_2]
found 1 factor for M10030583 from 2^66 to 2^67 (partially tested) [mfakto 0.14-Win cl_barrett15_69_gs_2]
[Fri May 30 22:59:31 2014][/CODE]

hmm, I guess the timestamps are not written after every result in results.txt

axn 2014-05-31 06:58

[QUOTE=chalsall;374603]And for any who can bring additional fire-power to bear, it would be welcome, needed and appreciated. :smile:[/QUOTE]

We _really_ need that front page pimpage on GIMPS. George?

NickOfTime 2014-05-31 11:01

[QUOTE=NickOfTime;374665]hmm, It might be easier to just parse the results.txt and just have people supply the benchmark.txt for the startup variables/card specs :-)

hmm, mfacto showing 87792 GHz-days / day :)
[/QUOTE]

probably want to remove Factor=49864411,75,76
from the benchmark; my 290x has spent 2h on it so far, and it is also duplicated..

James Heinrich 2014-05-31 12:52

[QUOTE=LaurV;374660]I don't get it why the class-0 has to be excluded (?!)[/QUOTE]No real reason, I just wanted to give it an opportunity to run through at least one class before being aborted by finding a factor. If a factor is found after 0.001ms of runtime on an exponent I wouldn't put much faith in the displayed processing rate.
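
A rough sketch of why the displayed rate is not trustworthy after an early factor hit, using the numbers from kracker's log. It assumes the summary line divides the full assignment credit by the elapsed time, which matches the log only to within the rounding of the displayed 0.36 GHz-days; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 86400

def ghz_days_per_day(credit_ghz_days: float, seconds: float) -> float:
    """Throughput implied by earning `credit_ghz_days` in `seconds` of runtime."""
    return credit_ghz_days * SECONDS_PER_DAY / seconds

# Figures from kracker's log for M10515217 (2^64 to 2^65, shown as 0.36 GHz-days).
credit = 0.36
classes_to_test = 960      # classes left to test out of 4620 (MoreClasses yes)
seconds_per_class = 0.170  # steady per-class time in the log

# Rate if the whole bit level is actually worked through.
full_run = ghz_days_per_day(credit, classes_to_test * seconds_per_class)

# Rate if the full credit is divided by the 0.840 s it took to hit a factor.
early_exit = ghz_days_per_day(credit, 0.840)

print(f"full run:   {full_run:9.2f} GHz-d/day")
print(f"early exit: {early_exit:9.2f} GHz-d/day")
```

The first figure lands near the per-class rate in the log and the second near the `tf()` summary line, so the huge number reflects when the factor turned up rather than how fast the card is.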

Looking at the 60M-70M range, this is the approximate distribution of the first few classes. It's not even, with some classes having many more than others, and several (2, 6, 10, 14, 18, 22, etc) having none at all. Class-0 has some, but not nearly as many as class 1 or 3.[code]+-----------+---------+
| class4620 | howmany |
+-----------+---------+
| 0 | 276 |
| 1 | 19788 |
| 3 | 12456 |
| 4 | 9254 |
| 5 | 4850 |
| 7 | 3130 |
| 8 | 4565 |
| 9 | 3926 |
| 11 | 1803 |
| 12 | 5790 |
| 13 | 1550 |
| 15 | 3222 |
| 16 | 2209 |
| 17 | 1050 |
| 19 | 993 |
| 20 | 2434 |
| 21 | 2085 |
| 23 | 784 |
| 24 | 2907 |
| 25 | 957 |
| 27 | 1261 |
| 28 | 1544 |
| 29 | 615 |
| 31 | 574 |
| 32 | 1092 |[/code]I'm sure there's a perfectly good mathy explanation for it that I wouldn't understand, but I just look at the pretty numbers. :smile:

James Heinrich 2014-05-31 13:17

1 Attachment(s)
[QUOTE=NickOfTime;374680]probably want to remove
Factor=49864411,75,76
from benchmark, my 290x spend 2h on it so far and it also is duplicated..[/QUOTE]You're right, that one is incorrect. The factor is actually just below 2[sup]75[/sup] at 74.9999766 bits so won't actually be found at all. (The value is stored in my database rounded to 5 decimal places, so ended up as 75.0000). Please remove that test(s).

3rd revision attached with additional checking for this kind of borderline case.

NickOfTime 2014-05-31 16:26

[QUOTE=James Heinrich;374686]You're right, that one is incorrect. The factor is actually just below 2[sup]75[/sup] at 74.9999766 bits so won't actually be found at all. (The value is stored in my database rounded to 5 decimal places, so ended up as 75.0000). Please remove that test(s).

3rd revision attached with additional checking for this kind of borderline case.[/QUOTE]

well V2 results...for 290x
Another one to remove...
no factor for M88345373 from 2^65 to 2^66 [mfakto 0.14-Win cl_barrett15_69_gs_2]

[URL="https://dl.dropboxusercontent.com/u/123993183/benchmark_290x.zip"]DropBox 290x Benchmark zipped[/URL]

kracker 2014-05-31 17:18

[QUOTE=James Heinrich;374686]You're right, that one is incorrect. The factor is actually just below 2[sup]75[/sup] at 74.9999766 bits so won't actually be found at all. (The value is stored in my database rounded to 5 decimal places, so ended up as 75.0000). Please remove that test(s).

3rd revision attached with additional checking for this kind of borderline case.[/QUOTE]

Started on my GPU. :smile:

TheJudger 2014-05-31 17:44

[QUOTE=James Heinrich;374683]I'm sure there's a perfectly good mathy explanation for it that I wouldn't understand, but I just look at the pretty numbers. :smile:[/QUOTE]

Preface: (residue-)classes in mfaktX are about the k in the factor candidate (FC) = 2kp+1, not about the FC itself.
[LIST][*]The probability of a factor between 2[SUP]n[/SUP] and 2[SUP]n+1[/SUP] is 1/n. The smallest possible FC is the one with k=1, and it has the highest chance to be a factor of 2[SUP]p[/SUP]-1[*]Ignoring p=2, all primes p are 1, 3, 5 or 7 mod 8. For k = 2 all FCs are 5 mod 8 and this disqualifies the whole class for all p because FCs must be 1 or 7 mod 8[LIST][*] 2 * k * p + 1[*] 2 * 2 * 1 + 1 = 5 mod 8[*] 2 * 2 * 3 + 1 = 5 mod 8[*] 2 * 2 * 5 + 1 = 5 mod 8[*] 2 * 2 * 7 + 1 = 5 mod 8[/LIST][*]some classes are more common than other classes, e.g. class 0 is valid for ALL p thus you should have more factors in this class[*]if you stop after a factor is found you'll find more factors in the classes you start with.[/LIST]
So if you ignore relatively small factors, your statistics should be more evenly distributed (but still some classes are more common than others).
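
The mod-8 argument in the list above can be checked mechanically. This is only a small sketch with made-up names; it looks at the mod-8 rule alone and ignores the additional filtering on 3, 5, 7 and 11.

```python
def candidate_mod8(k: int, p: int) -> int:
    """Residue of the factor candidate 2*k*p + 1 modulo 8."""
    return (2 * k * p + 1) % 8

ODD_RESIDUES = (1, 3, 5, 7)   # every odd prime p falls in one of these mod 8
ALLOWED = {1, 7}              # factors of 2**p - 1 (p an odd prime) are 1 or 7 mod 8

# k = 2: every candidate is 5 mod 8, so the whole class is dead for all p.
print({candidate_mod8(2, p) for p in ODD_RESIDUES})

# Classes that can never hold a factor, judging by the mod-8 rule only.
dead = [k for k in range(16)
        if all(candidate_mod8(k, p) not in ALLOWED for p in ODD_RESIDUES)]
print(dead)
```

The dead classes are those with k = 2 mod 4, which lines up with the empty classes 2, 6, 10, 14, ... in James' table.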

Oliver

TheMawn 2014-05-31 17:51

[QUOTE=axn;374673]We _really_ need that front page pimpage on GIMPS. George?[/QUOTE]

I agree. I only found out about GPU72 by looking up the top teams in trial factoring. I had an old i7-920 box chugging away a long time ago and I'd have brought the HD 5770 in as well had I known back then.

kracker 2014-05-31 23:16

I was thinking: it probably would/might be best to skip every other M? I don't think there is a whole lot of difference. 6h later on my slow (160 GHz) I'm at 36M. :smile:

James Heinrich 2014-06-01 00:18

It would probably be just fine to only do every 5th M, with possible later refinement around the "interesting" transition areas when those are identified for various architectures.

NickOfTime 2014-06-01 18:11

[QUOTE=NickOfTime;374699]well V2 results...for 290x
Another one to remove...
no factor for M88345373 from 2^65 to 2^66 [mfakto 0.14-Win cl_barrett15_69_gs_2]

[URL="https://dl.dropboxusercontent.com/u/123993183/benchmark_290x.zip"]DropBox 290x Benchmark ziped[/URL][/QUOTE]

yep
Factor=88345373,65,66
should be
Factor=88345373,64,65

[CODE]OpenCL device info
name Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz (GenuineIntel)
device (driver) version OpenCL 1.2 AMD-APP (1411.4) (1411.4 (sse2))
maximum threads per block 1024
maximum threads per grid 1073741824
number of multiprocessors 8 (512 compute elements)
clock rate 3105MHz

Automatic parameters
threads per grid 2097152
optimizing kernels for GCN

got assignment: exp=88345373 bit_min=64 bit_max=65 (0.04 GHz-days)
Starting trial factoring M88345373 from 2^64 to 2^65 (0.04GHz-days)
Using GPU kernel "cl_barrett15_69_gs"
No checkpoint file "M88345373.ckp" found.
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Jun 01 13:08 | 0 0.1% | 1.263 20m11s | 3.01 82485 0.00%
Jun 01 13:08 | 3 0.2% | 1.270 20m17s | 3.00 82485 0.00%
Jun 01 13:08 | 7 0.3% | 1.260 20m06s | 3.02 82485 0.00%
Jun 01 13:08 | 12 0.4% | 1.265 20m09s | 3.01 82485 0.00%
M88345373 has a factor: 36893056995103492273
found 1 factor for M88345373 from 2^64 to 2^65 (partially tested) [mfakto 0.13-Win cl_barrett15_69_gs_2]
tf(): total time spent: 5.062s (721.87 GHz-days / day)[/CODE]
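
Both bad benchmark entries so far were factors sitting a hair below a power of two. A minimal sketch of a check that avoids the problem by taking the bit range from the integer itself instead of from a rounded logarithm; the four-decimal rounding is my reading of the "75.0000" quoted earlier and is an assumption, as is the helper name.

```python
import math

def bit_range(factor: int) -> tuple:
    """Return (bit_min, bit_max) such that 2**bit_min <= factor < 2**bit_max."""
    bits = factor.bit_length()
    return bits - 1, bits

# Factor of M88345373 reported in the log above.
factor = 36893056995103492273

print(round(math.log2(factor), 4))  # a rounded logarithm can land on the boundary
print(bit_range(factor))            # integer arithmetic keeps it in the right range
```

With the range taken from `bit_length()`, the worktodo line comes out as `Factor=88345373,64,65`, as in the correction above.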

chalsall 2014-06-03 17:02

[QUOTE=TheJudger;374708]So if you ignore relative small factors your statistics should be more evenly distributed (but still some classes are more common than others).[/QUOTE]

Oliver et al, please forgive me if this is a really stupid question/suggestion...

But might there be upside (defined as factors found per GPU cycle) in deriving an empirical statistical distribution "curve" of the probability of finding a factor in each class for the ranges and depths we're currently working, and do them in that order rather than sequentially?

Taking James' data above, the optimal order to do the classes would be:[CODE]+---------+-----------+
| howmany | class4620 |
+---------+-----------+
| 19788 | 1 |
| 12456 | 3 |
| 9254 | 4 |
| 5790 | 12 |
| 4850 | 5 |
| 4565 | 8 |
| 3926 | 9 |
| 3222 | 15 |
| 3130 | 7 |
| 2907 | 24 |
| 2434 | 20 |
| 2209 | 16 |
| 2085 | 21 |
| 1803 | 11 |
| 1550 | 13 |
| 1544 | 28 |
| 1261 | 27 |
| 1092 | 32 |
| 1050 | 17 |
| 993 | 19 |
| 957 | 25 |
| 784 | 23 |
| 615 | 29 |
| 574 | 31 |
| 276 | 0 |[/CODE]

Perhaps have mfaktX be able to import text files containing current data for different ranges and/or TF depth (since this data is not complete, nor representative of where we're currently working)?

Again, please forgive me (and tell me) if this is a really stupid idea. I can't hold a candle to you guys when it comes to the maths. :smile:
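
For what it's worth, the reordering itself is trivial to express. The sketch below only sorts James' posted counts; it is not how mfaktX schedules classes, and whether the ordering buys anything is exactly the open question.

```python
# Factor counts per class from James' first table (exponents in the 60M-70M range).
factors_per_class = {
    0: 276, 1: 19788, 3: 12456, 4: 9254, 5: 4850, 7: 3130, 8: 4565,
    9: 3926, 11: 1803, 12: 5790, 13: 1550, 15: 3222, 16: 2209, 17: 1050,
    19: 993, 20: 2434, 21: 2085, 23: 784, 24: 2907, 25: 957, 27: 1261,
    28: 1544, 29: 615, 31: 574, 32: 1092,
}

def class_order(counts: dict) -> list:
    """Classes sorted by descending historical factor count (ties by class number)."""
    return sorted(counts, key=lambda cls: (-counts[cls], cls))

order = class_order(factors_per_class)
print(order)
```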

James Heinrich 2014-06-03 17:48

Perhaps I should retract that posting of data. It includes a lot of data on factors irrelevant to our current interest (mfaktx in the approx 2[sup]72[/sup] range), including some P-1 factors and a lot of very tiny factors (<2[sup]60[/sup]). Filtering that out, this data is perhaps more relevant:[code]mysql> SELECT COUNT(*) AS `howmany`, `class4620` FROM `known_factors_006` WHERE (`factorbits` BETWEEN 68 AND 74) GROUP BY `class4620` ORDER BY `howmany` DESC LIMIT 30;
+---------+-----------+
| howmany | class4620 |
+---------+-----------+
| 26 | 1680 |
| 22 | 1344 |
| 22 | 504 |
| 20 | 3696 |
| 20 | 252 |
| 20 | 1380 |
| 20 | 4080 |
| 20 | 1260 |
| 20 | 4320 |
| 19 | 168 |
| 19 | 2520 |
| 19 | 3420 |
| 19 | 300 |
| 18 | 1980 |
| 18 | 792 |
| 18 | 3720 |
| 18 | 3096 |
| 18 | 3840 |
| 17 | 3516 |
| 17 | 960 |
| 17 | 1500 |
| 17 | 3120 |
| 17 | 3000 |
| 17 | 4356 |
| 17 | 84 |
| 16 | 2688 |
| 16 | 3540 |
| 16 | 1596 |
| 16 | 2700 |
| 16 | 3312 |
+---------+-----------+
30 rows in set (0.20 sec)[/code]And you can see the spread is pretty even.

chalsall 2014-06-03 17:59

[QUOTE=James Heinrich;374965]And you can see the spread is pretty even.[/QUOTE]

Thanks James. :smile:

And a big downside of my suggestion is that it would result in a reinforcing feedback loop.

Best let sleeping dogs lie.

LaurV 2014-06-04 03:13

Just for the sake of clarification (a side discussion, which does not need to influence the already-made benchmark): the factors do not "spread evenly" across classes. When p varies over the whole range (to 10^9 in our case), we can have, for any 921600 factors: 960 in class 0, 135 in class 1, 0 in class 2 (and in all 4k+2 classes, which will have 0 factors), 270 each in classes 3 and 4, 180 in class 5, 0 in class 6, 162 in class 7, and so on (these are real, statistical numbers from the modular calculus, not example values fabricated to "give an idea"). So some classes have a "higher probability" of containing factors, which seems fairly normal, as no factor can be even, no factor can be 3 or 5 mod 8, etc.
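
The per-class counts quoted above can be reproduced by brute force over residues. This is a sketch under the assumption that a class is "usable" for a given p when q = 2kp + 1 is 1 or 7 mod 8 and not divisible by 3, 5, 7 or 11; the helper names are invented for the example.

```python
from math import gcd

NUM_CLASSES = 4620
SMALL_PRIMES = (3, 5, 7, 11)  # odd primes dividing 4620

def candidate_allowed(k: int, p: int) -> bool:
    """Can q = 2*k*p + 1 be a prime factor of 2**p - 1, judging by residues only?"""
    q = 2 * k * p + 1
    if q % 8 not in (1, 7):
        return False
    return all(q % r != 0 for r in SMALL_PRIMES)

# Residues mod 4620 that a prime exponent p > 11 can take.
p_residues = [p for p in range(NUM_CLASSES) if gcd(p, NUM_CLASSES) == 1]

def residues_allowing(k: int) -> int:
    """Number of p residues for which class k survives the filter."""
    return sum(candidate_allowed(k, p) for p in p_residues)

for k in range(8):
    print(k, residues_allowing(k))

# Classes left to test for one fixed p (here the residue p = 1).
print(sum(candidate_allowed(k, 1) for k in range(NUM_CLASSES)))
```

Because 4 and each of the small primes divide 4620, working with residues of k and p gives the same answer as working with the actual values.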

James Heinrich 2014-06-04 03:38

[QUOTE=LaurV;375009]the factors do not "spread evenly" in classes[/QUOTE]Throwing out more numbers I mostly don't understand:[code]mysql> SELECT COUNT(*), `howmany` FROM (SELECT COUNT(*) AS `howmany`, `class4620` FROM `known_factors_006` WHERE (`factorbits` BETWEEN 68 AND 74) AND (`class4620` IS NOT NULL) GROUP BY `class4620`) AS `a` GROUP BY `howmany`;
+----------+---------+
| COUNT(*) | howmany |
+----------+---------+
| 349 | 1 |
| 505 | 2 |
| 570 | 3 |
| 473 | 4 |
| 366 | 5 |
| 295 | 6 |
| 174 | 7 |
| 166 | 8 |
| 115 | 9 |
| 91 | 10 |
| 61 | 11 |
| 52 | 12 |
| 41 | 13 |
| 28 | 14 |
| 30 | 15 |
| 9 | 16 |
| 8 | 17 |
| 5 | 18 |
| 4 | 19 |
| 5 | 20 |
| 1 | 21 |
| 2 | 22 |
| 1 | 26 |
+----------+---------+
23 rows in set (0.17 sec)[/code]Where "howmany" is the number of factors in a given class, and COUNT(*) is how many times that happens. For example, there is only one instance of a class (#1680 in this case) having 26 factors in the same class.
In short it means that the bulk of the 16359 factors in my data are spread across many different classes: 98% of the factors have less than 10 factors found in the same class. If my poor explanation makes any sense.
So perhaps not quite "spread evenly" (implying uniform distribution) but not nearly as biased as I made it look in post #2972.
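
The summary figures can be recomputed directly from the histogram as posted. A small sketch; note that "fewer than 10 factors in the same class" can be read either as a share of classes or as a share of factors, so both are printed, and neither has to match the percentage quoted above if that was derived differently.

```python
# (number_of_classes, factors_per_class) rows from the query result above.
histogram = [
    (349, 1), (505, 2), (570, 3), (473, 4), (366, 5), (295, 6), (174, 7),
    (166, 8), (115, 9), (91, 10), (61, 11), (52, 12), (41, 13), (28, 14),
    (30, 15), (9, 16), (8, 17), (5, 18), (4, 19), (5, 20), (1, 21), (2, 22),
    (1, 26),
]

total_classes = sum(n for n, _ in histogram)
total_factors = sum(n * size for n, size in histogram)

# Two readings of "fewer than 10 factors in the same class".
small_classes = sum(n for n, size in histogram if size < 10)
factors_in_small = sum(n * size for n, size in histogram if size < 10)

print(total_classes, total_factors)
print(f"classes with <10 factors: {small_classes / total_classes:.1%}")
print(f"factors in such classes:  {factors_in_small / total_factors:.1%}")
```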

axn 2014-06-04 03:58

We should expect the factors to be spread evenly. There is no mathematical reason not to. Unless ... P-1.

P-1 factors are NOT expected to fall evenly between the various classes. The upshot is that candidates that have had a (failed) P-1 should behave "unevenly" when it comes to TF factors, and the candidates that haven't had P-1 should be "even". That's the theory, anyway.

Can you do the analysis by removing the TF factors which had a prior P-1?

James Heinrich 2014-06-04 04:18

[QUOTE=axn;375015]Can you do the analysis by removing the TF factors which had a prior P-1?[/QUOTE]Not easily, no, sorry.

LaurV 2014-06-04 05:57

[QUOTE=axn;375015]We should expect the factors to be spread evenly. There is no mathematical reason not to.[/QUOTE]
Ooo... happy to contradict you for the first time in my life! :razz:
There is a big reason; it is called "modular arithmetic".
Assume you have 12 classes; then p (which is a prime) can only be 1, 5, 7, or 11 (mod 12), so let's put it in a table. I put p in the header and k in the first column. I compute q=2kp+1; if this is 3 or 9 (mod 12), it can't be prime, and if it is 3 or 5 (mod 8), it can't be a factor of a Mersenne number with odd exponent. I put a 0 in the table in both cases; otherwise I put a 1. Then I sum by columns and rows.

[CODE]
12 Classes
k\p 1 5 7 11
0 1 1 1 1 4
1 0 0 0 1 1
2 0 0 0 0 0
3 1 1 0 0 2
4 0 1 0 1 2
5 0 0 1 0 1
6 0 0 0 0 0
7 0 1 0 0 1
8 1 0 1 0 2
9 0 0 1 1 2
10 0 0 0 0 0
11 1 0 0 0 1

4 4 4 4 16
[/CODE]You see what happens: for a given p (pick any column), there are always 4 classes out of 12, i.e. 1/3, "to be tested", because the other 8 either can't generate primes or their primes are not 1 or 7 (mod 8), so they can be ignored. For 4620 classes, this number is 960 classes to be tested, which is what mfaktX does.

Switching our attention to rows instead of columns, a few rows have no "1"s and their "total" is zero. You cannot have factors in these classes, whatever prime p is. For some other classes, only ONE p in 4 can have a factor in that class. For others, half of the primes can have factors in the class, etc. Obviously, when I say "p may have factors" I refer to its Mersenne number, because p is always a prime.

The probability for a p to "have a factor" in a class is not equal across classes. Primes which are 1, 5, or 7 (mod 12) can't have factors in class 1, for example. Only primes which are 11 (mod 12) can, which is only a quarter of the primes.

I won't post the 4620-class table here, it is too big, but to give an idea, this is the table for 60 classes, which shows the phenomenon quite clearly:

[CODE]60 Classes
k\p 1 7 11 13 17 19 23 29 31 37 41 43 47 49 53 59
0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 3
3 1 0 0 1 1 0 0 0 0 1 1 0 0 0 1 0 6
4 0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 1 6
5 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 0 4
7 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 3
8 1 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 6
9 0 1 1 0 0 1 0 0 1 0 0 0 1 0 0 1 6
11 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 3
12 0 1 0 1 1 1 1 1 0 1 0 1 1 1 1 1 12
13 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 3
15 1 0 0 1 1 0 0 1 0 1 1 0 0 1 1 0 8
16 0 0 1 0 0 0 1 1 0 0 1 0 0 0 1 1 6
17 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 3
19 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 3
20 1 1 0 1 0 1 0 0 1 1 0 1 0 1 0 0 8
21 0 0 1 0 0 1 1 0 1 0 0 1 0 0 0 1 6
23 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 3
24 1 1 1 0 1 1 0 1 1 1 1 0 1 1 0 1 12
25 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1 4
27 0 0 0 1 1 0 0 1 0 1 0 0 0 1 1 0 6
28 0 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 6
29 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 3
31 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 3
32 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 0 6
33 0 1 1 0 0 0 1 0 1 0 0 1 1 0 0 0 6
35 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 4
36 1 0 1 1 0 1 1 1 1 0 1 1 0 1 1 1 12
37 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 3
39 1 0 0 0 1 0 0 1 0 1 1 0 0 1 0 0 6
40 0 0 1 0 1 0 1 1 0 0 1 0 1 0 1 1 8
41 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 3
43 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 3
44 1 1 0 0 0 1 0 0 1 1 0 0 0 1 0 0 6
45 0 1 1 0 0 1 1 0 1 0 0 1 1 0 0 1 8
47 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 3
48 1 1 1 1 1 0 1 0 1 1 1 1 1 0 1 0 12
49 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 1 3
51 1 0 0 1 0 0 0 1 0 0 1 0 0 1 1 0 6
52 0 0 0 0 1 0 1 1 0 0 0 0 1 0 1 1 6
53 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 3
55 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 4
56 1 0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 6
57 0 1 0 0 0 1 1 0 0 0 0 1 1 0 0 1 6
59 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 3

16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 256
[/CODE]
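
Tables like the two above can be regenerated for any class count that is a multiple of 4. The following is a sketch of the rule as described in this post (q = 2kp + 1 must be 1 or 7 mod 8 and not divisible by any odd prime dividing the class count); the function names are invented here.

```python
from math import gcd

def odd_prime_divisors(n: int) -> list:
    """Odd primes dividing n, found by trial division."""
    primes, r = [], 3
    while n % 2 == 0:
        n //= 2
    while n > 1:
        if n % r == 0:
            primes.append(r)
            while n % r == 0:
                n //= r
        r += 2
    return primes

def class_table(num_classes: int):
    """Return (p_residues, rows); rows[k][j] is 1 if class k is usable for p_residues[j]."""
    if num_classes % 4:
        raise ValueError("num_classes must be a multiple of 4 so the mod-8 test is well defined")
    small = odd_prime_divisors(num_classes)
    p_residues = [p for p in range(1, num_classes) if gcd(p, num_classes) == 1]
    rows = []
    for k in range(num_classes):
        row = []
        for p in p_residues:
            q = 2 * k * p + 1
            ok = q % 8 in (1, 7) and all(q % r for r in small)
            row.append(int(ok))
        rows.append(row)
    return p_residues, rows

p_residues, rows = class_table(12)
print("k\\p", *p_residues)
for k, row in enumerate(rows):
    print(k, *row, sum(row))
```

Calling `class_table(60)` gives the second table, including the all-zero 4k+2 rows that were left out of the post, and `class_table(4620)` gives the full one that is too big to paste.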

axn 2014-06-04 08:18

[QUOTE=LaurV;375020]Ooo... happy to contradict you for the first time in my life! :razz:[/QUOTE]

There is no contradiction. Only confusion. Apparently, everyone has their own way of enumerating the classes. I was going by f, you're going by k, and God only knows what James's data is going by. As per your table, class 0 should outnumber everything, but it is class 1 as per James's data.

Regardless, all the classes that are used for TF should have equal probability of a factor (actually, earlier class should have slightly better chance, since we stop factoring as soon as we find a factor). So there is no difference in changing the order in which classes are TF'ed (chalsall's idea).

ET_ 2014-06-04 09:20

[QUOTE=axn;375022]There is no contradiction. Only confusion. Apparently, everyone has their own way of enumerating the classes. I was going by f, you're going by k, and god only know what James's data is going by. As per your table, class 0 should out number everything, but it is class 1 as per James's data.

Regardless, all the classes that are used for TF should have equal probability of a factor (actually, earlier class should have slightly better chance, since we stop factoring as soon as we find a factor). So there is no difference in changing the order in which classes are TF'ed (chalsall's idea).[/QUOTE]

Can I assume that ALL TF done with mfaktc was completed? Or may there be cases where TF was interrupted right after finding a factor, without completing the bit level?

Luigi

LaurV 2014-06-04 11:38

There is only one way to enumerate "classes", and that is "by k". The term was coined by Oliver long ago, when he first wrote mfaktc. The "960 classes", "4620 classes", etc., are related to k. Otherwise how do you explain "even" classes? Because q (or f, as axn calls it) is always an odd prime and 1 or 7 mod 8... The "classes" here are what mfaktX uses, and they refer to k modulo 4620 (or 420, for the "less classes" version). We are interested in what values k can take once p is given; those are the "candidate classes", and those are what we are sieving, powering, blah blah.

James Heinrich 2014-06-04 13:53

[QUOTE=ET_;375024]Can I assue that ALL TF done with mfaktc was completed? Or there may be cases where TF has been interrupted right after finding a factor, without completing the bit level?[/QUOTE]From my data you can assume nothing. It [i]might[/i] have had the bitlevel completed, it [i]might[/i] have aborted after finding the first factor. Remember also that I make no claim that all these factors [i]were[/i] found by mfaktx, only that they're in the range of interest to us and they [i]could have[/i] been found by mfaktx.

chalsall 2014-06-04 14:04

[QUOTE=ET_;375024]Can I assue that ALL TF done with mfaktc was completed? Or there may be cases where TF has been interrupted right after finding a factor, without completing the bit level?[/QUOTE]

The latter is what most people do.

James Heinrich 2014-06-04 14:42

[QUOTE=chalsall;375034]The latter is what most people do.[/QUOTE]Arguably more efficient (in terms of exponents cleared), and yet [i]not[/i] the mfaktc default (I believe it was the mfakto default, last time I checked).

Bdot 2014-06-05 16:39

[QUOTE=James Heinrich;375037]Arguably more efficient (in terms of exponents cleared), and yet [I]not[/I] the mfaktc default (I believe it was the mfakto default, last time I checked).[/QUOTE]
That's correct.


Regarding the performance measurement to be built into mfaktX, Oliver and I were thinking about this:

Implement a test routine that runs a particular kernel with a fixed exponent (Oliver's favorite 65M one) and a certain factor candidate size [B]for a fixed amount of time[/B] (e.g. 3 seconds) and see how far it gets.

This routine can be used to loop over kernels and FC sizes. Since performance also depends on the exponent size, I'd allow that to vary as well.
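
The shape of such a fixed-time test can be sketched as follows. This is not mfakto code: the workload is a hypothetical CPU stand-in (one modular exponentiation per call), the exponent is a placeholder rather than the actual 65M exponent mentioned above, and the short time budget is only there to keep the example quick.

```python
import time

def timed_throughput(work, budget_seconds: float = 3.0):
    """Call `work()` repeatedly until the time budget is used up.

    Returns (iterations, elapsed_seconds, iterations_per_second).
    """
    start = time.perf_counter()
    deadline = start + budget_seconds
    iterations = 0
    while time.perf_counter() < deadline:
        work()
        iterations += 1
    elapsed = time.perf_counter() - start
    return iterations, elapsed, iterations / elapsed

def make_workload(exponent: int, bits: int):
    """Hypothetical stand-in for one kernel launch at a given candidate size."""
    modulus = (1 << bits) - 1  # any odd modulus of the right size will do here
    return lambda: pow(2, exponent, modulus)

# Loop over candidate sizes with a fixed (placeholder) exponent.
for bits in (64, 72, 80):
    n, elapsed, rate = timed_throughput(make_workload(65_000_003, bits), budget_seconds=0.05)
    print(f"{bits} bits: {n} iterations in {elapsed:.3f}s ({rate:.0f}/s)")
```

Fixing the time rather than the amount of work keeps the total benchmark duration predictable across fast and slow cards, which is presumably the appeal of the approach.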

I will extend the --perftest option of mfakto to do the test as James had suggested (~two pages earlier). Similar to the sieving performance test that I already have, it will read exponents and FC sizes to test from the ini file. I need to figure out how to properly combine these loads of generated data, though.

Example of how it is today:
[code]
inifile:
TestSievePrimes=19890,30738,47503,73411,99999,113449,122222,175323,270944,418716,647083,1075766
TestGPUSieveSizes=4,7,12,16,20,36,48,96,101,102,104,120,121,123,124,125,126,127,128

mfakto --perftest:
...
4. GPU sieve, 10 iterations each

gpusieve_init: 8.685000 ms (CPU work)
gpusieve_init_exponent: 1.841100 ms (CalcModularInverses)
gpusieve_init_class: 0.355800 ms (CalcBitToClear)
gpusieve: 4.861600 ms (SegSieve)
tf: 39.414000 ms = 3352.123002 M/s (raw rate, cl_barrett15_69_gs)

GPU sieve raw rate (input rate M/s)
SievePrimes: 19890 30738 47503 73411 99999 113449 122222 175323 270944 418716 647083 1075766
GPUSieveSize
4 MBit 14108.0 11370.3 6756.6 6382.9 4859.2 4398.7 4139.5 3137.5 2187.4 1434.3 913.2 457.7
7 MBit 16732.6 15099.3 12317.8 9777.7 7759.9 7098.8 6768.9 5119.1 3674.5 2427.2 1519.5 763.7
12 MBit 20467.2 18618.0 15609.4 12286.1 9884.3 9147.8 8664.4 6530.8 4511.1 3056.0 1970.5 1052.2
16 MBit 23049.3 20749.9 17868.8 14529.6 11471.9 10468.8 9777.1 7412.3 5195.1 3472.1 2262.1 1216.6
20 MBit 24060.3 22617.2 18829.5 15106.1 12188.3 11205.6 10609.4 8223.9 5822.3 3908.1 2536.7 1329.9
36 MBit 28627.2 26998.6 23338.9 18942.3 15389.7 14035.3 13213.7 10023.2 6860.0 4540.2 2698.0 1165.2
48 MBit 30928.9 29083.8 25581.4 21601.5 17754.1 15811.9 13833.5 10492.9 7241.4 4719.9 2582.8 1199.7
96 MBit 34094.0 32564.6 29625.9 23068.5 18816.0 16736.3 15876.7 11714.5 7470.0 4655.9 1841.1 309.5
101 MBit 34552.9 32000.9 30054.5 23895.0 18857.0 17027.3 16032.5 11663.9 7357.6 4761.7 1646.6 308.4
102 MBit 33476.8 32197.7 29504.7 23519.2 18718.2 16951.5 16002.2 11640.5 7404.7 4485.3 1898.6 269.2
104 MBit 34359.6 33008.2 29945.4 23970.6 19012.3 17537.0 16205.4 11688.5 7431.7 4642.9 1935.0 288.1
120 MBit 34437.5 33495.4 31195.2 24505.2 19296.7 17514.3 16294.4 11727.8 7335.6 4586.1 1262.3 231.2
121 MBit 35271.0 33778.4 30574.0 24273.9 19359.2 17400.3 16475.6 11664.5 7448.3 4485.8 1306.4 223.6
123 MBit 35108.7 33998.7 30944.4 24445.0 19096.9 17458.2 16578.0 11906.2 7494.3 4651.0 1175.9 226.3
124 MBit 35551.9 33176.6 31075.1 24321.1 19549.2 17519.3 16553.2 11973.6 7502.1 4430.9 1719.0 216.8
125 MBit 34252.3 33766.1 31011.4 24501.4 19032.9 17361.4 16426.7 11899.2 7382.4 4451.5 1240.7 239.7
126 MBit 35792.7 34077.2 31133.6 24821.3 19502.1 17596.9 16568.5 12003.7 7362.5 4532.9 948.2 244.3
127 MBit 34532.6 32777.0 30255.3 24371.9 19279.2 17731.3 16766.9 11837.3 7423.6 4535.6 1480.0 232.5
128 MBit 33707.9 33267.3 31661.8 24512.2 19356.4 17596.0 16445.3 11965.3 7430.9 4559.3 923.9 245.5

Best GPUSieveSize for
SievePrimes: 19766 31030 47414 74038 99894 113206 122422 175670 270902 419382 647734 1075766
at MiB: 126 126 128 126 124 127 127 126 124 101 36 20
max M/s: 35792.7 34077.2 31661.8 24821.3 19549.2 17731.3 16766.9 12003.7 7502.1 4761.7 2698.0 1329.9
Survivors: 22.18% 21.32% 20.57% 19.84% 19.38% 19.19% 19.08% 18.56% 17.98% 17.44% 16.94% 16.41%
removal rate
average: 27855.6 26812.1 25147.4 19895.9 15760.7 14328.4 13568.5 9776.1 6153.0 3931.2 2240.9 1111.7
incremental: n/a 6083.2 3326.5 840.4 427.5 357.1 356.7 219.0 115.1 70.6 31.0 14.0
[/code]Another, slightly related idea that I will implement: Allow for dynamic reconfiguration of mfakto during runtime: (GPU)SievePrimes, SieveSize, maybe even select the kernel to be used. This way it would be easier to find the optimal values for a certain task.

chalsall 2014-06-10 15:02

"Ownership" of GPU72 assigned candidates...
 
This has been discussed before, but just to reiterate...

A GPU'er emailed me last night concerned that the assignments he was getting from GPU72 weren't "owned" by "gpufactor" ("GPU72") according to Primenet, but instead "owned by someone else".

So everyone knows, the only candidates assigned by GPU72 for TF'ing or P-1'ing will be owned by either "gpufactor" ("GPU72") or "wabbit" ("For Research"). GPU72 is not doing any "poaching" nor parallel processing -- the latter account is simply being used to collect recycled candidates from Primenet as they expire. "wabbit's" holdings will appear as either LL or P-1 assignments on Primenet, instead of TF.

Mark Rose 2014-06-17 15:24

[QUOTE=NickOfTime;373874]Just a few gpu's :-) 2 AMD 290x, amd 260x, gtx 690, gtx 660ti, gtx 560ti[/QUOTE]

And yesterday I put a spare GTX 520 into service on my work desktop for 28.8 GHz-d/day. I need better cards lol

Mark Rose 2014-06-17 16:13

[QUOTE=Xyzzy;373606]If anyone is successful in getting CUDA running on Ubuntu 12.04.4 or 14.04 please let us know.

[url=http://www.mersenneforum.org/showpost.php?p=365319&postcount=2301]We had it working via the package manager on 12.04.1[/url] but it is horribly broken now.

:help:[/QUOTE]

On a clean 14.04 install I've gotten it to work by installing the nvidia-331, nvidia-cuda-dev, and nvidia-cuda-toolkit packages. That should give you CUDA 6 in the driver and CUDA 5.5 for development. Compiling mfaktc then works.

I've gotten bumblebee to work as well, but I had to modify /etc/bumblebee/bumblebee.conf by making sure the following is set (when using nvidia-331):
[code]
## Section with nvidia driver specific options, only parsed if Driver=nvidia-331
[driver-nvidia-331]
# Module name to load, defaults to Driver if empty or unset
KernelDriver=nvidia-331
PMMethod=none
# colon-separated path to the nvidia libraries
LibraryPath=/usr/lib/nvidia-331:/usr/lib32/nvidia-current
# comma-separated path of the directory containing nvidia_drv.so and the
# default Xorg modules path
XorgModulePath=/usr/lib/nvidia-331/xorg,/usr/lib/xorg/modules
XorgConfFile=/etc/bumblebee/xorg.conf.nvidia
[/code]

That allows me to use the on-board Intel video for doing work on the machine while the Nvidia card factors away. Just run `cd mfaktc ; optirun ./mfaktc.exe -d 0`. I haven't figured out how to get bumblebee to work with 2+ Nvidia cards.

James Heinrich 2014-06-18 18:02

Random minor observation: When you request assignments, it presents you a list of assignments and tells you to "cut and paste", but since the assignments aren't in an editable HTML element you can [i]copy[/i], but not actually [i]cut[/i]... Perhaps it should say "copy and paste" instead?

Pedantic, but what do you expect from programmers? :cmd:

chalsall 2014-06-18 18:28

[QUOTE=James Heinrich;376154]Pedantic, but what do you expect from programmers? :cmd:[/QUOTE]

ROFL... :smile:

Good point. Language can be so very important...

Unfortunately, changing that single word would involve a bit of work since it is in the "translation database". I would have to add a new (more correct) phrase, update four Perl scripts, and then the translators would have to submit translations. Thus, I'm going to leave it as it is...

If it helps at all, your reported bug about the dates being rendered badly [URL="https://www.gpu72.com/reports/worker/56f1b7572536a14513b08c88b2ba9578/"]on the individual overall workers' productivity report[/URL] was fixed a few days ago... (I never, ever, imagined this sub-project would exist for so long!)

James Heinrich 2014-06-18 18:31

[QUOTE=chalsall;376155]your reported bug about the dates being rendered badly [URL="https://www.gpu72.com/reports/worker/56f1b7572536a14513b08c88b2ba9578/"]on the individual overall workers' productivity report[/URL] was fixed a few days ago...[/QUOTE]Yay! Thanks :smile:

TheMawn 2014-06-19 00:42

[QUOTE=chalsall;376155](I never, ever, imagined this sub-project would exist for so long!)[/QUOTE]

Whaaaaaat? Involving GPU's is a stroke of genius, and so is putting together a system to make it user friendly.

chalsall 2014-06-19 01:33

[QUOTE=TheMawn;376171]Whaaaaaat? Involving GPU's is a stroke of genius, and so is putting together a system to make it user friendly.[/QUOTE]

Very kind words. Thank you.

But "The Judger" and "Bdot" were really the ones who brought GPU'ing into this space. And "Mr. P-1" used to coordinate this kind of work manually.

I simply came in afterwards, and stood on the shoulders of giants....

TheMawn 2014-06-19 08:20

[QUOTE=chalsall;376173]and stood on the shoulders of giants....[/QUOTE]

Reminds me of the Epic Rap Battle between Stephen Hawking and Albert Einstein. Hawking says he's going to "drop mad apples on his head from the shoulders of giants" and Einstein's comeback is, well... Just see for yourself.

[url]http://www.youtube.com/watch?v=zn7-fVtT16k[/url]

chalsall 2014-07-05 19:03

For those using the GPU72 Proxy for P-1 work...
 
Just a heads up...

I don't know how long Primenet is going to be offline (hopefully not long), but if you use the GPU72 Proxy to get P-1 work and have your machines collect only a small number of assignments, please be aware that, because of the way Prime95/mprime works, they won't get additional assignments from GPU72 until Primenet is back up.

The reason is that the clients send "Update Computer" messages about once a day, which the Proxy passes on to Primenet for fulfillment. Since Primenet is currently returning an error message, that error is sent back to the client, which then doesn't communicate further.

Thus, it would be advisable to manually get a few P-1 assignments from GPU72 to ensure your P-1 workers are "fed".

Unfortunately, DC and LL assignments are no longer available from GPU72 manually, so such workers might "go hungry" until Primenet comes back online.

Edit: Never mind... Immediately after I posted this Primenet came back online! Thanks George and Scott!!! :smile:

chalsall 2014-07-29 16:42

Taking LL Cat1 and Cat2 candidates to 74...
 
Just a heads up...

As we're now finally "ahead of the wave" for Cat4, and well ahead for Cat3, I thought it would be worthwhile to start bringing the current Cat1 and Cat2 candidates up to 74 since we have the fire power, and James' analysis suggests this [URL="http://www.mersenne.ca/cudalucas.php?model=12"]Makes Sense [SUP][SUP]TM[/SUP][/SUP][/URL] for LLTF'ing.

Currently this is only available by way of the automated assignment spiders (and only for those who choose "Let GPU72 Decide"), as it takes me only a few minutes to implement same. This weekend I will make this work-type available via the GPU72 manual assignment page.

For anyone who doesn't want to do such low and deep work, simply change your settings from LGD to WMS, or any other setting. Those who run [URL="http://www.mersenne.ca/cudalucas.php?model=564"]Titans[/URL] might want to consider doing so, as the optimal curves cross at lower levels. The same goes for those who run ATI/AMD kit.

And, please, let me know if anyone thinks doing this is sub-optimal.

James Heinrich 2014-07-29 16:58

Perfect timing, I'm just finishing the last 25 of 1000 assignments I picked up, so I can get back to my [url=http://www.mersenne.ca/tf1G.php?available_assignments=1]pet project[/url]. So you'll be losing whatever limited firepower I've given over the last couple months, sorry.


All times are UTC.
