mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU to 72 (https://www.mersenneforum.org/forumdisplay.php?f=95)
-   -   RIP DCTF. (https://www.mersenneforum.org/showthread.php?t=19897)

LaurV 2015-12-19 06:24

[QUOTE=flagrantflowers;419627]Is continuing this project a good idea? I think that the point has been raised that a number of older cards stop being efficient at higher bits. [/QUOTE]
This is true, but there is enough work for everybody for the next 10 years or more, factoring higher exponents to the current bitlevels (in that time we will have better hardware anyhow, but that is a different story). But for your (general you) peace of mind, it is not like some hardware will have nothing to do after DCTF is finished. There is a lot of work to do "in front of the LL front", for example. Factoring everything in [URL="http://www.mersenne.ca/status/tf/1/1/0"]this table[/URL] to 72-73 bits or so will still need a lot of time and resources... And as participation shifts toward other projects too, with the "distributed computing world" diversifying, we may not see it done during our lifetime...

airsquirrels 2015-12-19 14:58

Everyone is entitled to different views, interests and opinions. To me this sounds like going around town breaking windows to keep the glaziers employed.

The goal is to find more primes; the best way to do that is to have everyone searching for primes or eliminating non-primes. Right now there are various workloads and efficiency questions that mean efforts are divided across many fronts. My hope is that by finishing off DCTF (and maybe even catching DC up a bit) those efforts will move towards LL tests and we will find another prime faster.

In an ideal world I would just say we shouldn't do DCTF until just in time, especially since hardware keeps getting faster and by the time the DC front actually catches up it would take significantly less power/electricity/time to check those ranges. Unfortunately people disagree and put lots of resources into DCTF. My response is 'ok, if we're going to do this then let's just put the power towards it and get it done so we can all get back to the real work.' Plus everyone likes to experience completing a task and celebrating that milestone. I agree with an earlier post that GPU72 makes celebrating achievable milestones much more exciting than mersenne.org.

My personal view is DC itself is not an efficient use of resources, and the current DC process will not scale up to bigger and bigger exponents. We will need incremental DC alongside LL, or other ways to increase our confidence in LL results without just doing the work twice, or too many resources will be wasted.

chalsall 2015-12-19 15:57

[QUOTE=flagrantflowers;419627]Is continuing this project a good idea? I think that the point has been raised that a number of older cards stop being efficient at higher bits.[/QUOTE]

It's not older cards so much as different classes of cards. AMD's GPUs tend to fall off at higher bit levels compared to NVidia's GPUs, for example. And then within the NVidia offerings, each "Compute version" affects the "optimal TF'ing level". See James' excellent analysis [URL="http://www.mersenne.ca/cudalucas.php?model=12"]here[/URL].

And while I agree that completing DCTF'ing doesn't really make sense at the moment (we're YEARS ahead of the DC'ing waves) some want to do this, and I'm just a facilitator. I'm hopeful that when the DCTF'ing is completed most of the resources will then move to LLTF'ing (or GPUP-1'ing, DC'ing or LL'ing).

Keep in mind that when DCTF is completed, cards which are more efficient at lower bit levels still have LLTF 71 to 72 work available; I'll ensure some of that continues to be available (in the higher ranges) for just such cards.

Doing LLTF'ing "to release" (read: to 75 bits at the moment) is the most important, but all other work needs to be done as well, and is useful and helpful.

petrw1 2015-12-29 19:05

Done to 50M
 
YAY

Mark Rose 2015-12-29 19:46

[QUOTE=petrw1;420443]YAY[/QUOTE]

It's tempting to poach those 8 remaining assignments below 73 bits lol

airsquirrels 2015-12-29 19:47

They belong to bdot. They are expected soon

airsquirrels 2015-12-30 14:15

[QUOTE=Mark Rose;420444]It's tempting to poach those 8 remaining assignments below 73 bits lol[/QUOTE]

Now we're just waiting on you!

One left at 70 bit/50M and we will be clear to 72 bit for all DCTF.

Mark Rose 2015-12-30 16:09

[QUOTE=airsquirrels;420522]Now we're just waiting on you!

One left at 70 bit/50M and we will be clear to 72 bit for all DCTF.[/QUOTE]

LOL. That was captured by my slowest machine, minutes after I wrote that post. I didn't notice. I've moved it to a different machine and it will be done in a few hours instead of a day.

airsquirrels 2015-12-30 20:51

Everything less than 72 is now at 72!

petrw1 2016-01-04 16:46

Update #20 Jan 4, 2016
 
1,183,694 GhzDays in the last month. (BIG SURGE!!!!)
22 different contributors
957 Factors found
92,511 P1/LL/DC work saved

18 contributors currently have assignments
16,746 Assignments out.

41 estimated days to completion (BIG DROP!!!)
Just over 1 month .. February 14, 2016 (DC Love is in the air)
(Seems it may finish while I'm away...and I'll miss the party :cry: )

MILESTONES
Complete to 50M.
Complete to 72 Bits.

airsquirrels 2016-01-04 17:46

Might be even sooner, the patch to mfakto to eliminate PCIe bottlenecks was significantly more beneficial for my systems than expected, nearly a 30% gain.

[url]http://www.gpu72.com/reports/worker_exact/fc17ac967304842d91106208c22430de/[/url]

petrw1 2016-01-22 19:03

Update #21 Jan 22, 2016
 
A little early but I am off soon to the Land Down Under until March.

933,295 GhzDays in the last month. (Waiting for the big DROP from ANONYMOUS)
21 different contributors
583 Factors found
59,477 P1/LL/DC work saved

19 contributors currently have assignments
31,561 Assignments out.

39 estimated days to completion (a little slowdown but still going great guns)
Just over 1 month .. Right around March 1, 2016
(Maybe I will be back for the funeral)

airsquirrels 2016-01-22 19:07

We've slowed down DCTF work quite a bit trying to keep the bins full for all the LL demand following the M49 press. Hopefully in a week or two we will be able to return to killing DCTF

chalsall 2016-01-22 19:35

[QUOTE=airsquirrels;423566]We've slowed down DCTF work quite a bit trying to keep the bins full for all the LL demand following the M49 press. Hopefully in a week or two we will be able to return to killing DCTF[/QUOTE]

Yeah, thanks for bringing your supercomputer over to LLTF! It helped _a lot_! :smile:

petrw1 2016-01-22 21:44

2 hours later and ANONYMOUS checks in 140K
 
...
[QUOTE=petrw1;423564]A little early but I am off soon to the Land Down Under until March.

1,042,341 [STRIKE]933,295[/STRIKE] GhzDays in the last month.
21 different contributors
614 [STRIKE]583[/STRIKE] Factors found
63,763 [STRIKE]59,477[/STRIKE] P1/LL/DC work saved

19 contributors currently have assignments
33,959 [STRIKE]31,561[/STRIKE] Assignments out.

31 [STRIKE]39[/STRIKE] estimated days to completion (a little slowdown but still going great guns)
1 month .. February 20, 2016 [STRIKE]March 1, 2016[/STRIKE]
(Maybe I won't [STRIKE]will[/STRIKE] be back for the funeral)[/QUOTE]

ATH 2016-01-29 08:01

Has all DCTF work been assigned now? The "Get DCTF" page links to the "Get LLTF" page.

chalsall 2016-01-29 13:59

[QUOTE=ATH;424517]Has all DCTF work been assigned now? The "Get DCTF" page links to the "Get LLTF" page.[/QUOTE]

Yes, all DCTF candidates have now been assigned (unless we decide to take 60M to 63M up to 74 bits). This was mentioned on the GPU72 Status thread. Sorry, I should have mentioned it here as well.

The reason the "Get DCTF" page links to the "Get LLTF" page was to ensure that any automatic fetching spiders didn't run out of work.

airsquirrels 2016-01-29 20:10

[QUOTE=chalsall;424548]...Unless we decide to take 60M to 63M up to 74 bits...[/QUOTE]

It seems likely that those 24k assignments should get assigned, since the need to move them to 74 bits isn't going to just disappear; however, I'm not anxious to include them in the initial DCTF burst, since it is already moving the goalposts from where we started. They are also well beyond any active DCTF work.

I also wonder if those levels are still the desired transition point for the average card in the GPU72 fleet. Was it not the case that most of those level transition points were made for older cards?

Mark Rose 2016-01-29 20:40

[QUOTE=airsquirrels;424584]It seems likely that those 24k assignments should get assigned, since the need to move them to 74 bits isn't going to just disappear; however, I'm not anxious to include them in the initial DCTF burst, since it is already moving the goalposts from where we started. They are also well beyond any active DCTF work.

I also wonder if those levels are still the desired transition point for the average card in the GPU72 fleet. Was it not the case that most of those level transition points were made for older cards?[/QUOTE]

Well, for my GTX 580s, it makes no sense. I should only take DCTF to 74 above 66M. Last year I went ahead and did all the DCTF that makes sense for a GTX 580 between 60M and 105M, including exponents assigned as first-time LL with not enough TF for DC. So unless PrimeNet runs out of available exponents, DCTF should be done for a GTX 580, barring some algorithmic improvement in TF throughput.

The clients don't report what cards are being used, so it's hard to say what the average card in the fleet is. I do believe that the GTX 580 was used as the basis for the cut over, at least comparing what I see on GPU72 versus James' graphs at mersenne.ca. When Chris expanded the current levels chart I remember him saying he didn't adjust the colours of the cells.

I gather that the AMD cards are even better at TF relative to LL than Nvidia cards, so it makes sense to take lower exponents to a higher bit level (keeping in mind that Nvidia does better at higher bit levels).

It would be nice if there were a way for a card to indicate its relative abilities or model to GPU72, which could then make a better decision as to what the card should do when letting GPU72 decide. However, with the bulk of the work being exponents needing TF to 75 bits, if a pile of AMD cards are doing TF, they're going to end up doing that work anyway.

dragonbud20 2016-01-29 23:56

[QUOTE=Mark Rose;424591]
The clients don't report what cards are being used, so it's hard to say what the average card in the fleet is. I do believe that the GTX 580 was used as the basis for the cut over, [/QUOTE]

I wonder if it would be possible to add a section to your GPU72 profile, or maybe your PrimeNet one, to list what graphics cards you trial factor with. It would help to at least figure out what cards people generally use. Do any of the guys in charge know if this could be done?

airsquirrels 2016-01-30 00:59

[QUOTE=Mark Rose;424591]Well, for my GTX 580s, it makes no sense. I should only take DCTF to 74 above 66M. Last year I went ahead and did all the DCTF that makes sense for a GTX 580 between 60M and 105M, including exponents assigned as first-time LL with not enough TF for DC. So unless PrimeNet runs out of available exponents, DCTF should be done for a GTX 580, barring some algorithmic improvement in TF throughput.

The clients don't report what cards are being used, so it's hard to say what the average card in the fleet is. I do believe that the GTX 580 was used as the basis for the cut over, at least comparing what I see on GPU72 versus James' graphs at mersenne.ca. When Chris expanded the current levels chart I remember him saying he didn't adjust the colours of the cells.

I gather that the AMD cards are even better at TF relative to LL than Nvidia cards, so it makes sense to take lower exponents to a higher bit level (keeping in mind that Nvidia does better at higher bit levels).

It would be nice if there were a way for a card to indicate its relative abilities or model to GPU72, which could then make a better decision as to what the card should do when letting GPU72 decide. However, with the bulk of the work being exponents needing TF to 75 bits, if a pile of AMD cards are doing TF, they're going to end up doing that work anyway.[/QUOTE]

I am curious why the AMD cards have this performance cutoff vs. the NVIDIA cards. I know we jump to a different kernel because we need more bits, but that seems like something that happens on both cards/architecture. Has the mfaktc code just had more tuning for high-bit kernels? Or conversely was more effort in mfakto code just put into low bit kernels? It hardly seems like an actual hardware difference.

Similarly, in the time since the 580 cut-off point analysis was done, mfakto, mfaktc, and clLucas (and probably CUDALucas), as well as the AMD and NVIDIA drivers, have all changed substantially. At best the cutoff point is a very rough estimate.

Once we are sufficiently ahead of the CPU DC and CPU LL wavefronts with TF, will most of us actually turn those cards to LL testing? Or will we just trial factor to infinity (and beyond)?

Mark Rose 2016-01-30 01:12

[QUOTE=airsquirrels;424619]I am curious why the AMD cards have this performance cutoff vs. the NVIDIA cards. I know we jump to a different kernel because we need more bits, but that seems like something that happens on both cards/architecture. Has the mfaktc code just had more tuning for high-bit kernels? Or conversely was more effort in mfakto code just put into low bit kernels? It hardly seems like an actual hardware difference.[/quote]

I don't know enough to say.

[quote]
Similarly, in the time since the 580 cut-off point analysis was done, mfakto, mfaktc, and clLucas (and probably CUDALucas), as well as the AMD and NVIDIA drivers, have all changed substantially. At best the cutoff point is a very rough estimate. [/quote]

I do know that James periodically updates his tables when new versions are released. For instance, mfaktc 0.21 gave about a 1.5% increase over 0.20.

[quote]Once we are sufficiently ahead of the CPU DC and CPU LL wavefronts with TF, will most of us actually turn those cards to LL testing? Or will we just trial factor to infinity (and beyond)?[/QUOTE]

I haven't given thought as to what I will do. Until a few months ago when the surge started, I had a goal of finishing DCTF that I expected to last a couple more years lol

airsquirrels 2016-01-30 01:32

[QUOTE=Mark Rose;424620] I haven't given thought as to what I will do. Until a few months ago when the surge started, I had a goal of finishing DCTF that I expected to last a couple more years lol[/QUOTE]

As long as there is TF work to do that makes sense in the LL or 100M range I intend to put both HW and hopefully mental/code effort into that work. At some point when we are far enough ahead I hope to put all my iron into DC-LL until I'm reasonably satisfied in its stability for 100M testing. Currently all my Titan-era NVIDIA cards are working DC-LL.

My profiling on mfakto so far has shown it to be pretty effectively loading the VALUs, and not memory bound/stalled at all after my change to on-card memory, which makes sense. Kernel/workgroup scheduling is quite reasonable, with long-running kernels and low overhead for most active TF ranges. 90+% of the work is in the actual factoring kernel, and the sieve efficiency isn't really worth tuning. George's suggestion of looking at hand-tuned ISA that makes use of the hardware carry flag and add+carry instructions to reduce instruction count is the only big place I see for easy improvement, and that's not the big percentage of the VALU instructions. The SALU is mostly sitting idle, but I don't see a worthwhile use of it with the current architecture. In summary, I can't see much headroom for more than a few more percentage points of improvement for mfakto. Since they are so similar in structure, I believe mfaktc will be similar.

clLucas is an entirely different story, with pipeline stalls, cache misses, and extremely excessive kernel queuing and the associated overhead. This isn't msft's fault so much as the fact that clFFT was not really designed for this kind of work (EDIT: I'm referring to being called over and over again in a tight loop with other kernels running before and after.) I am fairly confident at this stage that at least a 2x improvement in performance is possible, possibly more. That would bring Fury X LL test performance to better than the 32 core Xeon V3 configurations that currently have the speed record. At that point clearing an exponent per-day on a Fury is probably better than TF work.

Mark Rose 2016-01-30 01:43

[QUOTE=airsquirrels;424625]Summary, I can't see much headroom for more than a few more percentage points of improvement for mfakto. Since they are so similar in structure I believe mfaktc will be similar.[/quote]

I do know that Oliver keeps tweaking mfaktc to eke out small percentage increases.

[quote]clLucas is an entirely different story, with pipeline stalls, cache misses, and extremely excessive kernel queuing and the associated overhead. This isn't msft's fault so much as the fact that clFFT was not really designed for this kind of work (EDIT: I'm referring to being called over and over again in a tight loop with other kernels running before and after.) I am fairly confident at this stage that at least a 2x improvement in performance is possible, possibly more. That would bring Fury X LL test performance to better than the 32 core Xeon V3 configurations that currently have the speed record. At that point clearing an exponent per-day on a Fury is probably better than TF work.[/QUOTE]

Have you looked at cudalucas? Merely curious what your observations are.

airsquirrels 2016-01-30 01:54

[QUOTE=Mark Rose;424627]I do know that Oliver keeps tweaking mfaktc to eke out small percentage increases. [/QUOTE]

This is exactly why there is not much headroom left. Very capable hands have already been working on these problems, all I can lend is some more hours.

[QUOTE=Mark Rose;424627]
Have you looked at cudalucas? Merely curious what your observations are.[/QUOTE]

I have, though I am not as well versed in the CUDA side of things. I have also never seen source published for cuFFT (clFFT is open source), so the ability to tweak things to our specific application and avoid queuing several kernels per iteration is either not there or not as easy (for me) to do. The comments in the CUDALucas code suggest quite a bit of time was spent profiling and tuning the current implementation. My focus with CUDALucas has been on making it run more effectively on dual-GPU cards such as the Titan Zs I have, which is a bit of a niche use case that sprung out of wanting to do a faster GPU validation run for M49*.

ATH 2016-01-30 02:09

[QUOTE=airsquirrels;424625] I am fairly confident at this stage that at least a 2x improvement in performance is possible, possibly more. That would bring Fury X LL test performance to better than the 32 core Xeon V3 configurations that currently have the speed record. At that point clearing an exponent per-day on a Fury is probably better than TF work.[/QUOTE]

Even with DP performance of 1/16th of the SP performance it could still possibly beat a Xeon?

airsquirrels 2016-01-30 02:21

[QUOTE=ATH;424631]Even with DP performance of 1/16th of the SP performance it could still possibly beat a Xeon?[/QUOTE]

The theoretical peak DP for the 2698v3 is around 600 GFLOPS, so 1.2 TFLOPS between the two of them. A FirePro W9100 is 2.6 TFLOPS DP, so it's good for more than twice the potential power of the Xeon system.

The Fury X has 537 GFLOPS DP, which is similar to one of the Xeons; however, the memory bandwidth is the kicker: 500 GB/s on the Fury vs. 48 GB/s or so per Xeon.


These are all theoretical numbers, but it does give an idea of how the two compare.
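
Those ratios can be sanity-checked with a quick sketch using only the theoretical peak numbers quoted above (vendor peaks, not realized throughput):

```python
# Theoretical peak DP throughput (GFLOPS) and memory bandwidth (GB/s)
# as quoted in the post above; all figures are vendor theoretical peaks.
dual_xeon_gflops = 2 * 600    # two E5-2698 v3, ~600 GFLOPS DP each
per_xeon_gbps    = 48         # ~48 GB/s per socket
w9100_gflops     = 2600       # FirePro W9100
fury_x_gflops    = 537        # Fury X (1/16-rate DP)
fury_x_gbps      = 500        # Fury X HBM

print(w9100_gflops / dual_xeon_gflops)   # ~2.17x the dual-Xeon DP peak
print(fury_x_gbps / per_xeon_gbps)       # ~10.4x the per-Xeon bandwidth
```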

airsquirrels 2016-01-30 02:50

Replying to myself, because I wanted to run the numbers.

Estimating what we need to verify M49 in 24 hours:

1.16ms/ iteration

A complex FFT of 4M points requires about 5 * N log2(N) DP FLOPs.
We need that twice, plus about 4*N operations for pointwise multiplication and normalization.

That works out to 940M or so ops per iteration, or about 810 GFLOP/s.

The memory access requirements are a bit more nebulous to compute, but at the very least we need to read and write 4M complex values each iteration, which is about 55 GB/s.

So the FirePro, if it could magically line everything up, has the raw throughput. The Dual Xeons are memory bound, and the Fury is ALU bound.
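
The arithmetic above can be reproduced in a few lines (a sketch; assumes N = 4*2^20 and the standard 5*N*log2(N) FLOP estimate for one complex FFT):

```python
import math

N = 4 * 2**20            # 4M-point complex FFT
ms_per_iter = 1.16       # budget to verify M49 in ~24 hours

fft_flops = 5 * N * math.log2(N)      # one complex FFT
iter_flops = 2 * fft_flops + 4 * N    # forward + inverse, plus pointwise mul/normalize

gflops = iter_flops / (ms_per_iter * 1e-3) / 1e9
print(round(iter_flops / 1e6), "Mflop per iteration")   # ~940
print(round(gflops), "GFLOP/s sustained")               # ~810

# One streaming pass over 4M complex doubles (16 bytes each):
gbps_per_pass = N * 16 / (ms_per_iter * 1e-3) / 1e9
print(round(gbps_per_pass), "GB/s per pass")            # ~58
```

(The ~58 GB/s here is for a single pass over the data at the 1.16 ms iteration budget; the ~55 GB/s figure in the post is the same ballpark.)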

Madpoo 2016-01-30 04:35

[QUOTE=airsquirrels;424636]So the FirePro, if it could magically line everything up, has the raw throughput. The Dual Xeons are memory bound, and the Fury is ALU bound.[/QUOTE]

I managed to mostly figure out that the Nvidia Pascal will have 1/3 DP performance, if people are speculating correctly.

That might put its DP performance somewhere in the 2-3 TFLOPS range, although I saw some reports it was rated over 4.

The memory bandwidth, some reports are saying as much as 1 TB/s...

I take all of that with a heavy dose of salt since the rumor mill is based on who knows what, but whatever the case, it does sound pretty intriguing.

I know it's a sucky thing to look at a performance problem, shrug your shoulders, and throw more hardware at it, especially when the software side has apparent room for improvement.

Prime95 2016-01-30 04:44

[QUOTE=airsquirrels;424636]
The memory access requirements are a bit more nebulous to compute but at the very least we need read and write to 4M complex values each iteration, which is 55GB/s[/QUOTE]

Prime95 requires 2 r/w passes over memory.

Prime95 does some of the forward FFT, point-wise squaring, and inverse FFT while data is in memory, clLucas does not. At best, you are going to need 2 r/w for the forward FFT, 1 r/w for the squaring, 2 r/w for the inverse FFT, 1 r/w for the rounding and carry propagation.

Prime95 uses a 256KB L2 cache which CUDA cards don't have, I assume AMD doesn't either. Consequently, I expect the 2 r/w in the forward and inverse FFT is optimistic -- probably 4 r/w is more realistic. You'd need to look at the clFFT code to know for sure.
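
Those pass counts translate into a rough bandwidth floor. A sketch, assuming each "r/w pass" streams the whole 4M-point double-complex array in and back out once (an assumption; fused stages may touch less):

```python
N = 4 * 2**20                   # FFT length, complex doubles (16 bytes each)
bytes_per_pass = 2 * N * 16     # one full read plus one full write
ms_per_iter = 1.16              # iteration budget from the earlier estimate

best_passes = 6                 # 2 fwd FFT + 1 squaring + 2 inv FFT + 1 carry
worst_passes = 10               # 4 fwd FFT + 1 squaring + 4 inv FFT + 1 carry

for passes in (best_passes, worst_passes):
    gbps = passes * bytes_per_pass / (ms_per_iter * 1e-3) / 1e9
    print(passes, "passes ->", round(gbps), "GB/s")
```

Under these assumptions even the best case needs roughly 694 GB/s to hold a 1.16 ms iteration, above the Fury's quoted ~500 GB/s, which fits the point that the 2 r/w estimate is optimistic and memory passes, not raw FLOPs, set the floor.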

frmky 2016-01-30 06:33

I'm not sure about the consumer cards, but the Tesla K20 has 1.25 MB of L2 cache, and the K20x and K40 have 1.5 MB of L2.

airsquirrels 2016-01-30 14:00

[QUOTE=Prime95;424648]Prime95 requires 2 r/w passes over memory.

Prime95 does some of the forward FFT, point-wise squaring, and inverse FFT while data is in memory, clLucas does not. At best, you are going to need 2 r/w for the forward FFT, 1 r/w for the squaring, 2 r/w for the inverse FFT, 1 r/w for the rounding and carry propagation.

Prime95 uses a 256KB L2 cache which CUDA cards don't have, I assume AMD doesn't either. Consequently, I expect the 2 r/w in the forward and inverse FFT is optimistic -- probably 4 r/w is more realistic. You'd need to look at the clFFT code to know for sure.[/QUOTE]

Thanks for the detailed information!

The AMD cards are a bit better equipped than we have been discussing in terms of cache; of course, we currently throw almost all of this out between each kernel call.

There is a 2MB L2 cache shared between all compute units. There is also a small 32KB GDS shared between all compute units, an L1 cache per CU, and 64KB of LDS for each 64 ALUs. Most importantly, each of the 64 CUs has a relatively huge number of vector registers, 256KiB worth per CU, plus 2KB of scalar registers.

As one more bonus for a carefully tuned kernel, there are basic integer ops (add, compare/swap) and reordering ops built into the LDS which can run fully independently of the VALUs.

ATH 2016-01-30 18:16

[QUOTE=airsquirrels;424636]Estimating what we need to verify M49 in 24 hours:

1.16ms/ iteration

A complex FFT of 4M requires about 5 * N log2(N) DP FLOPs,
We need that twice plus about 4*N operations for point multiplication and normalization.

That works out to 940M or so ops per iteration, or 810 GFLOPs

The memory access requirements are a bit more nebulous to compute but at the very least we need read and write to 4M complex values each iteration, which is 55GB/s

So the FirePro, if it could magically line everything up, has the raw throughput. The Dual Xeons are memory bound, and the Fury is ALU bound.[/QUOTE]

Titan Black has a theoretical DP performance of 1700 GFLOPS and 336 GB/s memory bandwidth but it still takes 55-56 hours for M49.

airsquirrels 2016-01-30 18:21

[QUOTE=ATH;424702]Titan Black has a theoretical DP performance of 1700 GFLOPS and 336 GB/s memory bandwidth but it still takes 55-56 hours for M49.[/QUOTE]

Correct, realized performance is closer to 200 GFLOPS due to FFT implementation inefficiencies. My statements were to point out that clFFT is not achieving the same level of performance as cuFFT despite similar HW. clFFT is much younger and has more room for optimization.

chalsall 2016-01-30 19:13

[QUOTE=airsquirrels;424619]Or will we just trial factor to infinity (and beyond)?[/QUOTE]

That probably wouldn't (optimally) Make Sense [SUP](TM)[/SUP].

There will always be a need for additional GPU TF'ing before the LL'ing wavefronts. But at some point (read: probably in about eight months or so) we'll be far enough ahead in the TF'ing that most GPU efforts should go to DC'ing and/or LL'ing.

Of course, this all depends on the optimal economic cross-over points for LL'ing vs. TF'ing. These points have already changed several times as the GPU codes were optimized for different GPUs and different worktypes.

manfred4 2016-02-16 16:36

3,735 assignments, or about 65 THzd, are left for completion, all of which were assigned approx. 20 days ago to ANONYMOUS - this will be finished on the next drop-off of his completed assignments next Friday, or maybe the week after - then all the DCTF is done!

What do you think will he do after he has finished? Help in the 100M digit range or on the LLTF front? Or continue on what he did before?

I also have a theory on what that guy was doing before he started DCTF: before that time there was an Anonymous user who submitted a very large amount of TF work in the 875M range ([URL="http://www.mersenne.ca/status/tf/0/0/3/87000"]here[/URL]) and stopped doing so at about the time this anonymous user started DCTF - that's why I think he was around for a long time before, doing very-high-range work that nobody else cares about right now.

airsquirrels 2016-02-20 16:11

77 candidates left!

Looks like tomorrow, if Anonymous posts results like he/she has on past weekends. We may finally get the answer to whether Anonymous will help with LLTF or move on to some other work.

LaurV 2016-02-20 16:16

Or Chris can bring in the 62M to 74; there are just 2000 of them, then this DCTF is RIP forever... (well, till better hardware allows us a few more bits... :razz:)

chalsall 2016-02-20 16:41

[QUOTE=airsquirrels;426956]We may finally get the answer to whether Anonymous will help with LLTF or move on to some other work.[/QUOTE]

I *really* hope he does. We still need some LLTF'ing love in order to keep feeding the P-1'ers and the Cat 4 churners. For the latter things should lighten up in about a month (when the assignments given out after the MP announcement start expiring in quantity).

But even now, for LL Cat 4 there are approximately 240 assigned a day, with only 75 being completed a day.

chalsall 2016-02-20 16:43

[QUOTE=LaurV;426957]Or Chris can bring in the 62M to 74, they are just 2000 of them, then this DCTF is RIP forever... (well, till next better hardware will allow us few bits more... :razz:)[/QUOTE]

I've asked before if people want to take the 2,273 DC candidates in 62M from 73 to 74, but I didn't really get an answer.

LaurV 2016-02-20 16:51

[QUOTE=chalsall;426960]I've asked before if people want to take the 2,273 DC candidates in 62M from 73 to 74, but I didn't really get an answer.[/QUOTE]
Yeah, like my aunt: she always baked delicious cookies, but when we were visiting she would ask if we wanted any. And we, being polite, would look to our mother and always say no. Then auntie would be upset that we didn't like her cookies. And my mother would tell her, "don't ask, just put them on the table, the little rascals will take them"... Guess who was right, and how many cookies were left on the table when we went home... :razz:

airsquirrels 2016-02-20 17:01

[QUOTE=chalsall;426960]I've asked before if people want to take the 2,273 DC candidates in 62M from 73 to 74, but I didn't really get an answer.[/QUOTE]

I'll help do them, I was only trying to get LLTF a nice lead in all categories first. If we are going to haggle over whether DCTF is done till they are complete though...

chalsall 2016-02-20 17:26

All DCTF below 60M has been completed!!!
 
Congratulations everyone!

The last of all DC candidates below 60M have now been appropriately Trial Factored! For a side project of a side project which was expected to take ten years or so, this has been finished in a remarkably short amount of time, particularly considering the factoring depth has increased twice since we began.

As requested, I am currently bringing in the 2,279 candidates at 73 bits in the 62M range for those interested in taking them up to 74.

I would ask, however, that people not move too much firepower from LLTF to do this work. We are still quite tight in the LL Cat 4 range, and feeding the LL P-1'ers. It is also going to be something like eight years before candidates in the 62M range will be assigned for DC'ing (heck, we just finished the LL'ing there).

Anyway, as always, thanks for all the cycles everyone!!! :tu:

chalsall 2016-02-20 18:46

So, [URL="https://www.gpu72.com/reports/workers/dctf/day/"]some were interested[/URL] in additional DCTF... :smile:

Just in case there are any others who really just _need_ to do additional DCTF (for whatever reason; no judgement... your kit), I've brought in an additional 1,000 candidates in 61M. I'll keep filling this range as these are consumed.

Also, a note to those using MISFIT... A few are still using the DCTF request parameter. Thus, you'll be given 61M to 74 work. If you don't want to do this, please change your settings to LLTF (where "to 72 bits" is still on sale! :wink:).

Mark Rose 2016-02-20 23:45

[QUOTE=chalsall;426966]Congratulations everyone!

The last of all DC candidates below 60M have now been appropriately Trial Factored![/QUOTE]

And all below 105M if you are using a GTX 580 :)

chalsall 2016-02-20 23:53

[QUOTE=chalsall;426969]...I've brought in an additional 1,000 candidates in 61M. I'll keep filling this range as these are consumed.[/QUOTE]

So, I get back from the "Hash" and discover the 1,000 DCTF candidates I'd tasked "Spidy" with fetching in 61M are already gone...

Sigh...

OK, I'm now going to bring in a few more thousand candidates just to satiate the appetite. Hopefully they don't all get assigned overnight....

airsquirrels 2016-02-21 00:39

[QUOTE=chalsall;426981]So, I get back from the "Hash" and discover the 1,000 DCTF candidates I'd tasked "Spidy" with fetching in 61M are already gone...

Sigh...

OK, I'm now going to bring in a few more thousand candidates just to satiate the appetite. Hopefully they don't all get assigned overnight....[/QUOTE]

Well, see, now you've caused me trouble: I need to pull together at least 10k GhzDays back from my DCLL to DCTF part 2 to preserve the lead. The whole plan was to hold the #1 GPU72 spot for DCTF when it died :)

Mark Rose 2016-02-21 00:46

Darn it, I might have to stop LLTF for a bit to keep ahead of LaurV hehe

chalsall 2016-02-21 01:07

[QUOTE=Mark Rose;426984]Darn it, I might have to stop LLTF for a bit to keep ahead of LaurV hehe[/QUOTE]

Ah... Man... As Aaron says, this is why we can't have nice stuff... :wink:

You guys do whatever you want to keep your respective rankings. And the additional DCTF /does/ make sense (for some cards) at the moment.

The DCTF work done in the near future will be valuable. In about eight years....

Mark Rose 2016-02-21 03:16

[QUOTE=chalsall;426985]Ah... Man... As Aaron says, this is why we can't have nice stuff... :wink:

You guys do whatever you want to keep your respective rankings. And the additional DCTF /does/ make sense (for some cards) at the moment.

The DCTF work done in the near future will be valuable. In about eight years....[/QUOTE]

I'll stick to LLTF. I really don't care about the rankings. :)

airsquirrels 2016-02-21 04:21

[QUOTE=Mark Rose;426995]I'll stick to LLTF. I really don't care about the rankings. :)[/QUOTE]

Nobody who is in this project for the long term REALLY cares about the rankings. We all just need something to justify our arbitrary allocation of hardware/time/money/power/etc., and reasonably achievable short-term goals provide that justification :)

airsquirrels 2016-02-21 04:44

[QUOTE=chalsall;426985]... In about eight years....[/QUOTE]

Well, the goal for some of us is to shorten that number up... I still have ambitions of getting early-error-detection style DC into the project. That might actually happen in about... eight years...

Mark Rose 2016-02-25 19:28

What cards are people extending DCTF with?

If anyone is interested, I can create lists of assignments above 69M that could use more DCTF appropriate to those cards, based on the crossover points on mersenne.ca
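(For anyone unfamiliar with the crossover points Mark mentions, the underlying trade-off can be sketched roughly as follows. The GHz-day costs below are made-up placeholders, not actual mersenne.ca figures; only the 1/b factor-density heuristic and the "one factor kills two tests" logic are standard.)

```python
# Rough sketch of the TF-vs-testing crossover heuristic.
# tf_cost and test_cost are hypothetical GHz-day figures, not real
# per-card numbers from mersenne.ca.

def worth_another_bit(bit_level, tf_cost, test_cost, tests_saved=2):
    """Is TF from 2^bit_level to 2^(bit_level+1) expected to pay off?

    Heuristic: a Mersenne number has a factor in [2^b, 2^(b+1)]
    with probability roughly 1/b. Finding one saves `tests_saved`
    primality tests (the first test plus its double-check).
    """
    p_factor = 1.0 / bit_level
    return tf_cost < p_factor * tests_saved * test_cost

# With these made-up costs, one more bit level is worthwhile:
print(worth_another_bit(72, tf_cost=1.0, test_cost=100.0))  # True
# ...but not if TF on this card is five times as expensive:
print(worth_another_bit(72, tf_cost=5.0, test_cost=100.0))  # False
```

The per-card crossover exists because TF cost relative to testing cost varies wildly between GPU generations, which is exactly why the "appropriate" bit level differs by card.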

airsquirrels 2016-02-25 19:39

[QUOTE=Mark Rose;427410]What cards are people extending DCTF with?

If anyone is interested, I can create lists of assignments above 69M that could use more DCTF appropriate to those cards, based on the crossover points on mersenne.ca[/QUOTE]

Well, right now it's going to be my Titans that aren't on water (to avoid blowing MOSFETs) and a collection of the Fury X's. After the DCTF part 2 push, the Fury cards will go back to DCLL and I'll hopefully get those Titans on water.

chalsall 2016-02-25 21:08

[QUOTE=Mark Rose;427410]If anyone is interested, I can create lists of assignments above 69M that could use more DCTF appropriate to those cards, based on the crossover points on mersenne.ca[/QUOTE]

While I very much appreciate your clearing out the higher DCTF work, might I suggest we put off doing any additional such work? It's going to be something like 15 years before such candidates get handed out!

Over that time I suspect the optimal curves will change several times. :smile:

Mark Rose 2016-02-25 23:04

[QUOTE=chalsall;427423]While I very much appreciate your clearing out the higher DCTF work, might I suggest we put off doing any additional such work? It's going to be something like 15 years before such candidates get handed out!

Over that time I suspect the optimal curves will change several times. :smile:[/QUOTE]

True. But look at how eagerly the high assignments you pulled in were gobbled up lol

airsquirrels 2016-02-26 01:54

[QUOTE=Mark Rose;427436]True. But I look at how eagerly the high assignments you pulled in were gobbled up lol[/QUOTE]

I know; I pulled them in on principle. They had been on the DC Current Trial Factored level report for a while, even if they weren't on the estimated completion report. My thought is that the Appropriate Level is so dynamic that it is difficult to use as a target unless you're doing both the TF and the DC for the exponent yourself, on the same card.

airsquirrels 2016-03-02 19:56

Did we move the goalpost again?

chalsall 2016-03-02 19:59

[QUOTE=airsquirrels;427946]Did we move the goalpost again?[/QUOTE]

Not that I'm aware of. Could you elaborate?

airsquirrels 2016-03-03 23:33

It looked like another chunk of factors in the 61M range appeared, but perhaps it is just my memory failing. It has been a very busy week in the real world....

petrw1 2016-03-07 16:28

Update 21 ... March 7, 2016 (Maybe the penultimate)
 
Sorry I was AWOL in February .... no February stats....

1,173,910 GhzDays in the last month.
7 different contributors
488 Factors found (Interestingly, this is getting smaller while the GhzDays effort remains the same)
59,203 P1/LL/DC work saved

3 contributors currently have assignments
3,448 Assignments out. (These should be the last)

14 estimated days to completion (I had to guess...GPU72 says none)
March 21, 2016 ....Spring is in the air????

MILESTONES
Complete to 60M.
Complete to 73 Bits.
All exponents assigned.

chalsall 2016-03-07 20:22

[QUOTE=petrw1;428296]14 estimated days to completion (I had to guess...GPU72 says none)[/QUOTE]

Monst has a few, and Anonymous has a few more.

Yeah, two weeks out seems reasonable.

Look forward to putting this to bed.

chalsall 2016-03-18 23:01

DCTF is finally "pining for the fjords"...
 
Just so everyone knows, AnonUser has just submitted the last batch of DCTF work!

Thanks for all the cycles everyone. This was a multi-year effort by many, many people! :tu:

manfred4 2016-03-18 23:08

And that user has switched back to his previous TF work, bringing 875M to 80 bits; he has already reported the next 7 factors in that range [URL="http://www.mersenne.org/report_exponent/?exp_lo=875159837&full=1"]here[/URL].

Too bad he isn't continuing with work that will be valuable in the near future.

Uncwilly 2016-03-18 23:49

[QUOTE=manfred4;429545]Too bad he isn't continuing with work that will be valuable in the near future.[/QUOTE]Like helping to take all of the first 4000 exponents in the 100M digit range up to a suitable level. :uncwilly:

Mark Rose 2016-03-19 02:28

[QUOTE=manfred4;429545]And that user has switched back to his previous TF work, bringing 875M to 80 bits; he has already reported the next 7 factors in that range [URL="http://www.mersenne.org/report_exponent/?exp_lo=875159837&full=1"]here[/URL].[/QUOTE]

That range is so arbitrary. I wonder what the reason could be.

airsquirrels 2016-03-20 01:22

I am very glad DCTF is done. If I could only get consistent temperatures in Ohio I would be contributing a lot more to LLTF, or maybe 100M.

Right now I've actually been spending some effort just-in-time factoring my own 100M candidates...

At least Anonymous is still contributing to the project :)

chalsall 2016-03-20 18:30

[QUOTE=airsquirrels;429650]If I could only get consistent temperatures in Ohio I would be contributing a lot more to LLTF, or maybe 100M.[/QUOTE]

Damn that climate change!!!

lycorn 2016-03-20 18:57

[QUOTE=Uncwilly;429548]Like helping to take all of the first 4000 exponents in the 100M digit range up to a suitable level. :uncwilly:[/QUOTE]

Or help clearing the remaining few @ 62 & 63 bits in the 0 - 1 M range...:whistle:

Mark Rose 2016-03-20 21:02

[QUOTE=lycorn;429682]Or help clearing the remaining few @ 62 & 63 bits in the 0 - 1 M range...:whistle:[/QUOTE]

Those are all between 0 and 100 k, outside of the GPU range.

lycorn 2016-03-21 13:11

Depends on the mfaktc version you use. Older versions can go down to 2K.
Not very efficient, but still way more efficient than CPU TFing.

As John Lennon would say: "you may say I'm a lunatic, but I'm not the only one..." :smile:
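(For anyone curious what the card is actually doing at these tiny exponents, here is a minimal trial-factoring sketch using the same candidate form mfaktc sieves on: any prime factor q of M_p = 2^p - 1, for odd prime p, has the form q = 2kp + 1 with q congruent to 1 or 7 mod 8. This is an illustrative toy, not mfaktc's actual implementation, and it is only meaningful for bit levels well below p itself.)

```python
# Toy trial factoring of M_p = 2^p - 1 (p an odd prime).
# Candidates are restricted to q = 2*k*p + 1 with q % 8 in (1, 7),
# the same classes of candidates that mfaktc sieves.

def tf_mersenne(p, max_bits):
    """Return the smallest factor of 2^p - 1 below 2^max_bits, or None."""
    k = 1
    while True:
        q = 2 * k * p + 1
        if q >= 1 << max_bits:
            return None
        # Candidate survives the mod-8 sieve and divides 2^p - 1?
        if q % 8 in (1, 7) and pow(2, p, q) == 1:
            return q
        k += 1

print(tf_mersenne(11, 20))  # 23   (M11 = 2047 = 23 * 89)
print(tf_mersenne(23, 16))  # 47
print(tf_mersenne(13, 12))  # None (M13 = 8191 is prime)
```

The `pow(2, p, q)` modular exponentiation is the whole cost per candidate; the GPU advantage comes from running millions of these candidates in parallel.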

petrw1 2016-03-21 15:09

[QUOTE=lycorn;429731]Depends on the mfaktc version you use. Older versions can go down to 2K.
Not very efficient, but still way more efficient than CPU TFing.

As John Lennon would say: "you may say I'm a lunatic, but I'm not the only one..." :smile:[/QUOTE]

And Billy Joel:

[QUOTE]You may be right
I may be crazy
But it just may be a lunatic you're looking for[/QUOTE]

Gordon 2016-03-21 19:41

[QUOTE=petrw1;429734]And Billy Joel:[/QUOTE]

[QUOTE=lycorn;429731]Depends on the mfaktc version you use. Older versions can go down to 2K.
Not very efficient, but still way more efficient than CPU TFing.

[/QUOTE]

Do you have a copy of that old version anywhere...

petrw1 2016-03-21 19:44

[QUOTE=Gordon;429746]Do you have a copy of that old version anywhere...[/QUOTE]

Maybe here?

[url]http://www.mersenneforum.org/mfaktc/[/url]

petrw1 2016-03-21 22:20

[QUOTE=Mark Rose;429691]Those are all between 0 and 100 k, outside of the GPU range.[/QUOTE]

Some might say: "WHY!!!!??? ECM progress is WAYYYY beyond 64 bits for this range."

Some might say: "Where does it end?" Because if and when these are completed to 64 bits, the next range above may be at 65 or 66 bits.

That being said, I am one of the few who still have a few old cores clunking away at this range.

petrw1 2016-09-12 19:48

We are a LOOOONG way off but...
 
This tells me that some time prior to DC getting there we need to TF:

[url]http://www.mersenne.ca/status/tf/0/0/3/6000[/url]
About 22,000 66-69M from 74-75 bits

[url]http://www.mersenne.ca/status/tf/0/0/3/7000[/url]
About 2,500 70-71M from 74-75 bits.

