mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

mdettweiler 2010-07-27 03:37

[quote=msft;222993]Hi, mdettweiler
If I can see the LLR/FFTW x86 code, I can "try" to convert it to CUDA.
But isn't that illegal?
[URL="http://en.wikipedia.org/wiki/Homesteading_the_Noosphere"]"Homesteading the Noosphere"[/URL] is my favorite text.:smile:[/quote]
LLR is an open-source program itself, so using its code as a starting point shouldn't be a problem. I'm not sure exactly which license it uses; it may be the GPL. At any rate, judging from Jean's various comments around the forum, he surely wouldn't mind you studying his code to see how to implement the LLR algorithm.

LLR's home page is [URL]http://jpenne.free.fr/index2.html[/URL]; from there, you can download the [URL="http://jpenne.free.fr/llr3/llr381src.zip"]source code[/URL] for LLR 3.8.1 (CPU, gwnum-based), and from Jean's [URL="http://jpenne.free.fr/Development/"]development page[/URL] you can get the not-yet-complete FFTW version of the code. (The files to download from there are llrpsrc.zip and llrpisrc.zip; they both seem to be based on FFTW, but I'm not sure which is the more complete version or what the differences are between them.)

I'm not sure just how close the FFTW version of LLR is to being ready for actual use; it may not yet be suitable for a direct conversion to CUFFTW. What I was thinking is that the easiest route would be to take the existing MacLucasFFTW CUDA application and apply the LL→LLR algorithm modifications (relatively minor as they are) to that. Of course, you're more familiar with your code and the algorithms than I am, so I may not fully understand the extent of the modifications here. I do know, though, that Jean's LLR was based directly on George Woltman's Prime95 LL testing program, so it is definitely possible to take an existing LL program and convert it to LLR.
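For anyone following along, the LL→LLR change really is small at the algorithm level. Here's a rough plain-integer Python sketch of the Lucas-Lehmer-Riesel test for N = k*2^n - 1 (k odd, k < 2^n). This is illustration only, not the Prime95/MacLucasFFTW code: the starting-value selection here follows Rödseth's Jacobi-symbol condition, and a real CUDA version would replace the big-integer squarings with FFT-based multiplication, exactly the part MacLucasFFTW already has.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def lucas_v(P, k, N):
    """V_k(P, 1) mod N via a binary Lucas ladder (V_0 = 2, V_1 = P)."""
    v0, v1 = 2 % N, P % N
    for bit in bin(k)[2:]:
        if bit == '1':
            v0, v1 = (v0 * v1 - P) % N, (v1 * v1 - 2) % N
        else:
            v0, v1 = (v0 * v0 - 2) % N, (v0 * v1 - P) % N
    return v0

def llr_prime(k, n):
    """True iff N = k*2^n - 1 is prime (requires k odd, k < 2^n)."""
    N = k * (1 << n) - 1
    # Pick P with (P-2/N) = 1 and (P+2/N) = -1 (Rodseth's condition),
    # then seed with u_0 = V_k(P); for Mersennes this plays the role
    # of the familiar u_0 = 4.
    P = 3
    while not (jacobi(P - 2, N) == 1 and jacobi(P + 2, N) == -1):
        P += 1
    u = lucas_v(P, k, N)
    # From here it is the same core loop as LL: n-2 modular squarings.
    for _ in range(n - 2):
        u = (u * u - 2) % N
    return u == 0
```

The point is that everything after the u_0 setup is identical to the LL loop, which is why converting an existing LL program to LLR is mostly a matter of the starting value.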

Thanks,
Max :smile:

Oddball 2010-07-27 04:25

[QUOTE=mdettweiler;222992]
msft, would you by chance be willing to port this application to the [url=http://en.wikipedia.org/wiki/Lucas%E2%80%93Lehmer%E2%80%93Riesel_test]LLR algorithm[/url], as I initially inquired back in [url=http://www.mersenneforum.org/showthread.php?t=12576&page=4#post218207]post #177[/url]?
...
Your efforts in developing this LL application are greatly appreciated, and even more so if you can help in porting it to LLR! :smile:
[/QUOTE]
The development of an LLR application for GPUs may have unintended consequences.

If a mid-range GPU can outperform a whole bunch of high-end quad cores by a large margin, it's likely that those without good GPUs would lose interest and quit. It would be nearly impossible to get onto the top-5000 list without a good GPU, and available LLR ranges could have ridiculously long testing times (since the low-n ranges would all have been completed by GPUs). That could drive away many contributors and concentrate output in the small number of people who have GPU farms, so output would drop sharply if they lost interest. The way things are now, most of the major projects have enough participants that a project wouldn't be significantly affected if some people left.

A similar example can be seen in the Folding@home project. In 2008, PlayStation 3s were, for the first time, able to contribute to that project. IIRC, Folding@home reached 3 (native) petaFLOPS in summer 2008, 4 petaFLOPS in fall 2008, and 5 petaFLOPS in early 2009. The popularity of the PlayStation 3 declined shortly afterward, and today Folding@home is back below 3 petaFLOPS. It is too early to tell how much further that project's total processing power will decline.

Be careful what you wish for...

Oddball 2010-07-27 04:42

Just to point out one more thing:

Operation Billion Digits has been around for several years, and was making steady progress until GPUs started contributing a couple of weeks ago. While there is quite a large jump in progress now, it has come at a price: it may no longer be possible for slower machines to contribute in any meaningful way.

From:

[url]http://www.mersenneforum.org/showpost.php?p=218755&postcount=402[/url]

[quote]Luigi, what about opening a new part of the range and mark it "GPU's Stay Out!"? The work from 60 to 70 would be nice for the tired old Pentii.[/quote]

Here's another one:

[url]http://www.mersenneforum.org/showpost.php?p=219366&postcount=2[/url]

[quote]
It might be nice if there could be a range reserved for older machines, say Pentium II and older/slower.[/quote]

mdettweiler 2010-07-27 05:24

@Oddball: yeah, good points. What we'll probably do if we can get a GPU LLR application at NPLB is request that people only use it for certain ranges, sort of like what OBD is doing. The ranges I'm thinking it would be ideal for are the 11th Drive (relatively small tests near the bottom of the top-5000 list--that threshold is being moved up rather quickly by PrimeGrid's hordes of computers anyway, so adding some GPUs at NPLB would only help us keep up better) and, on the other end of the spectrum, our k=300-400 mini-drive, which covers n>1M tests searching for megabit primes. Primes should be few and far between in that search, so GPUs speeding up their discovery shouldn't have a large impact on the top-5000. Additionally, we have tons of sub-top-5000 search space that needs to be filled in for completeness, and GPUs would be great at plowing through that stuff (which is rather unattractive to many participants due to its lack of particularly tangible returns).

Now if PrimeGrid got wind of the GPU LLR app and started utilizing it via BOINC on their huge Proth efforts, *then* we might have a problem. :smile: Due to their already immense firepower they are the primary driving force behind the upward motion of the top-5000 threshold, so adding tons of GPUs to that would make everything go completely haywire.

That said, as much as we'd like for older CPUs to still be able to contribute meaningfully--after all, these projects are meant to be fun--we also don't want to lose sight of our overall goal of extending the contiguously-searched blocks of k and n for Riesel primes as far as possible. That is surely the main reason why this is all worth doing: the more search space we cover, the more data is available to researchers who can hopefully, eventually, find some clues as to why prime numbers are where they are. So we don't want to hold back progress (the end) for the express purpose of making it easier for anybody to get their very own top-5000 prime even with modest hardware (which is definitely a good thing to have, but is nonetheless only a means to an end).

I think as long as we keep GPUs primarily limited to the search regions where they can be most useful with the least adverse effects on the dynamics of prime searching (such as I described above), we should be able to maximize their overall net contribution to the prime search world.

msft 2010-07-27 05:49

Thank you, mdettweiler
I read the source for 10 minutes every day before going to bed--it puts me to sleep fast. :sleep:

ET_ 2010-07-27 08:51

[QUOTE=Oddball;223012]Just to point out one more thing:

Operation Billion Digits has been around for several years, and was making steady progress until GPUs started contributing a couple of weeks ago. While there is quite a large jump in progress now, it has come at a price: it may no longer be possible for slower machines to contribute in any meaningful way.

From:

[url]http://www.mersenneforum.org/showpost.php?p=218755&postcount=402[/url]



Here's another one:

[url]http://www.mersenneforum.org/showpost.php?p=219366&postcount=2[/url][/QUOTE]

I'd like to add that OBD double-checked all exponents with completely different hardware (GPUs) in less than a week...

Luigi

rogue 2010-07-27 12:42

What about people burning coal to keep their old PIIs and PIIIs running on projects? It is their choice, but using these old computers requires an inordinate amount of power for what they are capable of. I wouldn't be sad to see many of those old computers get recycled.

Another unintended consequence of people switching work over to GPUs is that it could ultimately hurt Intel's dominance. Why would someone want to pay hundreds to buy a new computer when plopping in a new graphics card gives them a lot more bang for the buck?

Oddball 2010-07-27 18:16

[QUOTE=rogue;223046]What about people burning coal to keep their old PIIs and PIIIs running on projects? It is their choice, but using these old computers requires an inordinate amount of power for what they are capable of.[/QUOTE]
Most graphics cards consume a lot more power than a Pentium III. The max TDP of an 866 MHz Pentium III is only 26 watts:

[url]http://ark.intel.com/Product.aspx?id=27555[/url]

while high end graphics cards consume several hundred watts.

axn 2010-07-27 18:28

[QUOTE=Oddball;223074]Most graphics cards consume a lot more power than a Pentium III. The max TDP of an 866 MHz Pentium III is only 26 watts:

[url]http://ark.intel.com/Product.aspx?id=27555[/url]

while high end graphics cards consume several hundred watts.[/QUOTE]

1. Intel's quoted TDP figures underestimate max power consumption. It doesn't really matter much, but the actual figure is probably something like 30-35W.

2. It is not just the chip's power consumption, but the whole system's power consumption. And in that respect, the ratios would be much smaller.

3. But 1&2 are not even the real points. If the computation that a P3 performs in a year can be accomplished by a Graphics card in a day, which would you use? "for what they are capable of" is the key. And newer technology will beat the crap out of older technology, in that respect.

Oddball 2010-07-27 18:58

[QUOTE=axn;223075]If the computation that a P3 performs in a year can be accomplished by a Graphics card in a day, which would you use? [/QUOTE]
That argument only applies to projects with a fixed end date (Seventeen or Bust, for example). For open-ended projects like GIMPS, someone with a Pentium III would consume 1 unit of coal each day, but if that person were to get a GPU instead, his consumption would end up being something like 10 units of coal per day.

axn 2010-07-27 19:30

[QUOTE=Oddball;223078]That argument only applies to projects with a fixed end date (Seventeen or Bust, for example). For open-ended projects like GIMPS, someone with a Pentium III would consume 1 unit of coal each day, but if that person were to get a GPU instead, his consumption would end up being something like 10 units of coal per day.[/QUOTE]

To see the absurdity of that argument, take it to its logical extreme: a switched-off computer would be ideal, since it consumes 0 units of coal per day.

Per-day consumption is an absurd measure of efficiency for distributed computing.

Also, the person could just run the GPU 1/10th of a day, achieve the same per-day consumption of power as the P3, but much more computation. Would that be better?

EDIT: Or better yet, replace 10 P3s in the project with one GPU, and we're ahead in throughput without increasing power consumption. Win-win!
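To put rough numbers on the throughput-per-watt point (every figure below is a made-up illustrative assumption, not a measurement of any real P3 or GPU):

```python
# Energy needed to finish one fixed chunk of work (say, one LLR test).
# All figures are illustrative assumptions, not measurements.
P3_SYSTEM_WATTS = 60     # whole-system draw of an old Pentium III box (assumed)
GPU_SYSTEM_WATTS = 400   # whole-system draw of a high-end GPU rig (assumed)
SPEEDUP = 100            # GPU throughput relative to the P3 (assumed)

P3_HOURS_PER_TEST = 100.0                        # assumed baseline
gpu_hours_per_test = P3_HOURS_PER_TEST / SPEEDUP

p3_wh = P3_SYSTEM_WATTS * P3_HOURS_PER_TEST      # watt-hours per test on the P3
gpu_wh = GPU_SYSTEM_WATTS * gpu_hours_per_test   # watt-hours per test on the GPU

# Even at ~7x the instantaneous draw, the GPU uses far less energy per
# completed test, because it finishes each test 100x sooner.
print(p3_wh, gpu_wh)
```

With these assumed numbers the GPU burns about 15x less coal per finished test, which is why per-day draw is the wrong metric for distributed computing.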

