mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   LL with OpenCL (https://www.mersenneforum.org/showthread.php?t=18297)

kracker 2013-09-25 14:35

Hate to break it to you, but LL tests are [I]very[/I] unscalable, and a brute-force approach will not work. And I can just imagine the latency of LAN vs. PCI-E... LL tests aren't something you can distribute (e.g. half of the work on one GPU and half on another): each iteration depends on the previous one.
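
To illustrate the dependency described above, here is a minimal sketch of the Lucas-Lehmer test in plain Python. It is not how CUDALucas or clLucas implement it (those use FFT-based squaring on the GPU); the function name `lucas_lehmer` is just an illustrative choice. The point is the loop: every new value of `s` needs the previous one, so iterations cannot be handed out to different devices the way password guesses can.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if the Mersenne number 2**p - 1 is prime (p must be prime)."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the loop below assumes odd p
    m = (1 << p) - 1  # the Mersenne number being tested
    s = 4             # starting value of the LL sequence
    for _ in range(p - 2):
        # Each step squares the PREVIOUS residue, so step k cannot start
        # before step k-1 has finished. Only the squaring itself can be
        # parallelised, not the chain of iterations.
        s = (s * s - 2) % m
    return s == 0


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 17, 19, 23):
        print(p, lucas_lehmer(p))
```

Any multi-GPU speedup would therefore have to come from splitting the work inside a single squaring, which is where interconnect latency starts to matter.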

Robish 2013-09-25 14:36

[QUOTE=Karl M Johnson;354089]Multi-gpu support should be added first then.
It's not an easy undertaking, even if a single thread controls all the gpus.
It makes more sense to add multi-gpu support to CUDALucas/clLucas, since it would ideally cut the LL test time by the number of GPUs in the system (if they are the same and versus a single gpu).
Once that is done, merging several rigs and assigning a single exponent to them could be considered.[/QUOTE]


That's my thinking exactly. I agree it would probably be difficult (way beyond me), but Atom from oclHashcat has done something similar, so there's a starting point for an approach, and the benefits could be enormous.

Robish 2013-09-25 14:39

[QUOTE=kracker;354108]Hate to break it to you, but LL tests are [I]very[/I] unscalable, and a brute-force approach will not work. And I can just imagine the latency of LAN vs. PCI-E... LL tests aren't something you can distribute (e.g. half of the work on one GPU and half on another): each iteration depends on the previous one.[/QUOTE]


Point taken, I just thought the discussion was worth exploring

VictordeHolland 2013-09-25 14:41

It's not all about computing power; memory bandwidth also plays a big role. PCI-E is quite fast, but not as fast as GPU memory. Connecting more than 3 GPUs will also require very expensive motherboards and InfiniBand (or similar). Running one task on each GPU is probably going to be much more efficient and less expensive.
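
As a rough back-of-the-envelope sketch of the bandwidth gap mentioned above, the snippet below compares how long it would take to move one LL residue over PCI-E versus reading it once from GPU memory. Every number in it is an assumption for illustration: the FFT length is a guess at what an exponent near 60M might need, and the bandwidths are approximate theoretical peaks for PCI-E 3.0 x16 and a GTX Titan, not measurements.

```python
# All figures below are rough, ASSUMED values for illustration only.
FFT_LENGTH = 3360 * 1024      # assumed FFT length (number of doubles)
BYTES_PER_DOUBLE = 8
PCIE3_X16_GB_S = 15.75        # approx. theoretical PCI-E 3.0 x16 bandwidth, GB/s
GPU_MEM_GB_S = 288.0          # approx. GTX Titan memory bandwidth, GB/s


def transfer_ms(num_bytes: float, gb_per_s: float) -> float:
    """Time in milliseconds to move num_bytes at gb_per_s (1 GB = 1e9 bytes)."""
    return num_bytes / (gb_per_s * 1e9) * 1e3


residue_bytes = FFT_LENGTH * BYTES_PER_DOUBLE
pcie_ms = transfer_ms(residue_bytes, PCIE3_X16_GB_S)
vram_ms = transfer_ms(residue_bytes, GPU_MEM_GB_S)

if __name__ == "__main__":
    print(f"residue size : {residue_bytes / 1e6:.1f} MB")
    print(f"over PCI-E   : {pcie_ms:.3f} ms per pass")
    print(f"in GPU memory: {vram_ms:.3f} ms per pass")
    print(f"ratio        : {pcie_ms / vram_ms:.1f}x")
```

With these assumed figures a single pass over the data costs roughly 18 times more over PCI-E than inside one card, and an FFT split across GPUs would need several such exchanges per iteration.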

Robish 2013-09-25 19:25

[QUOTE=Robish;354110]Point taken, I just thought the discussion was worth exploring[/QUOTE]

It's just that when you read articles like this, you wonder if things can advance - dramatically.

[URL]http://arstechnica.com/security/2012/12/25-gpu-cluster-cracks-every-standard-windows-password-in-6-hours/[/URL]

"350 billion-guess-per-second" I mean... WOW, that's powerful. But I take kracker's comments on board. I don't know enough about LL to even argue ;-)

kracker 2013-09-25 19:30

[QUOTE=Robish;354165]It's just that when you read articles like this, you wonder if things can advance - dramatically.

[URL]http://arstechnica.com/security/2012/12/25-gpu-cluster-cracks-every-standard-windows-password-in-6-hours/[/URL]

"350 billion-guess-per-second" I mean... WOW, that's powerful. But I take kracker's comments on board. I don't know enough about LL to even argue ;-)[/QUOTE]

Well, it's not like LL tests take a humongous time to do these days, anyways. :smile:

[SIZE=1](Wish I had a GPU setup like that)[/SIZE]

Robish 2013-09-25 19:31

[QUOTE=VictordeHolland;354111]It's not all about computing power; memory bandwidth also plays a big role. PCI-E is quite fast, but not as fast as GPU memory. Connecting more than 3 GPUs will also require very expensive motherboards and InfiniBand (or similar). Running one task on each GPU is probably going to be much more efficient and less expensive.[/QUOTE]


You're right. InfiniBand is what Gosney uses on his rig: 10 Gb/sec. However, he says he bought all the cards, connectors and switch for $800 on eBay (used), which is less than the price of a single high-end GPU.

Okay, I'll shut up about it now, just thought it worth mentioning.. :-)

Robish 2013-09-25 19:36

[QUOTE=kracker;354166]Well, it's not like LL tests take a humongous time to do these days, anyways. :smile:

[SIZE=1](Wish I had a GPU setup like that)[/SIZE][/QUOTE]


:-) I'm only a newbie here, kracker. Did it take longer a few years ago? I suppose it had to have. Older chips and such.

(trying to build a smaller scale one) Learning Linux is tough (no Windows support) but I'm getting there :-)

Robish 2013-09-25 21:53

[QUOTE=Robish;354168]:-) I'm only a newbie here, kracker. Did it take longer a few years ago? I suppose it had to have. Older chips and such.

(trying to build a smaller scale one) Learning Linux is tough (no Windows support) but I'm getting there :-)[/QUOTE]


I'm sorry did I get distracted....Breaking Bad ROCKS!!!! :-) hic ahem sorry eh er ahem cough, bit drunk right now hmmm ah ill be ok tomorrow and do more LLs :-)

Ps prime searching is fun is it not? Whiskey however......more. ;-)

kracker 2013-09-26 02:13

@msft: by the way, is there a "worktodo.txt" in clLucas right now?

Karl M Johnson 2013-09-26 06:59

[QUOTE=kracker;354108]Hate to break it to you, but LL tests are [I]very unscalable[/I]... LL tests aren't something you can distribute (e.g. half of the work on one GPU and half on another): each iteration depends on the previous one.[/QUOTE]
Indeed.
Mlucas' parallel code doesn't scale well beyond 4 cores, but using more than 4 cores is still faster than using 4.
There will be no ideal 300% increase in speed for, say, 4 GTX Titans or 4 7970s, but the primality status of a single Mersenne candidate will surely be known sooner (versus a single GPU).
Besides, the availability of several gaming GPUs inside a single machine is higher than that of several server-class CPUs.
The idea has a chance to live.
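
One way to put numbers on "no ideal 300% increase" is Amdahl's law, which is a swapped-in textbook model here, not anything measured on Mlucas or CUDALucas. The sketch below assumes a hypothetical fraction of each iteration that parallelises perfectly across GPUs, with the rest (synchronisation, data exchange) staying serial. The fractions in the demo are made up.

```python
def amdahl_speedup(n_devices: int, parallel_fraction: float) -> float:
    """Speedup over one device when parallel_fraction of the work scales perfectly."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_devices)


if __name__ == "__main__":
    # Hypothetical parallel fractions, NOT measurements of any LL program.
    for fraction in (1.0, 0.95, 0.90, 0.75):
        s = amdahl_speedup(4, fraction)
        print(f"parallel fraction {fraction:.2f}: "
              f"{s:.2f}x with 4 GPUs ({(s - 1) * 100:.0f}% increase)")
```

Only a perfectly parallel workload reaches the full 4x (a 300% increase); anything less still finishes sooner than a single GPU, just with diminishing returns per card.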

