|
|
#12 |
|
"Nathan"
Jul 2008
Maryland, USA
5·223 Posts |
OK, what I have is an 8-core system with no GPU, plus up to 32 cores and 4 GPUs (2 K20s and 2 K10s) available, with 16 GB RAM on the former and 32 GB on the latter.
Uncwilly, I can P-1 your number if 32GB is enough RAM. Will it need any TF first? I will also try out CUDA P-1...is it better to DC a known P-1 result or is it ready for production work? The rep at the NVIDIA dealer providing the trial says that it *may* be possible to arrange future testing on machines with bigger RAM. They have systems with 1TB RAM that retail for $100,000. I think we need three of them... |
|
|
|
|
|
#13 |
|
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
265A16 Posts |
|
|
|
|
|
|
#14 |
|
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
135316 Posts |
NBtarheel_33,
Would you be willing to run a benchmark test on your machine with msieve? Please look at this post http://www.mersenneforum.org/showpos...ostcount=7%29?
If you are on Windows, you can substitute the msieve.exe binary with the one here: http://gilchrist.ca/jeff/factoring/; if you are on Linux, you have it here: http://www.sendspace.com/file/aih4dw
Under Windows, run:
start /low /min msieve.exe -v -nc -t 32
Under Linux, run:
./msieve -v -nc -t 32
Thank you in advance,
Carlos
PS: Benchmarks to compare against are here: http://www.mersenneforum.org/showthread.php?t=16348
Last fiddled with by pinhodecarlos on 2013-04-27 at 10:17 |
|
|
|
|
|
#15 |
|
"Nathan"
Jul 2008
Maryland, USA
5·223 Posts |
So, either I'm dumb or Linux is, but I'm not sure why it is so difficult to spit out how many physical cores are on a system, and whether or not hyperthreading is on, and indeed, whether or not "32 cores available" really just means 16 hyperthreaded physical cores... And so it goes...
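For what it's worth, on x86 Linux the physical/logical split can usually be recovered from /proc/cpuinfo by grouping entries on the `physical id` and `core id` fields. Below is a minimal sketch, assuming those fields are present (they are on typical x86 kernels, but not on every architecture). The function name `summarize_cpuinfo` is made up for illustration.

```python
import re


def summarize_cpuinfo(text):
    """Count sockets, physical cores, and logical CPUs from /proc/cpuinfo text.

    Assumption: each logical CPU is one blank-line-separated block containing
    'processor', 'physical id', and 'core id' fields (typical on x86 Linux).
    """
    logical = 0
    sockets = set()
    cores = set()
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue  # not a per-CPU block
        logical += 1
        # Fall back to treating each logical CPU as its own core if the
        # topology fields are missing.
        socket = fields.get("physical id", "0")
        core = fields.get("core id", fields["processor"])
        sockets.add(socket)
        cores.add((socket, core))
    return {
        "sockets": len(sockets),
        "physical_cores": len(cores),
        "logical_cpus": logical,
        "hyperthreading": logical > len(cores),
    }
```

To try it on a live machine, pass in the file contents, e.g. `summarize_cpuinfo(open("/proc/cpuinfo").read())`. If `logical_cpus` comes out at twice `physical_cores`, then "32 cores available" really means 16 hyperthreaded physical cores.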
I think I have finally figured out the available resources once and for all. I have three systems. The first has 16 non-hyperthreaded cores, 16 GB RAM, and no GPUs. The second has 16 hyperthreaded cores, 32 GB RAM, and two K20 GPUs. And the third has 16 hyperthreaded cores, 32 GB RAM, and two K10 GPUs.
After several days of playing around with settings in mprime and trying to figure out just how in the hell /proc/cpuinfo counts and maps logical vs. physical CPUs, I believe that the best performance for running LLs comes by giving an entire CPU (i.e. 8 cores or 16 hyperthreads) to each LL. In the 50M range, this is yielding iteration times of 4.5 ms or so, or 225,000 seconds = 2.6 days for the whole test. 30M doublechecks will complete in just under 24 hours. Suddenly what was a backlog of assignments is turning into not having enough to keep these beasts fed!
On the GPUs, running CUDALucas, I have a 54M and a 56M running on the K20s @ 4.1 ms/iteration...which isn't really too much better than an 8-threaded CPU LL test. On the K10s, I am running a 49M @ 9.8 ms/iteration, and an 82M @ 14.3 ms/iteration. Seems as though if you have big CPU power, it's not as efficient to LL on the GPU...maybe the GPUs are better used for TF or P-1.
By the way, I think I grabbed an old version of Prime95, because I have run into the dreaded huge roundoff bug. It doesn't actually damage the result, AFAICT, correct? |
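The run-time figures in this post are just iteration time multiplied by the iteration count. Here is a rough sketch of that arithmetic, assuming about one iteration per unit of exponent (the exact count for an LL test is the exponent minus 2) and a constant iteration time. The exponents are the nominal round values quoted in the post, not the actual assignments, and `ll_test_days` is a made-up helper name.

```python
SECONDS_PER_DAY = 86_400


def ll_test_days(exponent, ms_per_iteration):
    """Estimate wall-clock days for an LL test.

    Assumes roughly `exponent` iterations and a constant iteration time.
    """
    seconds = exponent * ms_per_iteration / 1000.0
    return seconds / SECONDS_PER_DAY


# Nominal figures from the post: (label, exponent, ms per iteration)
runs = [
    ("CPU, 50M", 50_000_000, 4.5),
    ("K20, 54M", 54_000_000, 4.1),
    ("K20, 56M", 56_000_000, 4.1),
    ("K10, 49M", 49_000_000, 9.8),
    ("K10, 82M", 82_000_000, 14.3),
]
for label, exponent, ms in runs:
    print(f"{label}: {ll_test_days(exponent, ms):.1f} days")
```

The first line reproduces the 2.6-day estimate for a 50M test at 4.5 ms/iteration. The other lines put the GPU runs on the same scale for comparison.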
|
|
|
|
|
#16 | |
|
"Nathan"
Jul 2008
Maryland, USA
5×223 Posts |
Quote:
I will take it to at least 80, maybe 81, then try P-1 with 32 GB RAM on at least 8 cores, if not 16. (Of course, I have no idea how long P-1 will take, but I have login privileges until May 18th...)
|
|
|
|
|
|
|
#17 | |
|
"Nathan"
Jul 2008
Maryland, USA
5·223 Posts |
Quote:
|
|
|
|
|
|
|
#18 |
|
"Nathan"
Jul 2008
Maryland, USA
5·223 Posts |
|
|
|
|
|
|
#19 |
|
"Nathan"
Jul 2008
Maryland, USA
45B₁₆ Posts |
Unfortunately, there seem to be only the one master login node and two 16-core nodes that I can access. There doesn't seem to be a good way to combine the two systems into a virtual 32-core system (this would be very nice, but I don't think they are going to go to all that trouble for a test cluster).
|
|
|
|
|
|
#20 |
|
"Nathan"
Jul 2008
Maryland, USA
5·223 Posts |
LOL, I wonder if they realized what kind of "Test Drive" we'd devise to put their systems through. I keep waiting for the "Hmm, you're going to have to scale back your demands on the system" e-mail...
|
|
|
|
|
|
#21 | |
|
Oct 2011
7×97 Posts |
Quote:
|
|
|
|
|
|
|
#22 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
165618 Posts |
Fixed in 27.9. All earlier v27s had the bug. You are correct in that it does not damage the result.
|
|
|
|