mersenneforum.org > Great Internet Mersenne Prime Search > Hardware > Cloud Computing
Old 2019-11-19, 07:31   #573
bayanne

"Tony Gott"
Aug 2002
Yell, Shetland, UK

2²×83 Posts

Great, thanks for that :)
Old 2019-11-19, 08:28   #574
LaurV
Romulan Interpreter

Jun 2011
Thailand

10010110001011₂ Posts

Quote:
Originally Posted by kracker
Got a T4, decided to try gpuowl on it... expected results, others as reference.
Code:
gpuowl (PRP), 92M exponent - 5M FFT

Tesla K80:
4.68 ms/iter - 66.8 GHz-days/day
430 GHz-days/day (mfaktc)

Tesla T4:
5.96 ms/iter - 52.4 GHz-days/day
~1700 GHz-days/day (mfaktc)

Tesla P100:
1.17 ms/iter - 266 GHz-days/day
~1100 GHz-days/day (mfaktc)
Nice! Thanks for that! As stated before, both the K80 and P100 are beasts for LL/PRP. Using them for TF is a waste. If you are lucky enough to get a T4, use it for TF (on this side of the world, we haven't seen one in ages!)

Did you try two instances on the K80? (If that is a dual-GPU card, the two chips may not communicate so fast with each other, and "in the cloud" may be different from "locally".)

Last fiddled with by LaurV on 2019-11-19 at 08:32
Old 2019-11-19, 10:32   #575
ATH
Einyen

Dec 2003
Denmark

3⁵·13 Posts

CUDALucas also needs cufft installed:

Code:
!apt-get install -y cuda-cudart-10-0
!apt-get install -y cuda-cufft-dev-10-0
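One way to confirm the libraries are actually visible after installing (a sketch; the exact .so names vary with the CUDA version, so this just greps the linker cache):

```shell
#!/bin/sh
# Ask the dynamic linker cache whether a cufft shared library is known.
# Prints the matching entries, or a note if nothing matched.
ldconfig -p | grep cufft || echo "cufft not found in linker cache"
```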
Old 2019-11-19, 12:31   #576
kracker

"Mr. Meeseeks"
Jan 2012
California, USA

2³·271 Posts

Quote:
Originally Posted by LaurV
Nice! Thanks for that! As stated before, both the K80 and P100 are beasts for LL/PRP. Using them for TF is a waste. If you are lucky enough to get a T4, use it for TF (on this side of the world, we haven't seen one in ages!)

Did you try two instances on the K80? (If that is a dual-GPU card, the two chips may not communicate so fast with each other, and "in the cloud" may be different from "locally".)
I only have access to one K80 GPU "core" in an instance (so half of the physical card), though if you mean trying two instances of gpuowl on one GPU, I haven't tried that.
Old 2019-11-19, 12:56   #577
bayanne

"Tony Gott"
Aug 2002
Yell, Shetland, UK

2²×83 Posts

I am finding now that exponents that have been trial factored are not being cleared from the 'worktodo' file, and are being tested repeatedly. I really am not sure what I can do to stop this. Is anyone else seeing anything similar?
Old 2019-11-19, 13:44   #578
bayanne

"Tony Gott"
Aug 2002
Yell, Shetland, UK

2²·83 Posts

Quote:
Originally Posted by bayanne
An exponent that had been allocated to me, 97930517, has been completed by someone else as well, and their result has been accepted. No problem for me, except that this result has not been cleared from the results.txt file, wherever that may be held. Thus it keeps appearing in the results for my instance name.

Where is that file, and can I clear this entry from it?
There are now 6 exponents that are stuck in the 'results' file.

How can I clear them out, please?
Old 2019-11-19, 14:37   #579
chalsall
If I May

"Chris Halsall"
Sep 2002
Barbados

10011000000011₂ Posts

Quote:
Originally Posted by bayanne
There are now 6 exponents that are stuck in the 'results' file. How can I clear them out, please?
Could you please PM me an example?
Old 2019-11-19, 15:19   #580
ric

Jul 2004
Milan, Ita

266₈ Posts

Quote:
Originally Posted by ATH
CUDALucas also needs cufft installed:

Code:
!apt-get install -y cuda-cudart-10-0
!apt-get install -y cuda-cufft-dev-10-0
... and the same holds true for CUDAPm1: after adding these two lines to the notebook, everything is fine.
Old 2019-11-19, 15:57   #581
kracker

"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Quote:
Originally Posted by ric
... and the same holds true for CUDAPm1: after adding these two lines to the notebook, everything is fine.
That fixed it for me as well.

Also, I know I sound like an idiot, but how exactly do I use -cufftbench? CUDAPm1 seems to be ignoring it... Never mind, figured it out.

Last fiddled with by kracker on 2019-11-19 at 16:06
Old 2019-11-19, 22:53   #582
ATH
Einyen

Dec 2003
Denmark

3⁵·13 Posts

I did get some error messages that a file or folder was locked when trying to install cudart and cufft in the main script, but you do not have to run the installations separately; adding two delays worked for me:

Code:
(reverse ssh code)
...
!sleep 30
!apt-get install -y cuda-cudart-10-0
!sleep 5
!apt-get install -y cuda-cufft-dev-10-0
...
(starting mprime+cudalucas)
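The fixed sleeps work, but if apt is still busy after 30 seconds they can still race it. A minimal alternative sketch: poll the dpkg lock until it is free before installing (the lock path is the standard Debian/Ubuntu one; whether `fuser` is present on a given Colab image is an assumption):

```shell
#!/bin/sh
# wait_for_apt: block until no process holds the dpkg front-end lock.
# /var/lib/dpkg/lock-frontend is the standard Debian/Ubuntu lock file;
# fuser exits nonzero (ending the loop) once nothing holds it.
wait_for_apt() {
  while fuser /var/lib/dpkg/lock-frontend >/dev/null 2>&1; do
    sleep 2
  done
}

wait_for_apt
# Now it should be safe to install, e.g.:
# apt-get install -y cuda-cudart-10-0 cuda-cufft-dev-10-0
echo "apt lock free"
```

In a notebook cell the same loop can be crammed onto one `!` line with `;` separators.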



Quote:
Originally Posted by kriesel
Please PM me a session capture of the P100 problem. "buffer overflow detected" is not present in the CUDALucas bug and wish list.
I guess you can try using CUDALucas on the K80 and gpuowl on the P100 for now.
I've taken to using the following at the very front of Colab scripts, so I can decide whether to go with what the session got, or try again.
Code:
!lscpu
!nvidia-smi
I have seen CUDALucas run into problems when run locally if the span of -cufftbench or -threadbench is too large: too many FFT lengths for the size of the program's arrays. Threadbench can be run in multiple subranges to avoid that issue.
It is an error from Linux, not from CUDALucas; it never writes anything to my outputcudalucas.txt file:

Code:
*** buffer overflow detected ***: ./CUDALucas terminated
/bin/bash: line 1:  2509 Aborted                 (core dumped) ./CUDALucas >> outputcudalucas.txt
Old 2019-11-20, 20:57   #583
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

31·173 Posts

Quote:
Originally Posted by ATH
It is an error from Linux, not from CUDALucas; it never writes anything to my outputcudalucas.txt file:

Code:
*** buffer overflow detected ***: ./CUDALucas terminated
/bin/bash: line 1:  2509 Aborted                 (core dumped) ./CUDALucas >> outputcudalucas.txt
If you don't specify a path for the output text file, where does it go? I've had scripts fail unless I explicitly use ./whatever.
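For what it's worth, the `>>` target is opened by the shell itself, relative to the shell's current working directory, before the program even starts, so a bare filename lands wherever the notebook cell's shell happens to be. A quick demonstration with a stand-in command instead of CUDALucas (paths here are just for illustration):

```shell
#!/bin/sh
# The shell resolves the redirection target against its own CWD;
# the program being redirected never sees the filename at all.
mkdir -p /tmp/demo
cd /tmp/demo
rm -f outputdemo.txt
echo "hello" >> outputdemo.txt   # bare name: file is created in /tmp/demo
cat /tmp/demo/outputdemo.txt     # prints: hello
```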
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.