mersenneforum.org  

Old 2019-10-16, 02:28   #342
Chuck
 
May 2011
Orange Park, FL

3×5×59 Posts

Quote:
Originally Posted by mnd9 View Post
Sorry to repost about this issue -- but my main Google account is having serious issues using Colab at this point. I can no longer get a session longer than a few hours, and as of this AM, I keep getting disconnected after just a couple of minutes. It's not saying no GPU is available; it allows me to reconnect, then simply drops me after 2-3 minutes.

Is there anything to do regarding this? All I've been doing is running mfaktc.

I have a secondary account that has been running for the past 4-5 days, reconnecting every 12 hours, with no issues.

I continue to have success using two Google accounts and alternating between them every 12 hours (only one running at a time). Would you consider using additional accounts if you want to run more than one at a time?
Old 2019-10-16, 02:28   #343
Dylan14
 
"Dylan"
Mar 2017

1103₈ Posts

Quote:
Originally Posted by Dylan14 View Post
Actually, what I stated here was wrong, as using the newest available version from http://ftp.us.debian.org/debian/dists/testing/ (7.16.1) did not rectify the issue. So, to see what's going on, I switched back to the old code (shown here), which shows all the output. By doing so, the issue appears to be with the Internet connection:


Code:
13-Oct-2019 18:31:57 [---] Project communication failed: attempting access to reference site 

13-Oct-2019 18:32:00 [---] BOINC can't access Internet - check network connection or proxy configuration.
But it is clear that I am able to connect to the Internet and ping on Kaggle:


Code:
--- 8.8.8.8 ping statistics --- 

84 packets transmitted, 84 received, 0% packet loss, time 84956ms 

rtt min/avg/max/mdev = 0.294/0.423/0.989/0.127 ms
so this would suggest Kaggle is using a proxy. I have no idea what the settings would be though.

Returning to this: there is no proxy on Kaggle. The solution was to switch from https:// to http://. I still find it odd that PrimeGrid and NFS@Home still use http for connecting to the server with the BOINC client.
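
For anyone scripting the same workaround, here is a minimal sketch of the scheme switch in Python. The function name `force_http` and the example.org URL are hypothetical; the post does not say which URL was changed or how.

```python
from urllib.parse import urlsplit, urlunsplit


def force_http(url):
    """Return url with an https scheme rewritten to http; other URLs pass through unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return url
    # SplitResult is a namedtuple, so _replace swaps only the scheme
    return urlunsplit(parts._replace(scheme="http"))


# Hypothetical project URL, used only for illustration
print(force_http("https://example.org/project/"))  # http://example.org/project/
```

An explicit port in the URL (for example :443) is left untouched by this sketch and would need separate handling.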
Old 2019-10-16, 13:54   #344
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by Dylan14 View Post
So to correct this, I used the getpass command for the account key, which will hide the key:
Very nice! Thank you!

This also works on Kaggle, so I can use it in my Tunnel scripts, to hide the root password.
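
As a rough illustration of the getpass approach mentioned here, the sketch below prompts for a secret without echoing it. The wrapper name `read_secret`, the prompt text, and the injectable `reader` parameter are assumptions added for the example; only `getpass.getpass` itself comes from the Python standard library.

```python
import getpass


def read_secret(prompt="Account key: ", reader=getpass.getpass):
    """Ask for a secret without echoing it to the notebook output; reject empty input."""
    secret = reader(prompt)
    if not secret:
        raise ValueError("no secret entered")
    return secret


# In a notebook cell one would call, for example:
#   account_key = read_secret()
# and pass account_key to the client instead of hard-coding it in the cell.
```

Keeping the reader injectable makes the helper easy to exercise without an interactive prompt.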
Old 2019-10-16, 20:26   #345
ATH
Einyen
 
Dec 2003
Denmark

2·1,579 Posts

Running both mprime and mfaktc at the same time on Colab seems to work fine.

This assumes you have an "mfaktc" folder with mfaktc_colab.exe on Google Drive and an "mprime" folder with all the correct settings in prime.txt and local.txt.

If you do not want output files, you can change
!./mprime -d >> outputmprime.txt 2>&1 &
/usr/local/bin/mfaktc_colab.exe >> outputmfaktc.txt

to

!./mprime -d > /dev/null 2>&1 &
/usr/local/bin/mfaktc_colab.exe > /dev/null


It seems to be only 1 CPU core, probably 2 threads, getting 2.88 ms/iter on a 9.65M PRP CF, so just under 8 hours for that. It might be better to do PRP CF DC or ECM curves on it.
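
The "just under 8 hours" figure can be checked with a couple of lines of Python. This sketch assumes the test needs roughly one iteration per bit of the exponent, so about 9.65 million iterations; the function name `eta_hours` is made up for the example.

```python
def eta_hours(iterations, ms_per_iter):
    """Estimated wall-clock hours for a run of `iterations` at `ms_per_iter` milliseconds each."""
    return iterations * ms_per_iter / 1000.0 / 3600.0


# 9.65M iterations at 2.88 ms/iter (figures from the post)
print(round(eta_hours(9_650_000, 2.88), 2))  # 7.72
```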


Code:
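#@title
import os.path
from google.colab import drive

# Mount Google Drive unless it is already mounted
if not os.path.exists('/content/drive/My Drive'):
  drive.mount('/content/drive')
  
# Fix permissions on the mprime folder and binary, then start mprime in the background
%cd '/content/drive/My Drive'
!chmod 700 mprime
%cd '/content/drive/My Drive/mprime'
!chmod 700 mprime
!./mprime -d >> outputmprime.txt 2>&1 &  

%cd '/content/drive/My Drive/mfaktc/'

# Copy mfaktc to the local disk and make it executable
!cp 'mfaktc_colab.exe' /usr/local/bin/
!chmod 755 '/usr/local/bin/mfaktc_colab.exe'

# Run mfaktc in the foreground from the Drive folder, appending its output to a log
!cd '.' && LD_LIBRARY_PATH="lib:${LD_LIBRARY_PATH}" /usr/local/bin/mfaktc_colab.exe >> outputmfaktc.txt

!cat 'results.txt'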
Old 2019-10-17, 02:03   #346
ATH
Einyen
 
Dec 2003
Denmark

2×1,579 Posts

I made a simple script running mfaktc on Kaggle, but how do you get the results out?

I planned to run mfaktc as a background process with:
!./mfaktc > /dev/null 2>&1 &

and then run a loop which uploaded the results every 15 or 30 minutes, but Kaggle does not allow background processes.
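
One way to sketch the "worker plus periodic upload loop" plan without shell backgrounding is to start the worker from Python and poll it. This is only an illustration under assumptions: the helper name `run_with_periodic_task` is hypothetical, the upload step is left as a caller-supplied function, and nothing here has been verified on Kaggle.

```python
import subprocess
import sys
import time


def run_with_periodic_task(cmd, interval_s, task, poll_s=0.05):
    """Start cmd, call task() every interval_s seconds while it runs, and once more at the end.

    Returns (exit_code, number_of_task_calls).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    calls = 0
    next_run = time.monotonic() + interval_s
    while proc.poll() is None:  # worker still running
        if time.monotonic() >= next_run:
            task()
            calls += 1
            next_run += interval_s
        time.sleep(poll_s)
    task()  # final call so the last results are not lost
    calls += 1
    return proc.returncode, calls


if __name__ == "__main__":
    # Stand-in worker: a short Python sleep instead of ./mfaktc
    worker = [sys.executable, "-c", "import time; time.sleep(0.3)"]
    code, calls = run_with_periodic_task(worker, 0.1, lambda: print("upload results here"))
    print(code, calls)
```

For the plan described in the post, `cmd` would be `["./mfaktc"]` and `interval_s` would be 900 or 1800.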

Last fiddled with by ATH on 2019-10-17 at 02:07
Old 2019-10-17, 02:59   #347
Dylan14
 
"Dylan"
Mar 2017

3×193 Posts

After a bit of trial and error (plus some work in automating the checking of the logs for errors), I have a script that can be used to compile and run Mlucas, without needing to tunnel in. The script is below:


Code:
#code to compile and run Ernst Mayer's mlucas
import os
from google.colab import drive
if not os.path.exists('/content/drive/My Drive'):
  drive.mount('/content/drive')

!apt-get install gcc-8
%cd '/content/drive/My Drive/'
if not os.path.exists('/content/drive/My Drive/mlucas/'):
  !mkdir mlucas
%cd '/content/drive/My Drive/mlucas/'
if not os.path.exists('mlucas_v18.txz'):
  !wget https://www.mersenneforum.org/mayer/src/C/mlucas_v18.txz
  !tar -xJf mlucas_v18.txz
  
#switch to the mlucas source directory
#check that we have both executables (one for avx2, and one for avx512)
if not os.path.exists('/content/drive/My Drive/mlucas/mlucas_v18/Mlucasavx512') or not os.path.exists('/content/drive/My Drive/mlucas/mlucas_v18/Mlucasavx2'):
  %cd '/content/drive/My Drive/mlucas/mlucas_v18/src'
  #we build mlucas twice. Once with avx2, and once with avx512.
  #first, the avx512 build:
  !gcc-8 -c -O3 -DUSE_AVX512 -march=skylake-avx512 -DUSE_THREADS *.c >& build1.log
  !grep error build1.log > erroravx512.log
  if os.stat("erroravx512.log").st_size == 0: #grep came up empty
    !gcc-8 -o Mlucasavx512 *.o -lm -lpthread -lrt
  else: #something went wrong
    !echo "Error in compilation. Check build1.log and tell either Dylan14 (if you think Dylan made a mistake) or ewmayer."
    exit()
  #move Mlucasavx512 up a directory and clean up the src directory
  !mv Mlucasavx512 ..
  !rm *.o
  #now build the avx2 executable
  !gcc-8 -c -O3 -DUSE_AVX2 -mavx2 -DUSE_THREADS *.c >& build2.log
  !grep error build2.log > erroravx2.log
  if os.stat("erroravx2.log").st_size == 0: #grep came up empty
    !gcc-8 -o Mlucasavx2 *.o -lm -lpthread -lrt
  else: #something went wrong
    !echo "Error in compilation. Check build2.log and tell either Dylan14 (if you think Dylan made a mistake) or ewmayer."
    exit()
  #move Mlucasavx2 up a directory and clean up the src directory
  !mv Mlucasavx2 ..
  !rm *.o

#now we check the processor we have
!echo "Checking processor so we can choose the right executable..."
%cd '/content/drive/My Drive/mlucas/mlucas_v18/'
#by default the permissions do not allow the Mlucas executables to run
!chmod 755 Mlucasavx512
!chmod 755 Mlucasavx2
!grep avx512 /proc/cpuinfo > avx512.txt
!grep avx2 /proc/cpuinfo > avx2.txt
if os.stat("avx512.txt").st_size != 0: #avx512 is available
  !echo "AVX512 detected..."
  #test executable
  !./Mlucasavx512 -fftlen 192 -iters 100 -radset 0
  #performance tune with 2 threads (takes about 10 minutes)
  !./Mlucasavx512 -s m -cpu 0:1 >& selftest.log
  #to do: add code for managing worktodo.txt
  #then run Mlucas
  #!./Mlucasavx512
elif os.stat("avx2.txt").st_size != 0: #avx2 is available
  !echo "AVX2 detected..."
  #test executable
  !./Mlucasavx2 -fftlen 192 -iters 100 -radset 0
  #performance tune with 2 threads (takes about 10 minutes)
  !./Mlucasavx2 -s m -cpu 0:1 >& selftest.log
  #to do: add code for managing worktodo.txt
  #then run Mlucas
  #!./Mlucasavx2
else: #we have some other processor, which I think is fairly unlikely
  !echo "Strange. We don't have avx2 or avx512."
  exit()
Also attached is what the selftest.log and mlucas.cfg files should look like. At the current first-time wavefront (length 5120 kdoubles, or FFT length 5M), one iteration takes about 50 ms on Colab with AVX512 instructions.
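
The processor check in the script above writes grep output to temporary files and tests their size. A possible alternative, sketched here under assumptions, is to parse the "flags" lines of /proc/cpuinfo directly in Python. The function names are hypothetical, and treating any flag that starts with "avx512" as sufficient simply mirrors the grep in the script; it is not a statement about which AVX-512 subsets Mlucas needs.

```python
def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed on the 'flags' lines of /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def pick_mlucas_binary(cpuinfo_text):
    """Return the executable name matching the best available instruction set, or None."""
    flags = cpu_flags(cpuinfo_text)
    if any(flag.startswith("avx512") for flag in flags):
        return "Mlucasavx512"
    if "avx2" in flags:
        return "Mlucasavx2"
    return None


# On a Linux host such as Colab one would call:
#   with open("/proc/cpuinfo") as fh:
#       binary = pick_mlucas_binary(fh.read())
```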
Attached Files
File Type: log selftest.log (63.2 KB, 66 views)
File Type: txt mlucas.cfg.txt (1.8 KB, 70 views)
Old 2019-10-17, 03:00   #348
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by ATH View Post
...but kaggle does not allow background processes.

Sure it does. Heck, I've had cron jobs running on it (hint: place an appropriately created crontab file in /etc/cron.d/ after installing and starting cron). Something like:
Code:
root@Colab_MAH:~/prime# ls -la /etc/cron.d/ 
total 20
drwxr-xr-x 2 root root 4096 Oct 17 02:58 .
drwxr-xr-x 1 root root 4096 Oct 16 22:45 ..
-rw-r--r-- 1 root root  102 Nov 16  2017 .placeholder
-rw-r--r-- 1 root root  112 Oct 17 02:58 iroot

root@Colab_MAH:~/prime# cat /etc/cron.d/iroot 
# Run system stats collection script every minute.
* * * * * root /root/bin/telemetry.pl >/dev/null 2>/dev/null
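
A small sketch of generating such a file from Python follows. The helper names are hypothetical, and the target directory is a parameter so the code can be tried in a scratch folder; writing to the real /etc/cron.d/ needs root and a running cron, as noted in the post. Unlike a user crontab, entries in /etc/cron.d/ carry a user field between the schedule and the command, which is what the first helper formats.

```python
import os


def cron_d_line(schedule, user, command):
    """Format one /etc/cron.d entry: five-field schedule, user, then the command."""
    return f"{schedule} {user} {command}"


def write_cron_d(directory, name, lines):
    """Write the entries to directory/name with mode 0644 and return the path."""
    path = os.path.join(directory, name)
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")  # cron expects a trailing newline
    os.chmod(path, 0o644)
    return path


entry = cron_d_line("* * * * *", "root", "/root/bin/telemetry.pl >/dev/null 2>/dev/null")
print(entry)  # * * * * * root /root/bin/telemetry.pl >/dev/null 2>/dev/null
```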
Old 2019-10-17, 03:24   #349
axn
 
Jun 2003

2×3×7×11² Posts

Quote:
Originally Posted by ATH View Post
I made a simple script running mfaktc on Kaggle, but how do you get the results out?

I planned to run mfaktc as a background process with:
!./mfaktc > /dev/null 2>&1 &

and then run a loop which uploaded the results every 15 or 30 minutes, but Kaggle does not allow background processes.

When you commit, it will run the job as a batch, and there will be a Version. Later on, you can come back and look at the version, and you'll get an Output link, which lists all the files in the Kaggle base directory, including results.txt.
Old 2019-10-17, 03:31   #350
ATH
Einyen
 
Dec 2003
Denmark

2·1,579 Posts

Ok, thanks. I came up with a loop that uses the "head" command to create a worktodo.txt with x lines from another file; after finishing those, it uploads the results and creates another worktodo.txt with "head".
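
A minimal Python sketch of that batching idea is below. It is an approximation, not the script from the post: the name `take_batch` is hypothetical, and removing the taken lines from the source file is an assumption made here so that the next call yields the next batch.

```python
def take_batch(queue_path, worktodo_path, n):
    """Move the first n lines of queue_path into worktodo_path; return how many were moved."""
    with open(queue_path) as fh:
        lines = fh.readlines()
    batch, rest = lines[:n], lines[n:]
    with open(worktodo_path, "w") as fh:
        fh.writelines(batch)
    # Keep only the untaken lines so the next call picks up where this one stopped
    with open(queue_path, "w") as fh:
        fh.writelines(rest)
    return len(batch)
```

A driver loop would call `take_batch(...)`, run the worker until worktodo.txt is finished, upload the results, and stop once the function returns 0.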

I will look into the reverse SSH tomorrow maybe.
Old 2019-10-17, 17:46   #351
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by Dylan14 View Post
The script is below:
If I may please say, excellent work sir!

If I may please also say, however, that I'm always amused when people say that Perl is difficult to read...
Old 2019-10-17, 18:52   #352
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1010100101100₂ Posts

Quote:
Originally Posted by chalsall View Post
If I may please say, excellent work sir!

If I may please also say, however, that I'm always amused when people say that Perl is difficult to read...
1) Yes!
2) Depends on the author / style. Except pattern matches can be um, puzzling sometimes, in my opinion.

And the rapid progress of this collaborative effort has been something to see.

Last fiddled with by kriesel on 2019-10-17 at 18:55
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.