Checking of Collatz problem / conjecture (https://www.mersenneforum.org/showthread.php?t=24661)

 dabler 2019-09-04 16:14

Distributed computing project

I decided to start a distributed computing project. The aim is to raise the threshold below which convergence of the Collatz problem has been verified, specifically from 87 × 2^60 to 88 × 2^60. I keep checksums of each verified block, so my results can be verified. All source code is [URL="https://github.com/xbarin02/collatz"]open[/URL], and all compute nodes are already running. The compute nodes belong to a cluster available at our university and to [URL="https://www.it4i.cz/"]IT4I[/URL]. For those who are interested, the progress can be tracked [URL="http://pcbarina2.fit.vutbr.cz/~ibarina/cgi-bin/status.sh"]here[/URL].
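As a rough illustration of what verifying a block with a checksum can look like, here is a minimal Python sketch. It assumes the common shortcut of stopping as soon as a trajectory drops below its starting value (all smaller numbers having been checked already), and it uses the total step count as a stand-in checksum. The function names and the checksum definition are assumptions made for illustration; the scheme the project actually uses is in the linked repository.

```python
def steps_until_below(n):
    """Count Collatz steps until the trajectory of n first drops below n."""
    x = n
    steps = 0
    while x >= n:
        # Odd: 3x + 1; even: x / 2.
        x = 3 * x + 1 if x % 2 else x // 2
        steps += 1
    return steps


def verify_block(start, count):
    """Verify every n in [start, start + count); return a step-count checksum.

    The checksum here (sum of step counts) is a hypothetical choice.
    """
    if start < 2:
        # n = 1 cycles 1 -> 4 -> 2 -> 1 and never drops below itself.
        raise ValueError("start must be at least 2")
    checksum = 0
    for n in range(start, start + count):
        checksum += steps_until_below(n)
    return checksum
```

Two independent runs over the same block should produce the same checksum, which is what makes a result cheap to cross-check.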

 Dylan14 2019-09-04 21:40

I’ve managed to compile the code on an Ubuntu 19.04 VM (it doesn’t compile on 64-bit Windows with mingw64), and I have attached my computer to the project. My question is how large are the work units? When I run ./client I get assignments 91231623-91231628, and you are currently searching the superblock 88*2^60?

 Dylan14 2019-09-05 00:46

[QUOTE=Dylan14;525195]My question is how large are the work units?[/QUOTE]

My machine just turned in its first results for the project. Running 6 tasks at once, it takes about 2h 22m to do the tasks. If I know the size of the tasks, I can compute the rate of the calculation.
FYI, I am using an Intel i7-8750H to run this.

 dabler 2019-09-05 17:01

Hi Dylan,

The code requires 64-bit long int type, and GCC's __int128 extension. It should compile fine on 64-bit Linux machines. The size of a single work unit is 2^40 numbers (currently somewhere between 87 × 2^60 and 88 × 2^60).
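To relate assignment numbers to the range being searched, one plausible reading (an assumption, not confirmed in this thread) is that assignment k covers the 2^40 numbers starting at k × 2^40. Under that assumption, the assignment number 91231623 mentioned earlier in the thread falls just above 87 × 2^60. The sketch below checks this and also prints how many bits such a starting value needs.

```python
WORK_UNIT_BITS = 40  # each work unit covers 2^40 numbers (stated in the thread)


def assignment_range(k):
    """Return (first, last) number covered by assignment k.

    Assumption: assignment k starts at k * 2^40. This mapping is a guess
    that happens to be consistent with the numbers quoted in the thread.
    """
    first = k << WORK_UNIT_BITS
    last = first + (1 << WORK_UNIT_BITS) - 1
    return first, last


first, last = assignment_range(91231623)
lower = 87 << 60
upper = 88 << 60
print(lower <= first and last < upper)  # is the unit inside the target range?
print(first.bit_length())               # bits needed to store the start value
print(first / (1 << 60))                # position in units of 2^60
```

A starting value in this range needs more than 64 bits, which is consistent with the code relying on GCC's __int128 extension.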

Today I faced some technical issues. The master server could not handle the number of connected clients (tens of thousands), so I had to rewrite the communication protocol to distribute the assignments in batches. Everything is done now, and the computation nodes are running again. The current progress can be tracked at: [url]http://pcbarina2.fit.vutbr.cz/~ibarina/cgi-bin/status.sh[/url]

If you want to join the project, please use "mclient" instead of the older "client". The mclient communicates over a single TCP/IP connection, which saves server resources.

A single work unit (2^40 numbers) usually takes between one and five hours, depending on the machine.

 dabler 2019-09-05 17:13

For the curious, most clients run on [URL="https://docs.it4i.cz/salomon/hardware-overview/"]this computing cluster[/URL], so tens of thousands of TCP/IP connections soon became a bottleneck.

 Dylan14 2019-09-05 19:00

[quote=dabler;525255]The size of a single work unit is 2^40 numbers (currently somewhere between 87 × 2^60 and 88 × 2^60).[/quote]

With that in mind, I get a throughput of about 7.74*10^8 numbers per second on 6 threads of an i7-8750H. I will also recompile the code and use mclient instead.
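The throughput figure follows from the numbers already given in the thread: 6 tasks of 2^40 numbers each, finished in about 2h 22m. A short sketch of the arithmetic (the function name is just for illustration):

```python
def throughput(tasks, numbers_per_task, seconds):
    """Numbers checked per second across all tasks run in parallel."""
    return tasks * numbers_per_task / seconds


elapsed = 2 * 3600 + 22 * 60          # 2h 22m in seconds
rate = throughput(6, 2**40, elapsed)  # 6 tasks of 2^40 numbers each
print(f"{rate:.3e} numbers per second")
```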
A suggestion (more relevant if other people get interested in running this than if it stays a long-running job on your cluster): after the client has processed, say, 2^35 numbers (about 3% of the work unit), it could write a checkpoint file. If the user needs to interrupt the computation, they can launch the client again, and it will resume from the number recorded in the checkpoint file.
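A minimal sketch of how such checkpointing could work, written in Python for brevity. The file format (a single integer), the function names, and the `check` callback are all hypothetical; none of this is taken from the project's client. With an interval of 2^35, a 2^40 work unit would be checkpointed 32 times.

```python
import os


def load_checkpoint(path, start):
    """Return the next number to check, or `start` if no checkpoint exists."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return start


def save_checkpoint(path, next_n):
    """Write to a temporary file, then rename it over the old checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(next_n))
    os.replace(tmp, path)  # avoids leaving a half-written checkpoint behind


def run_work_unit(start, count, path, check, interval=2**35):
    """Check [start, start + count), saving progress every `interval` numbers."""
    end = start + count
    n = load_checkpoint(path, start)
    while n < end:
        stop = min(n + interval, end)
        for m in range(n, stop):
            check(m)
        n = stop
        save_checkpoint(path, n)
    if os.path.exists(path):
        os.remove(path)  # work unit finished; checkpoint no longer needed
```

After an interruption, calling `run_work_unit` again with the same arguments resumes from the last saved position, so at most one interval of work is repeated. A real client would also have to store any partial result, such as a running checksum, alongside the position.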

 dabler 2019-09-06 12:16

Good point for an improvement. However, when I chose this work unit size (2^40), I estimated that modern computers would finish it in about one hour.

One more question: when you said "Running 6 tasks at once, it takes about 2h 22m to do the tasks", you meant running 6 tasks on 6 CPU cores, not on a single core (that would be extremely fast), right?

 Dylan14 2019-09-06 12:31

By that statement, I mean 6 cores, each working on 1 task, not a single core running 6 tasks.

 dabler 2019-09-08 07:35

Windows build

According to [URL="https://stackoverflow.com/questions/7607502/sizeoflong-in-64-bit-c/39207744#39207744"]this post[/URL], you should be able to compile the code for 64-bit Windows using Cygwin x86_64. Mingw-w64 does not work as expected. Would you be willing to try it?
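The difference comes down to data models: 64-bit Windows toolchains such as Mingw-w64 follow LLP64, where long is 32 bits, while 64-bit Linux and Cygwin x86_64 follow LP64, where long is 64 bits. One quick way to see which model an environment uses is sketched below with Python's standard ctypes module. The result reflects the C compiler the Python interpreter itself was built with, so it is only an indicator for that environment.

```python
import ctypes


def long_bits():
    """Width in bits of the C `long` type as seen by this Python build."""
    return ctypes.sizeof(ctypes.c_long) * 8


bits = long_bits()
print(f"C long is {bits} bits")
if bits < 64:
    print("long is narrower than 64 bits; code assuming a 64-bit long will not work as expected")
```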

 Dylan14 2019-09-08 12:51

I was able to get the Collatz code to compile under Windows 10 64 bit using Cygwin.

 dabler 2019-09-08 13:09

Cool! And does the client work properly? That is, does it connect to the server, compute its assignment, and return the result to the server correctly?

All times are UTC.