Making a distributed computing network in Python

Question

I have a huge amount of data to process, and I'm using every machine I can get: my parents' computers, my girlfriend's computer, my own computers, and my brothers' computers.

They are OK with lending me some of their processing power, and the processing program only uses 1 of the 4 cores on each computer. I'll set up something that launches the slaves when their computers start up.

I coded this "distributed computing program" myself. I just learned about sockets from Google, and I want to make sure that I'm not making a big mistake.

From what I understand, a socket is one-way only: A can only send data to B, and if B needs to send data to A, then another socket on another port needs to be opened.

The "distributor" is the program that orchestrates the computing. It sends data to crunch to all the slaves, and it runs on a cheap dedicated server.
The "slaves" request data from the distributor, compute, store the result, then ask for more data to crunch.

The "distributor" has a registration_port_distributor: 15555
The "slaves" have a registration_port_slave: 14444 (yes, the same for every slave)
work_port = registration_port_distributor + 1

the distributor boots
start of the loop
    wait for a slave connection
    a slave connects to port 15555 (registration_port_distributor) and tells the distributor "I am 'slave_name', give me 2 ports to work on, via my port 14444 (registration_port_slave)"
    the distributor connects to the slave on port 'registration_port_slave' and gives it "work_port" (data_reception_port) for receiving data and work_port+1 (data_request_port) so that the slave can request new data to crunch
    work_port is incremented by 2

From this point, a slave can receive data to process over a connection on 'data_reception_port', and it can ask for new data to crunch over a connection on 'data_request_port'.

The only problem I can see here is if 2 slaves try to connect at the same time, but that is easily fixed with a while loop on each slave that sleeps 5 seconds before reattempting the connection.

What do you think?

Thanks.

PS: yes, the slaves do not send back the results; I will collect them manually or implement that later.

PPS: the code will be uploaded to my GitHub later. It is a mess right now, as I am testing various things.

Answer 1

"From what I understand, a socket is one-way only: A can only send data to B, and if B needs to send data to A, then another socket on another port needs to be opened."

As several people have already mentioned in the comments, a TCP socket is bidirectional, and you can use the same socket for two-way communication. The application has to be coded so that both sides understand each other, i.e. they agree on a message format.
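To illustrate the point, here is a minimal sketch of a request and a reply travelling over one TCP connection. The function name serve_once, the loopback address, and the use of port 0 (so the OS picks a free port) are choices made for this demo, not something taken from the question's code.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one connection, read a request, and reply on the same socket."""
    conn, _addr = server_sock.accept()
    with conn:
        # Data flowing client -> server. A single recv() is enough for this
        # tiny demo message; real code should frame messages properly.
        request = conn.recv(1024)
        # Data flowing server -> client, over the very same connection.
        conn.sendall(b"reply to " + request)


# Port 0 asks the OS for any free port (demo assumption).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen()
port = server.getsockname()[1]

server_thread = threading.Thread(target=serve_once, args=(server,))
server_thread.start()

with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello")
    # Read until the server closes its side of the connection.
    chunks = []
    while True:
        chunk = client.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)
    answer = b"".join(chunks)

server_thread.join()
server.close()
print(answer)  # b'reply to hello'
```

Only one port is opened here, yet both sides send and receive.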

"From this point, a slave can receive data to process over a connection on 'data_reception_port', and it can ask for new data to crunch over a connection on 'data_request_port'."

Once you change your application model as explained above, you no longer need two separate ports/connections on each side.
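As a rough sketch of what such a single-connection model could look like: each worker (the question's "slave") connects once to the distributor, sends a "GET" line whenever it wants more data, and reads the work item back on the same connection. The newline-delimited "GET"/"DONE" protocol, the function names, and the placeholder work items are all hypothetical and only serve the illustration.

```python
import socket
import threading


def handle_worker(conn, pending, lock):
    """Answer each 'GET' line with one work item, or 'DONE' when none is left."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        for line in reader:
            if line.strip() != "GET":
                break
            with lock:  # several handler threads share the pending list
                item = pending.pop() if pending else "DONE"
            conn.sendall((item + "\n").encode("utf-8"))
            if item == "DONE":
                break


def distributor(server_sock, n_workers, pending, lock):
    """Accept n_workers connections and serve each one in its own thread."""
    handlers = []
    for _ in range(n_workers):
        conn, _addr = server_sock.accept()
        t = threading.Thread(target=handle_worker, args=(conn, pending, lock))
        t.start()
        handlers.append(t)
    for t in handlers:
        t.join()


def worker(port, name, results):
    """Request work over one connection until the distributor says 'DONE'."""
    with socket.create_connection(("127.0.0.1", port)) as sock, \
            sock.makefile("r", encoding="utf-8") as reader:
        while True:
            sock.sendall(b"GET\n")
            item = reader.readline().strip()
            if not item or item == "DONE":
                break
            results.append((name, item.upper()))  # stand-in for real crunching


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port (demo only)
server.listen()
port = server.getsockname()[1]

pending = ["chunk-1", "chunk-2", "chunk-3", "chunk-4"]  # placeholder work items
lock = threading.Lock()
results = []

dist_thread = threading.Thread(target=distributor, args=(server, 2, pending, lock))
dist_thread.start()
workers = [
    threading.Thread(target=worker, args=(port, f"worker-{i}", results))
    for i in range(2)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
dist_thread.join()
server.close()

print(sorted(item for _name, item in results))
```

With this shape there is a single well-known port on the distributor, and the workers need no listening port at all, which also helps when they sit behind home routers.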

"The only problem I can see here is if 2 slaves try to connect at the same time, but that is easily fixed with a while loop on each slave that sleeps 5 seconds before reattempting the connection."

Please read about the backlog in socket communication. If more connection requests arrive than can be served at the moment, they are queued (how many requests can wait in the queue depends on the backlog parameter). Check the documentation of the socket.listen([backlog]) function for more information.
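A small sketch of the queueing behaviour, assuming a loopback server and three demo clients (the names and the backlog value of 5 are arbitrary choices for illustration): all three clients connect before the server ever calls accept(), and each is still served afterwards without any retry loop.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port (demo only)
server.listen(5)               # allow a few connections to wait in the queue
port = server.getsockname()[1]

# All clients connect and send before the server has accepted anyone.
# The OS completes the handshakes and keeps the connections queued.
clients = [socket.create_connection(("127.0.0.1", port)) for _ in range(3)]
for i, client in enumerate(clients):
    client.sendall(f"client-{i}\n".encode("utf-8"))

# Now the server works through the queue one connection at a time.
served = []
for _ in clients:
    conn, _addr = server.accept()
    with conn:
        served.append(conn.recv(1024).decode("utf-8").strip())

for client in clients:
    client.close()
server.close()

print(sorted(served))  # ['client-0', 'client-1', 'client-2']
```

How the backlog value is interpreted, and what happens once the queue is full, depends on the operating system, so a retry on the connecting side can still be a sensible safety net for a server that is down or overloaded.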

I hope this answers your questions. Please feel free to ask further if anything is unclear.

1 answer

solution1
1 ACCEPTED 2018-02-13 19:25:15
