
How to speed up Python networking?

I am finding Python networking slow.

I have a server (written in C). I tested it with my client (Python) and could only reach 2MB/s. That worried me, so I checked the raw link:

host1 (client): cat some_big_file | nc host2 9999

host2 (server): nc -l 0.0.0.0 9999 | pv > /dev/null

I reached something around 120MB/s (roughly the limit of the 1Gb/s link).

The server is not a bottleneck; we use it in production and it can handle more. But to be sure, I copied a simple Python gevent server for tests. It looks like this:

  #!/usr/bin/env python
  from gevent.server import StreamServer
  from gevent.pool import Pool

  def handle(socket, address):
      while True:
          print socket.recv(1024)

  pool = Pool(20000)
  server = StreamServer(('0.0.0.0', 9999), handle, spawn=pool)
  server.serve_forever()

The next measurement is sending from nc (host1) to the gevent server (host2).

host1: cat some_big_file | nc host2 9999

host2: ./gserver.py | pv > /dev/null

The output on host2: [101MB/s]. Not bad.

But still, when I use my Python client, it's slow. I switched the client to gevent and tried several greenlet counts: 1, 10, 100, 1000. It didn't help much: I could reach 20MB/s with one Python process, or ~30MB/s with 2, 3, 4, or 5 separate Python processes. That's something, but still not good. So I rewrote the client to be as dumb as possible:

#!/usr/bin/env python
import sys
import socket

c = socket.create_connection((sys.argv[1], sys.argv[2]))
while 1:
        c.send('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\n')

With this approach I could reach 10MB/s. I also tried reading the whole 2GB file into memory first and then sending it, with a similar result.

I also tried running the Python scripts as separate processes (using tmux). With 1 process I could reach 10MB/s, with 2 processes 20MB/s, with 3 processes 23MB/s; 4, 5, or 6 processes didn't change anything (tested with both the gevent version and the simple one).

Details: Python 2.7.3, Debian 7 (standard installation). The machines are AWS instances: the client is c1.medium and the server is c3.xlarge. nc and iperf measured 1Gb/s between the machines.

Questions:

  1. Why can I receive a lot of data quickly with a Python (gevent) server, but not send at the same speed, even though a C program can?
  2. Why does doubling the number of processes not scale sending speed up to the link limit, but only to some value?
  3. Is there any way to send data fast in Python using sockets?

The problem is not really that the networking is slow: Python function calls have a lot of overhead. If you call connection.send many times, you waste a lot of CPU time on those calls rather than on actual I/O.

On my computer, your program averages about 35 MB/s. With a simple modification, I get 450 MB/s:

#...
c.send('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'*10+'\n')

I could reach speeds over 1GB/s by sending even more data at once.
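The same principle applies to the file-sending case from the question: read and send in large blocks instead of line by line. A minimal sketch (Python 3 syntax rather than the question's Python 2; `send_file` and `BLOCK` are illustrative names of my own, not a standard API):

```python
import socket

BLOCK = 256 * 1024  # bytes per send call; an illustrative size, tune as needed


def send_file(sock, fileobj, block=BLOCK):
    """Stream a file-like object over a socket in large blocks.

    Returns the total number of bytes sent.
    """
    total = 0
    while True:
        data = fileobj.read(block)
        if not data:  # EOF
            break
        sock.sendall(data)  # sendall retries internally until the block is written
        total += len(data)
    return total
```

On Python 3.5+ there is also `socket.socket.sendfile(fileobj)`, which can use the OS-level zero-copy path where available and avoids the Python loop entirely.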

If you want to maximize throughput, send as much data as possible in each call to send. A simple way to do this is to concatenate several strings and send the combined result. If you do, remember that Python strings are immutable, so repeated concatenation of large strings is slow; use a bytearray instead.
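A minimal sketch of that batching idea (Python 3 syntax; `buffered_send` and `chunk_size` are names of my own, and the flush threshold is arbitrary): accumulate small payloads in a bytearray and flush it with one `sendall` once it grows past the threshold.

```python
import socket

CHUNK_SIZE = 64 * 1024  # flush threshold in bytes; tune to taste


def buffered_send(sock, payloads, chunk_size=CHUNK_SIZE):
    """Batch many small byte payloads into large sendall calls."""
    buf = bytearray()
    for p in payloads:
        buf += p                  # appending to a bytearray is cheap (amortized O(1))
        if len(buf) >= chunk_size:
            sock.sendall(buf)     # one syscall covers many logical messages
            del buf[:]            # empty the buffer in place
    if buf:
        sock.sendall(buf)         # flush whatever is left
```

This trades a little latency (messages sit in the buffer until it fills) for far fewer send calls, which is exactly where the CPU time was going.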


 