Uwsgi with gevent vs threads

Question

My project makes a lot of network I/O requests. The main data is stored in other projects, and access is provided through web APIs (JSON/XML) that we poll. We call these APIs for each new user session to get information about the user, and sometimes we have problems waiting for a response. We use nginx + uwsgi + Django. As you know, Django is synchronous (blocking), so we run uwsgi with multithreading to work around the network I/O waits. I decided to read about gevent. I understand the difference between cooperative and preemptive multitasking, and I hoped that gevent would be a better solution than uwsgi threads for this problem (a network I/O bottleneck). But the results were almost identical, and sometimes gevent performed worse. Maybe I'm doing something wrong. Please tell me.

Here are the uwsgi configurations. Gevent:

$ uwsgi --http :8001 --module ugtest.wsgi --gevent 40 --gevent-monkey-patch

Threading:

$ uwsgi --http :8001 --module ugtest.wsgi --enable-threads --threads 40

Controller example:

import httplib
from urlparse import urlparse

from django.http import JsonResponse


def simple_test_action(request):
    # Fetch data from the API without parsing it (simple I/O test only)
    data = _get_data_by_url(API_URL)
    return JsonResponse(data, safe=False)


def _get_data_by_url(url):
    u = urlparse(url)
    # Pick the connection class that matches the URL scheme
    if u.scheme.lower() == 'https':
        conn = httplib.HTTPSConnection(u.netloc)
    else:
        conn = httplib.HTTPConnection(u.netloc)
    path_with_params = '%s?%s' % (u.path, u.query)
    conn.request("GET", path_with_params)
    resp = conn.getresponse()
    print resp.status, resp.reason
    body = resp.read()
    return body

Test (with geventhttpclient):

from datetime import datetime as dt

import gevent
from geventhttpclient import HTTPClient
from geventhttpclient.url import URL


def get_info(i):
    url = URL('http://localhost:8001/simpletestaction/')
    http = HTTPClient.from_url(url, concurrency=100, connection_timeout=60, network_timeout=60)
    try:
        response = http.get(url.request_uri)
        s = response.status_code
        body = response.read()
    finally:
        http.close()


dt_start = dt.now()
print 'Start: %s' % dt_start

# Fire 401 concurrent requests and wait for all of them to finish
threads = [gevent.spawn(get_info, i) for i in xrange(401)]
gevent.joinall(threads)
dt_end = dt.now()

print 'End: %s' % dt_end
print dt_end - dt_start

In both cases I get similar times. What are the advantages of gevent/greenlets and cooperative multitasking for this kind of problem (API proxying)?

Answer 1

A concurrency of 40 is not high enough to let gevent shine. Gevent is about concurrency, not parallelism (or per-request performance), so such a "low" level of concurrency is not a good way to see improvements.

Generally you will see gevent run with concurrency in the thousands, not 40 :)
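
To make "thousands" concrete, here is a minimal sketch of cooperative concurrency on a single thread. It uses the standard library's asyncio as a stand-in for gevent (not gevent itself), and `asyncio.sleep` as a stand-in for a network round trip. The names `fake_api_call`, `run_many` and `measure` are hypothetical and not part of the question's code.

```python
import asyncio
import time


async def fake_api_call(delay):
    # Stand-in for a network round trip. asyncio.sleep hands control back
    # to the event loop, roughly as a monkey-patched socket call would
    # hand control back to the gevent hub.
    await asyncio.sleep(delay)
    return 1


async def run_many(n, delay):
    # Start n waits at once on one thread and collect the results.
    results = await asyncio.gather(*(fake_api_call(delay) for _ in range(n)))
    return sum(results)


def measure(n=2000, delay=0.1):
    # Returns (number of completed calls, wall-clock seconds).
    start = time.perf_counter()
    done = asyncio.run(run_many(n, delay))
    return done, time.perf_counter() - start
```

Under these assumptions, `measure(2000, 0.1)` should finish in far less than the 200 seconds the same waits would take one after another, without creating 2000 OS threads.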

For blocking I/O, Python threads are not bad (the GIL is released during I/O). The advantage of gevent is in resource usage (having 1000 Python threads would be overkill) and in removing the need to think about locking and friends.
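
The point about threads can be checked with a small sketch. It assumes `time.sleep` is a fair stand-in for a blocking network call (both release the GIL while waiting); `fake_blocking_call` and `run_with_threads` are hypothetical names, and 40 threads mirrors the `--threads 40` setting from the question.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_blocking_call(delay):
    # Blocking wait; the GIL is released while the thread sleeps,
    # so other threads can run in the meantime.
    time.sleep(delay)
    return 1


def run_with_threads(n_tasks, n_threads, delay):
    # Returns (number of completed calls, wall-clock seconds).
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        done = sum(pool.map(fake_blocking_call, [delay] * n_tasks))
    return done, time.perf_counter() - start
```

With 40 tasks on 40 threads the waits overlap, so the total time should be close to one delay rather than 40 delays. That is consistent with threads and gevent giving similar results at this concurrency level.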

And obviously, remember that your whole app must be gevent-friendly to get an advantage, and Django (by default) requires a bit of tuning (for example, database adapters must be replaced with something gevent-friendly).
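
Why the whole app has to cooperate can be sketched as follows. This again uses asyncio as a stand-in for gevent: `time.sleep` plays the role of a call that was not made gevent-friendly (for example an unpatched database driver), and `asyncio.sleep` plays the role of a cooperative one. All function names here are hypothetical.

```python
import asyncio
import time


async def cooperative_task(delay):
    # Yields to the event loop while waiting, so other tasks can run.
    await asyncio.sleep(delay)


async def blocking_task(delay):
    # Never yields: the whole event loop is stuck for the duration.
    time.sleep(delay)


async def _run(task, n, delay):
    await asyncio.gather(*(task(delay) for _ in range(n)))


def elapsed(task, n=10, delay=0.05):
    # Wall-clock seconds needed to run n copies of the task concurrently.
    start = time.perf_counter()
    asyncio.run(_run(task, n, delay))
    return time.perf_counter() - start
```

Comparing `elapsed(cooperative_task)` with `elapsed(blocking_task)` should show the blocking variant taking roughly `n * delay`, because each blocking call serializes every other task behind it.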

Answer 2

Non-blocking serving is not about performance; it's about concurrency. If 99% of the request time is spent in a sub-request, you can't simply optimize that 99% away. But when all available threads are busy serving, new clients are refused, even though 99% of the threads' time is spent waiting for sub-requests to complete. Non-blocking serving lets you use that idle time by sharing it between "handlers" that are no longer limited by the number of available threads. So if 99% is waiting, the other 1% is CPU-bound processing, and you can have 100x more simultaneous connections before you max out your CPU, without needing 100x more threads, which may be too expensive (and with Python's GIL, you would have to use sub-processes, which are even more expensive).
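
The 100x figure is simple arithmetic, sketched below under the idealized assumption that a single core is the only limit and that scheduling overhead is zero. `concurrency_headroom` is a hypothetical helper, not something from uwsgi or gevent.

```python
def concurrency_headroom(cpu_ms, wait_ms):
    # How many requests one core could interleave before it saturates,
    # given per-request CPU time and per-request idle (I/O wait) time.
    if cpu_ms <= 0:
        raise ValueError("cpu_ms must be positive")
    return (cpu_ms + wait_ms) / cpu_ms


# 1 ms of CPU work and 99 ms of waiting per request (the 99% case above)
print(concurrency_headroom(1, 99))
# 50 ms of CPU work and 50 ms of waiting per request
print(concurrency_headroom(50, 50))
```

The second call shows how quickly the benefit shrinks once requests do real CPU work: with half the time spent computing, the headroom is only 2x.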

Now, as roberto said, your code must be 100% non-blocking to be able to reclaim the idle time. However, as the percentage example above shows, this becomes critical only when the requests are almost completely I/O-bound. If that's the case, it's likely you don't need Django, at least for that part of your app.

2 answers

Answer 1: 6, ACCEPTED, 2015-01-12 07:04:20
Answer 2: 1, 2015-01-12 23:14:17
