
Serving real-time data in a Python web application

I am writing a Python web application with the Flask framework, gevent's WSGIServer and gevent-websocket.

I have a thread pool of workers that do heavy processing and then insert the completed data into a MongoDB database. I want to show users on the site a real-time stream of new data from MongoDB.

At the moment I open a WebSocket to the client and poll MongoDB for new data every 3 seconds, as shown here:

from flask import Flask
from flask_sockets import Sockets
import datetime
import time

from database_functions import DatabaseFunctions

app = Flask(__name__)
sockets = Sockets(app)

@sockets.route('/echo')
def echo_socket(ws):
    # Start 55 seconds in the past so recent tweets are sent first, then catch up.
    last_tweet_printed = datetime.datetime.utcnow() - datetime.timedelta(seconds=55)
    while True:
        databaseFunctions = DatabaseFunctions()
        tweets = databaseFunctions.loadTweets()  # pulls latest tweets from database (all tweets from last 1 minute)

        limit = 5  # max to print out at once to browser
        index = 0

        for tweet in tweets:
            if limit != index:

                if last_tweet_printed < tweet[u'created_at']:  # this tweet is newer than the last one we sent...
                    last_tweet_printed = tweet[u'created_at']  # ...so remember its timestamp
                    tweet_text = tweet[u'text']

                    ws.send("<font color=\"blue\">" + tweet_text + "</font><br> <font color=\"red\">" + str(last_tweet_printed) + "</font><br>")

                else:
                    print('no new tweets in database, wait till next poll.\n')

                index += 1
            else:
                break

        print('sleeping...\n')
        time.sleep(3)  # sleep for 3 seconds before polling MongoDB again.



@app.route('/')
def hello():
    return  \
'''
<html>
    <head>
        <title>Test Real-Time</title>
        <script type="text/javascript">
            var ws = new WebSocket("ws://" + location.host + "/echo");
            ws.onmessage = function(evt){
                    var received_msg = evt.data;
                    document.getElementById('mark_test').innerHTML += "Tweet: "+received_msg+"<br>";

                    //alert(received_msg);
            };

            ws.onopen = function(){
                ws.send("hello Mark!");
            };
        </script>

    </head>

    <body>
        <h1>Real Time Stream:</h1>
        <div id="mark_test">

        </div>
    </body>

</html>
'''


if __name__ == '__main__':
    from gevent import pywsgi
    from geventwebsocket.handler import WebSocketHandler
    server = pywsgi.WSGIServer(('', 5000), app, handler_class=WebSocketHandler)
    server.serve_forever()

Are there any limitations in the way this is written? Are there more efficient or best-practice alternatives that would produce a more seamless stream for users? I want the application to be able to handle many more requests to the database.

There are two problems with your approach. First, each client connected to this Flask server polls the database separately, so with 100 clients connected you run 100 queries every 3 seconds. It is better to have a single background thread poll the database every 3 seconds and hand the results to the other threads. For example, echo_socket could wait on a global Condition variable that the background thread notifies after each update.
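The single-poller idea can be sketched roughly as follows. This is a minimal illustration, not the asker's code: `Broadcaster`, `poll_forever` and the `load_new` callable are hypothetical names, and it uses the standard `threading` module, which under gevent would presumably need monkey-patching or a gevent equivalent.

```python
import threading


class Broadcaster:
    """Holds the latest batch and wakes every waiting reader when it changes."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0   # incremented on every publish
        self._batch = []

    def publish(self, batch):
        with self._cond:
            self._batch = list(batch)
            self._version += 1
            self._cond.notify_all()

    def wait_for_update(self, seen_version, timeout=None):
        """Block until a batch newer than seen_version exists, or until timeout."""
        with self._cond:
            updated = self._cond.wait_for(
                lambda: self._version > seen_version, timeout)
            if not updated:
                return seen_version, []
            return self._version, list(self._batch)


def poll_forever(load_new, broadcaster, interval=3.0, stop=None):
    """Run ONE query per interval, regardless of how many clients are connected.

    load_new is a hypothetical callable that returns the new documents.
    """
    stop = stop or threading.Event()
    while not stop.is_set():
        batch = load_new()
        if batch:
            broadcaster.publish(batch)
        stop.wait(interval)  # sleeps, but returns early once stop is set
```

In this sketch each echo_socket handler would start with `version = 0` and loop on `version, tweets = broadcaster.wait_for_update(version)`, sending whatever comes back. A reader that falls behind only sees the most recent batch; a per-client queue would be needed if no message may be skipped.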

The second problem is that you are short-polling MongoDB when you could be long-polling, for example with a tailable cursor on a capped collection. Long-polling lowers the latency between a message arriving in the database and your broadcasting it to users, and it reduces load on the server. See Rick Copeland's blog post on MongoDB pub/sub for inspiration.
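One way to long-poll MongoDB is a tailable, awaiting cursor, sketched below under several assumptions: PyMongo 3 or later is installed, the collection is capped (tailable cursors only work on capped collections), and documents carry the `created_at` field used in the question. The function names `open_tailable_cursor` and `follow` are hypothetical.

```python
import time


def open_tailable_cursor(collection, last_ts):
    """Open a tailable, awaiting cursor on a capped collection (assumed)."""
    from pymongo import CursorType  # imported lazily; needs PyMongo 3+
    query = {'created_at': {'$gt': last_ts}} if last_ts is not None else {}
    return collection.find(query, cursor_type=CursorType.TAILABLE_AWAIT)


def follow(open_cursor, handle, last_ts=None,
           keep_running=lambda: True, retry_delay=1.0):
    """Call handle(doc) for each new document, reopening the cursor if it dies.

    open_cursor(last_ts) must return an iterable with an `alive` attribute,
    as a PyMongo cursor does.
    """
    while keep_running():
        cursor = open_cursor(last_ts)
        while cursor.alive and keep_running():
            # With an awaiting cursor the server holds the request open for a
            # short time when there is no data, so this loop does not spin.
            for doc in cursor:
                last_ts = doc['created_at']
                handle(doc)
        time.sleep(retry_delay)  # the cursor died; pause before reopening
    return last_ts
```

A background thread could then run something like `follow(lambda ts: open_tailable_cursor(db.tweets, ts), publish_one)`, where `db.tweets` and `publish_one` are placeholders for your own collection and broadcast function.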

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this content, please cite the site URL or the original address.

 