How do I display UTF-8 characters sent through a websocket?

I'm trying to build a simple WebSocket server that loads a CSV file of tweets and sends the text of each tweet to a web browser through a WebSocket. Here is a gist with the sample that I'm using for testing. Here's the Autobahn server component ( server.py ):

import random
import time
from twisted.internet   import reactor
from autobahn.websocket import WebSocketServerFactory, \
                               WebSocketServerProtocol, \
                               listenWS


f = open("C:/mypath/parsed_tweets_sample.csv")

class TweetStreamProtocol(WebSocketServerProtocol):

    def sendTweet(self):
        tweet = f.readline().split(",")[2]
        self.sendMessage(tweet, binary=False)

    def onMessage(self, msg, binary):
        self.sendTweet() 

if __name__ == '__main__':

   factory = WebSocketServerFactory("ws://localhost:9000", debug = False)
   factory.protocol = TweetStreamProtocol
   listenWS(factory)
   reactor.run()

And here is the web component ( index.html ):

<html>
   <head>
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <script type="text/javascript"> 
            var ws = new WebSocket("ws://localhost:9000");

            ws.onmessage = function(e) {
               document.getElementById('msg').textContent = e.data; //unescape(encodeURIComponent(e.data));
               console.log("Got echo: " + e.data);
            }
      </script>
   </head>
   <body>
      <h3>Twitter Stream Visualization</h3>
      <div id="msg"></div>
      <button onclick='ws.send("tweetme");'>
         Get Tweet
      </button>
   </body>
</html>

When the tweet arrives in the browser, the UTF-8 characters aren't properly displayed. How can I modify these simple scripts to display the proper UTF-8 characters in the browser?

This works for me:

from autobahn.twisted.websocket import WebSocketServerProtocol, \
                                       WebSocketServerFactory


class TweetStreamProtocol(WebSocketServerProtocol):

   def sendTweets(self):
      ## open the file so that it is closed again once all lines are sent
      with open('gistfile1.txt') as f:
         for line in f:
            ## decode UTF8 encoded file
            data = line.decode('utf8').split(',')

            ## now operate on data using Python string functions ..

            ## encode and send payload
            payload = data[2].encode('utf8')
            self.sendMessage(payload)

      ## final pseudo tweet: 10x "pi", to check that non-ASCII text arrives intact
      self.sendMessage((u"\u03C0"*10).encode("utf8"))

   def onMessage(self, payload, isBinary):
      if payload == "tweetme":
         self.sendTweets()



if __name__ == '__main__':

   import sys

   from twisted.python import log
   from twisted.internet import reactor

   log.startLogging(sys.stdout)

   factory = WebSocketServerFactory("ws://localhost:9000", debug = False)
   factory.protocol = TweetStreamProtocol

   reactor.listenTCP(9000, factory)
   reactor.run()

Notes:

  • above code is for Autobahn|Python 0.7 and above
  • I'm not sure whether your sample Gist is a properly UTF-8 encoded file
  • However, the "last" pseudo Tweet is 10x "pi", and that properly shows in the browser, so it works in principle ..
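
One way to check whether a file really is UTF-8 is to attempt a strict decode of its raw bytes. This is only a minimal sketch, assuming Python 3; is_valid_utf8 is a hypothetical helper name and not part of Autobahn:

```python
def is_valid_utf8(raw):
    """Return True if the byte string `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


# "pi" encoded as UTF-8 is valid; a trailing lone byte 0xE9 (Latin-1 "é") is not
print(is_valid_utf8(u"\u03C0".encode("utf-8")))
print(is_valid_utf8(b"caf\xe9"))
```

To apply it to a file, read the file in binary mode first, so that the bytes reach the helper without any implicit decoding.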

Also note: for reasons too long to explain here, Autobahn's sendMessage function expects the payload to be already UTF-8 encoded if isBinary == False . A Python Unicode string (unicode in Python 2, str in Python 3) therefore needs to be encoded to UTF-8, as above, before sending.
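
The decode, process, encode sequence can be tried without a running server. The following is a sketch under assumptions: Python 3, a line read in binary mode, a hypothetical helper named tweet_payload, and a made-up sample line:

```python
def tweet_payload(raw_line, column=2):
    """Turn one UTF-8 encoded CSV line (bytes) into a UTF-8 payload (bytes)."""
    # decode the bytes into a text string before doing any string work
    text = raw_line.decode("utf-8")
    # strip the line ending, then split naively on commas as in the code
    # above (this does not handle quoted fields that contain commas)
    field = text.rstrip("\r\n").split(",")[column]
    # encode again: the payload for a text message has to be UTF-8 bytes
    return field.encode("utf-8")


# made-up sample line: id, user, tweet text
line = u"1,alice,caf\u00e9 \u03C0\n".encode("utf-8")
payload = tweet_payload(line)
print(payload)
```

The returned bytes are what would be handed to sendMessage. If the tweets can contain commas, Python's csv module is probably a safer choice than a plain split.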

Instead of <meta http-equiv="content-type" content="text/html; charset=UTF-8">, try <meta charset="utf-8">.
If you're using XHTML, write <meta charset="utf-8" /> instead.
