
UnicodeDecodeError: 'utf8' codec can't decode byte 0xcb in position 5: invalid continuation byte

My web app used to run fine, but a few days ago a problem appeared. The app still starts, but when I browse the site locally (127.0.0.1) or remotely (192.168.xxx.xxx), simply opening the homepage with no mouse or keyboard input, the web app crashes like this:

Traceback (most recent call last):
File "/path/to/project/web/application.py", line 242, in process
  return self.handle()
File "/path/to/project/web/application.py", line 233, in handle
  return self._delegate(fn, self.fvars, args)
File "/path/to/project/web/application.py", line 415, in _delegate
  return handle_class(cls)
File "/path/to/project/web/application.py", line 390, in handle_class
  return tocall(*args)
File "./my_web_app.py", line 40, in GET
  simplejson.dumps(manus))
File "/usr/lib/python2.7/dist-packages/simplejson/__init__.py", line 286, in dumps
  return _default_encoder.encode(obj)
File "/usr/lib/python2.7/dist-packages/simplejson/encoder.py", line 226, in encode
  chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python2.7/dist-packages/simplejson/encoder.py", line 296, in iterencode
  return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xcb in position 5: invalid continuation byte
192.168.xxx.xxx:2131 - - [27/Nov/2013 16:51:09] "HTTP/1.1 GET /" - 500 Internal Server Error
192.168.xxx.xxx:2131 - - [27/Nov/2013 16:51:09] "HTTP/1.1 GET /favicon.ico" - 404 Not Found
192.168.xxx.xxx:2131 - - [27/Nov/2013 16:51:09] "HTTP/1.1 GET /favicon.ico" - 404 Not Found

I don't think anything is wrong with my code, because it runs fine on my own computer; the error appears only when it runs on the server. The directory "web" is a link to "web.py-0.34/web"; it is not my code.

My code is simple:

urls = (
    '/', 'find_alternate',
    '/find_alternates', 'find_alternate',
    '/show_detail/(.+)', 'show_detail'
)
app = web.application(urls, globals())
class find_alternate:
    def GET(self):
        brands = [b.brandName for b in Brand.q.all()]
        brands.sort()
        manus = [oe.brandName for oe in OeNumber.q.group_by(OeNumber.brandName)]
        manus.sort()
        return render.find_alternates_main(simplejson.dumps(brands), simplejson.dumps(manus))
"""
some more functions, but not relevant
"""
render = web.template.render('/path/to/templates/')
web.template.Template.globals['str'] = str
if __name__ == "__main__":
    app.run()

my CREATE TABLE:

CREATE TABLE `brand` (
  `brandNo` int(11) NOT NULL,
  `brandName` varchar(64) DEFAULT NULL,
  PRIMARY KEY (`brandNo`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

My problem now is how to convert the character Ë from Unicode to UTF-8 so that simplejson can handle it. On a wiki I found this:

Unicode: U+00CB
UTF-8: C3(hex) 8B(hex)

How I solved it: I added the following lines to my.cnf:

collation-server = utf8_unicode_ci
init_connect='SET NAMES utf8'
character-set-server = utf8
skip-character-set-client-handshake

Then I converted the database to UTF-8:

ALTER DATABASE `db_name` DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;

u'\xcb' is the Unicode representation of the character whose UTF-8 encoding is '\xc3\x8b':

>>> u'CITRO\xcbN'.encode('utf-8')
'CITRO\xc3\x8bN'

Its Latin-1 encoding, however, is the single byte 0xcb, the byte named in the error:

>>> u'CITRO\xcbN'.encode('latin-1')
'CITRO\xcbN'

So the data coming from your server database does not appear to be UTF-8 encoded.
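The error from the traceback can be reproduced without the database. This is a minimal sketch, assuming the server hands the brand name back as Latin-1 bytes; the sample value is the one used in the examples above, not a row read from the real table.

```python
# Assumed sample: "CITROËN" as Latin-1 bytes, as a mismatched connection might return it.
raw = b'CITRO\xcbN'

try:
    raw.decode('utf-8')
    error = None
except UnicodeDecodeError as exc:
    error = exc

# 0xcb opens a two-byte UTF-8 sequence, but the next byte ('N') is not a
# continuation byte, so decoding fails at the offset of 0xcb.
print(error.start, error.reason)

# Decoding with the encoding the bytes were actually written in recovers the text.
text = raw.decode('latin-1')
print(text.encode('utf-8') == b'CITRO\xc3\x8bN')
```

The failing offset is 5, which matches "position 5" in the traceback: the five ASCII letters of "CITRO" come first, then the stray 0xcb.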

I think the best solution is to check the encoding of the tables on your server and, if it is not utf8, migrate them to utf8. If the tables are already utf8, then the stored data is not, and you have to fix the data.
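To see which values need fixing, one option is to scan the fetched names for byte strings that do not decode as UTF-8. The sketch below assumes the values arrive as byte strings; `find_non_utf8` and the sample rows are hypothetical and stand in for the real query result.

```python
def find_non_utf8(values):
    """Return (index, value) pairs for byte strings that are not valid UTF-8."""
    bad = []
    for i, value in enumerate(values):
        try:
            value.decode('utf-8')
        except UnicodeDecodeError:
            bad.append((i, value))
    return bad

# Hypothetical sample rows; real values would come from the query result.
rows = [b'BOSCH', b'CITRO\xcbN', b'CITRO\xc3\x8bN']
offenders = find_non_utf8(rows)
print(offenders)
```

Only the Latin-1 encoded entry is reported; the pure ASCII name and the correctly encoded UTF-8 name pass.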

Alternatively, you can infer the encoding from the database settings and pass it to simplejson:

simplejson.dumps(manus, encoding=encoding)

But this approach leaves the server and the development environment behaving differently and invites errors in the future.
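A related way to apply this workaround is to decode the byte strings explicitly before serializing, so the JSON encoder only ever sees text. This is a rough sketch, not the asker's code: it uses the standard-library `json` module as a stand-in for simplejson, and `to_text`, `db_encoding` and the sample list are hypothetical names chosen for illustration.

```python
import json  # stdlib stand-in for simplejson in this sketch

def to_text(value, encoding):
    """Decode byte strings with the given encoding; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Assumption: the encoding the server connection actually uses.
db_encoding = 'latin-1'

# Hypothetical sample values standing in for the query result.
manus = [b'BOSCH', b'CITRO\xcbN']

payload = json.dumps([to_text(m, db_encoding) for m in manus])
print(payload)
```

Like passing `encoding=` to `dumps`, this only hides the mismatch between the two environments; fixing the database or connection encoding remains the cleaner fix.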
