
Encoding characters with ISO 8859-1 in Python

With ord(ch) you can get a numerical code for a character ch up to 127. Is there any function that returns a number in the range 0-255, so that it also covers ISO 8859-1 characters?
Edit: Here is the latest version of my code and the error I get:

#!/usr/bin/python
# coding: iso-8859-1

import sys
reload(sys)
sys.setdefaultencoding('iso-8859-1')
print sys.getdefaultencoding()  # prints "iso-8859-1" 

def char_code(c):
    return ord(c.encode('iso-8859-1'))
print char_code(u'à')

I get an error: TypeError: ord() expected a character, but string of length 2 found

When you're starting with a Unicode string, you need to encode rather than decode.

>>> def char_code(c):
        return ord(c.encode('iso-8859-1'))

>>> print char_code(u'à')
224

For ISO-8859-1 in particular, you don't even need to encode it at all, since Unicode uses the ISO-8859-1 characters for its first 256 code points.

>>> print ord(u'à')
224
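
As a quick check of that claim, the following sketch decodes every byte value from 0 to 255 as ISO-8859-1 and confirms that the resulting character has the same code point. The helper name `latin1_matches_unicode` is made up for illustration, and the code is written so that it should run under Python 2.7 as well as Python 3.

```python
def latin1_matches_unicode():
    """Return True if every ISO-8859-1 byte decodes to the same code point."""
    for code in range(256):
        # bytearray([code]) is a one-byte sequence holding the value `code`.
        ch = bytearray([code]).decode('iso-8859-1')
        if len(ch) != 1 or ord(ch) != code:
            return False
    return True

print(latin1_matches_unicode())
```

This is why ord() on a correctly decoded Unicode character already gives the ISO-8859-1 byte value for anything in that range.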

Edit: I see the problem now. Your source code encoding comment declares the file to be ISO-8859-1, but I'll bet your editor is actually saving it as UTF-8. Python then misinterprets the source, and the single-character string you think you created actually contains two characters. Try the following to see:

print len(u'à')

If the declared encoding matches the file, this prints 1, but in your case it probably prints 2.
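
The mismatch can be reproduced without depending on any editor settings. The sketch below, which uses escape sequences so the source file's own encoding does not matter, takes the UTF-8 bytes for à and decodes them as ISO-8859-1, which is what the interpreter would do with a UTF-8 file carrying an iso-8859-1 coding comment. The variable names are illustrative only.

```python
a_grave = u'\xe0'                          # the single character à
utf8_bytes = a_grave.encode('utf-8')       # two bytes: 0xC3 0xA0
misread = utf8_bytes.decode('iso-8859-1')  # each byte becomes its own character

print(len(a_grave))   # 1
print(len(misread))   # 2

# Encoding the misread text back to ISO-8859-1 yields two bytes,
# so ord() rejects it, as in the question.
latin1_bytes = misread.encode('iso-8859-1')
try:
    ord(latin1_bytes)
except TypeError as exc:
    print('TypeError: %s' % exc)
```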

You can get ord() for anything. As you might expect, ord(u'💩') works fine, provided you can represent the character properly in your source, and/or read it in a known encoding.

Your error message vaguely suggests that coding: iso-8859-1 is not actually true, and the file's encoding is actually something else (UTF-8 or UTF-16 would be my guess).
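
One rough way to test that guess is to look at the raw bytes of the file and see whether they form valid UTF-8. The helper below, `looks_like_utf8`, is a hypothetical name and only a heuristic: pure ASCII input also passes, and it does not try to detect UTF-16. The two sample byte strings stand in for the same source line saved in each encoding; with a real file you would read its contents in binary mode instead.

```python
def looks_like_utf8(raw):
    """Heuristic: True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True

# The same source line as it would be stored on disk in each encoding.
saved_as_utf8 = b"print char_code(u'\xc3\xa0')"
saved_as_latin1 = b"print char_code(u'\xe0')"

print(looks_like_utf8(saved_as_utf8))    # True
print(looks_like_utf8(saved_as_latin1))  # False
```

If the check returns True for a file that contains non-ASCII characters, a coding: iso-8859-1 comment in that file is probably wrong.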

The canonical must-read on character encoding in Python is http://nedbatchelder.com/text/unipain.html

If you start with a byte string (str) rather than a unicode object, you can still use ord(), but you have to decode the string first.

Like this:

def char_code(c):
    return ord(c.decode('iso-8859-1'))
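
Whether to encode or decode depends on what type the input is. As a minimal sketch, assuming you want one function that accepts either a one-byte string or a one-character Unicode string, the variant below decodes only when it is handed bytes. It is an illustration rather than any poster's exact code, and it is written so that it should run under Python 2.7 (where bytes is an alias for str) as well as Python 3.

```python
def char_code(c):
    """Return the code point of a single character given as bytes or text."""
    if isinstance(c, bytes):
        # Byte input: interpret the byte as ISO-8859-1 to get a character.
        c = c.decode('iso-8859-1')
    return ord(c)

print(char_code(u'\xe0'))  # 224
print(char_code(b'\xe0'))  # 224
```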
