
Unicode Characters in Twitter (Python)

I've learned how to send tweets with Python, but I'm wondering if it's possible to send emojis or other special Unicode characters in the tweets.

For example, when I try to tweet u'1F430', it simply shows up as "1F430" in the tweet.

u'1F430' is the literal string "1F430". What character are you trying to get? In general, you can put a literal byte into a Python string with an escape such as "\x20", e.g.

>>> print(b"#\x20#")
# #

The byte with hexadecimal value 20 (decimal 32) sits between the two hashes. In Python 2, printing a byte string writes the bytes as they are, and ASCII character (hex) 20 is a space. (Python 3 prints the repr b'# #' instead.)

>>> print(u"#\u0020#")
# #
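>>> print(u"#\U0001F430#")
#🐰#

The first example puts Unicode code point U+0020 (a single space) between two hashes; the second uses the eight-digit \U escape to put U+1F430 (RABBIT FACE) there.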
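Since the question starts from the hex digits "1F430", here is a minimal sketch of turning such a string into the actual character. It assumes Python 3, where chr() accepts any code point; the variable names are illustrative only and not part of any Twitter library.

```python
# Hex digits of the code point, as in the question.
hex_code = "1F430"

# Parse the hex digits as an integer, then convert to a one-character string.
rabbit = chr(int(hex_code, 16))

# The character can then be used like any other text.
tweet_text = "Hello " + rabbit
print(tweet_text)
```

In Python 2 the equivalent call is unichr(), which may reject code points above U+FFFF on a narrow build.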

See https://docs.python.org/3.3/howto/unicode.html for more info. NB: this can get a little confusing in Python 2, which implicitly converts between bytes and unicode (using the ASCII encoding) in a lot of cases, and that can hide the issue from you for a while. Python 3 does not do this implicit conversion.

>>> len(u'1f430')
5
>>> len(u'\U0001F430')
1

The former is 5 characters; the latter is a single character. (In Python 2 on a narrow build, e.g. Windows or OS X, the latter might report 2.)
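To see why a narrow build can report 2, it may help to count how many UTF-16 code units the character needs. This is only an illustrative sketch, assuming Python 3.3 or later, where len() counts code points.

```python
rabbit = "\U0001F430"

# One code point as far as Python 3.3+ is concerned.
code_points = len(rabbit)

# U+1F430 lies outside the BMP, so UTF-16 needs a surrogate pair for it.
# Each UTF-16 code unit is 2 bytes.
utf16_units = len(rabbit.encode("utf-16-le")) // 2

# UTF-8 needs four bytes for the same character.
utf8_bytes = len(rabbit.encode("utf-8"))

print(code_points, utf16_units, utf8_bytes)
```

A narrow build stores strings as UTF-16 code units, which is where the length of 2 comes from.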

If you want to specify the character in Python source code then you could use its name for readability:

>>> print(u"\N{RABBIT FACE}")
🐰
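The standard unicodedata module can go in both directions between a name and a character, which is handy when the name is only known at run time. A short sketch, with illustrative variable names:

```python
import unicodedata

# Look up a character by its Unicode name.
rabbit = unicodedata.lookup("RABBIT FACE")

# Go the other way: from the character to its name and code point.
name = unicodedata.name(rabbit)
code = "U+{:04X}".format(ord(rabbit))

print(name, code)
```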

Note: this might not work in the Windows console. To display non-BMP Unicode characters there, you could use win-unicode-console + ConEmu.

If you are reading it from a file, the network, etc., then this character is no different from any other: to decode bytes into Unicode text, you should specify a character encoding, e.g.:

import io

with io.open('filename', encoding='utf-8') as file:
    text = file.read()

Which specific encoding to use depends on the source; e.g., see "A good way to get the charset/encoding of an HTTP response in Python".
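For the HTTP case, one possible approach is to read the charset parameter from the Content-Type header and decode the body with it. The sketch below uses only the standard library's email.message.Message to parse the header; the helper name charset_from_content_type, the sample header, and the sample body are all hypothetical stand-ins for a real response, and falling back to UTF-8 is an assumption rather than a rule.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset named in a Content-Type header value, or `default`."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset in lower case,
    # or the fallback when the header has no charset parameter.
    return msg.get_content_charset(default)


# Hypothetical header and body, standing in for a real HTTP response.
header = "text/html; charset=UTF-8"
body = b"\xf0\x9f\x90\xb0"  # the UTF-8 bytes of U+1F430

text = body.decode(charset_from_content_type(header))
print(text)
```

HTML pages may also declare their encoding inside the document itself, so the header is not always the whole story.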
