
Printing all unicode characters in Python

I've written some code to create all 4-digit combinations of hexadecimal digits, and now I'm trying to use that to print out all the Unicode characters associated with those values. Here's the code I'm using to do this:

char_list =["0","1","2","3","4","5","6","7","8","9","A","B","C","D","E","F"]
pairs = []
all_chars = []

# Construct pairs list
for char1 in char_list:
    for char2 in char_list:
        pairs.append(char1 + char2)

# Create every combination of unicode characters ever
for pair1 in pairs:
    for pair2 in pairs:
        all_chars.append(pair1 + pair2)

# Print all characters
for code in all_chars:
    expression = "u'\u" + code + "'"
    print "{}: {}".format(code,eval(expression))

And here is the error message I'm getting:

Traceback (most recent call last): File "C:\Users\andr7495\Desktop\unifun.py", 
line 18, in <module> print "{}: {}".format(code,eval(expression))
UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in position 0: 
ordinal not in range(128)

The exception is thrown when the code tries to print u"\u0080"; however, I can do this in the interactive interpreter without a problem.

I've tried casting the results to unicode and specifying to ignore errors, but it's not helping. I feel like I'm missing a basic understanding about how unicode works, but is there anything I can do to get my code to print out all valid unicode expressions?

import sys
# sys.maxunicode is the largest valid code point, so include it
for i in xrange(sys.maxunicode + 1):
    print unichr(i)

It is likely a problem with your terminal (cmd.exe is notoriously bad at this). Most of the time when you print, you are printing to a terminal, and the text has to be encoded for it. If you run your code in IDLE or some other environment that can render Unicode, you should see the characters. Also, you should not use eval; try this instead:

for uni_code in range(...):
    print hex(uni_code),unichr(uni_code)
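
A minimal sketch of the same idea in Python 3, assuming the goal is the 4-digit table from the question. The helper names `code_table` and `safe_for_terminal` are made up for illustration. The characters are built with `chr()` rather than `eval`, and anything the terminal's encoding cannot represent is replaced before printing:

```python
import sys


def code_table(start, stop):
    """Yield (hex_code, char) pairs for code points start..stop inclusive.

    code_table is a hypothetical helper written for this example.
    """
    for cp in range(start, stop + 1):
        # Surrogates (U+D800..U+DFFF) are not characters on their own.
        if 0xD800 <= cp <= 0xDFFF:
            continue
        # "%04X" yields the same 4-digit uppercase codes as the nested loops.
        yield "%04X" % cp, chr(cp)


def safe_for_terminal(text, encoding):
    """Replace characters that `encoding` cannot represent with '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


if __name__ == "__main__":
    # Fall back to ASCII when the output stream reports no encoding.
    encoding = getattr(sys.stdout, "encoding", None) or "ascii"
    for code, char in code_table(0x00A0, 0x00AF):
        print(safe_for_terminal("{}: {}".format(code, char), encoding))
```

On a terminal that cannot show a given character, this prints `?` in its place rather than raising an exception.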

You're trying to format a Unicode character into a byte string. You can remove the error by using a Unicode string instead:

print u"{}: {}".format(code,eval(expression))
      ^
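
The failure can be reproduced in isolation. The sketch below is Python 3, where the ASCII step has to be spelled out explicitly; in the Python 2 code from the question it happens implicitly when the Unicode value is formatted into a byte string. The helper name `try_encode` is an assumption made for this example:

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)


char = "\u0080"
# ASCII only covers code points 0..127, so this yields an error message.
print(try_encode(char, "ascii"))
# UTF-8 can represent every code point except lone surrogates.
print(try_encode(char, "utf-8"))
```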

The other answers are better at simplifying the original problem, however; you're definitely doing things the hard way.

Here's a rewrite of examples in this article that saves the list to a file.

Python 3.x:

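import sys

txtfile = "unicode_table.txt"
print("creating file: " + txtfile)
# errors="ignore" drops code points UTF-16 cannot encode (lone surrogates)
with open(txtfile, "w", encoding="utf-16", errors="ignore") as f:
    # sys.maxunicode is itself a valid code point, so include it
    for uc in range(sys.maxunicode + 1):
        line = "%s %s" % (hex(uc), chr(uc))
        print(line, file=f)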
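
If only assigned, displayable characters are wanted, one possible refinement (not part of the answer above) is to filter code points by general category with the standard `unicodedata` module. This is a sketch: the helper name `is_printable_code_point` and the choice of skipped categories are assumptions.

```python
import unicodedata

# Categories skipped here: Cn (unassigned), Cs (surrogate), Cc (control),
# Co (private use). This selection is an assumption; adjust as needed.
SKIPPED = {"Cn", "Cs", "Cc", "Co"}


def is_printable_code_point(cp):
    """Return True if code point cp is not in one of the skipped categories."""
    return unicodedata.category(chr(cp)) not in SKIPPED
```

The check could be placed inside the loop so that a line is written only when `is_printable_code_point(uc)` is true.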
