
python: extended ASCII codes

Hi, I want to know how I can append and then print extended ASCII codes in Python. I have the following.

code = chr(247)

li = []
li.append(code)
print li

Python prints ['\xf7'] when it should be a division symbol. If I simply print code directly with "print code", I get the division symbol, but not if I append it to a list. What am I doing wrong?

Thanks.

When you print a list, it outputs the default representation of each of its elements, i.e. it calls repr() on each of them. The repr() of a string is its escaped form, by design. If you want to output all the elements of the list properly, convert it to a string first, e.g. via ', '.join(li).
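A minimal sketch of the difference, written for Python 3 (the question's code is Python 2). Python 3's repr() leaves printable non-ASCII characters alone, so the built-in ascii() is used here to reproduce the escaped form that Python 2 shows. The variable names are illustrative only.

```python
li = [chr(247)]          # one-element list holding the division sign

# str() of a list is built from the repr() of each element, so the
# element appears quoted rather than as bare text.
as_list = str(li)        # "['÷']" in Python 3

# ascii() is repr() with non-ASCII characters escaped, which matches
# what Python 2 shows for the same list: ['\xf7']
escaped = ascii(li)

# Joining the elements yields a plain string with no quotes or escapes.
joined = ", ".join(li)   # '÷'
```

Printing joined sends the character itself to the terminal, while printing li always goes through the element representations.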

Note that, as the comments have stated, there isn't really any such thing as "extended ASCII"; there are just various different encodings.

There is no such thing as "extended ASCII codes". There are, however, plenty of characters, tens of thousands of them, defined in the Unicode standard.

You may be limited to the character encoding of your text terminal, which you may think of as "extended ASCII" but which might be "latin-1", for example. (If you are on a Unix-like system such as Linux or Mac OS X, your text terminal will likely use UTF-8 and be able to display any of the tens of thousands of characters available in Unicode.)

So you must read this piece to understand what text has been since 1992. If you build any production application believing in "extended ASCII", you are harming yourself, your users and the whole ecosystem at once: http://www.joelonsoftware.com/articles/Unicode.html

That said, Python 2's (and Python 3's) print performs an implicit str conversion on the objects passed in. If you pass a list, this conversion does not recursively call str on each list element; instead, it uses each element's repr, which may display non-ASCII characters as escape sequences (such as \xf7) or in other unsuitable notations.
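The point about str not recursing into list elements can be checked with a small sketch. Token is a hypothetical class, invented here only so that the two conversions give visibly different results.

```python
class Token:
    """Hypothetical class whose str() and repr() differ visibly."""

    def __str__(self):
        return "str-form"

    def __repr__(self):
        return "repr-form"


# Converting the object directly uses __str__.
single = str(Token())        # 'str-form'

# Converting a list that contains it uses __repr__ for the element.
in_list = str([Token()])     # '[repr-form]'
```

The same rule is what turns a one-character string into its quoted, escaped form once it sits inside a list.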

You can simply join your desired characters into a unicode string, for example, and then print them normally, using the terminal encoding:

import sys

mytext = u""
mytext += unichr(247) #check the codes for unicode chars here:  http://en.wikipedia.org/wiki/List_of_Unicode_characters

print mytext.encode(sys.stdout.encoding, errors="replace")
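For comparison, a rough Python 3 sketch of what errors="replace" does with the same character under a few encodings. The helper encode_for and the fallback to "utf-8" when no terminal encoding is reported are assumptions made for this illustration, not part of the answer above.

```python
import sys


def encode_for(encoding, text):
    """Encode text, substituting '?' for characters the encoding lacks."""
    return text.encode(encoding, errors="replace")


division = chr(247)  # U+00F7, the division sign

ascii_bytes = encode_for("ascii", division)      # b'?'
latin1_bytes = encode_for("latin-1", division)   # b'\xf7'
utf8_bytes = encode_for("utf-8", division)       # b'\xc3\xb7'

# stdout may not report an encoding (for example when it has been
# replaced by an in-memory stream), so fall back to UTF-8 here.
terminal_encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
terminal_bytes = encode_for(terminal_encoding, division)
```

With "replace", a terminal that cannot represent the character gets a question mark instead of raising UnicodeEncodeError.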

You probably want the charmap encoding, which lets you turn unicode into bytes without 'magic' conversions.

s='\xf7'
b=s.encode('charmap')
with open('/dev/stdout','wb') as f:
    f.write(b)
    f.flush()

This prints ÷ on my system.
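A more portable variation on the same idea, as a sketch only: it swaps in the "latin-1" codec (which maps code points 0 to 255 onto the same byte values) in place of charmap, and takes any binary stream in place of /dev/stdout, which does not exist on every platform. The function name write_raw is hypothetical.

```python
import io


def write_raw(stream, text, encoding="latin-1"):
    """Encode text and write the resulting bytes to a binary stream."""
    data = text.encode(encoding)
    stream.write(data)
    stream.flush()
    return data


buffer = io.BytesIO()                 # stands in for a binary stdout
written = write_raw(buffer, "\xf7")   # b'\xf7'
```

In Python 3, passing sys.stdout.buffer as the stream would send the raw byte to the terminal, which then decides how to display it.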

Note that 'extended ASCII' refers to any of a number of proprietary extensions to ASCII, none of which were ever officially adopted and all of which are incompatible with each other. As a result, the symbol output by that code will vary depending on how the controlling terminal chooses to interpret it.
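To illustrate how much the interpretation can vary, this sketch decodes the single byte 0xF7 under four code pages that ship with Python. The choice of code pages is arbitrary and made only for the example.

```python
raw = b"\xf7"

# Decode the same byte under several single-byte encodings.
decoded = {
    name: raw.decode(name)
    for name in ("latin-1", "cp1252", "cp437", "cp1251")
}

# latin-1 and cp1252 map 0xF7 to the division sign (U+00F7),
# cp437 maps it to the "almost equal to" sign (U+2248),
# and cp1251 maps it to a Cyrillic letter (U+0447).
```

Only two of the four give the division sign, so a terminal set to one of the others would show a different symbol for the same byte.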

You are doing nothing wrong.

What you are doing is adding a string of length 1 to a list.

This string contains a character outside the range of printable characters, and outside of ASCII (which is only 7 bits). That's why its representation looks like '\xf7'.

If you print it, it will be rendered as well as the system can manage.

In Python 2, the byte is simply written out. The resulting output may be the division symbol or something else entirely, depending on your system's encoding.

In Python 3, it is a unicode character and will be processed according to how stdout is set up. Normally, this should indeed be the division symbol.
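The two cases can be put side by side in Python 3, where bytes plays roughly the role that str played in Python 2. This is an illustrative sketch and the variable names are made up.

```python
text = chr(247)       # Python 3 str: one Unicode code point
raw = bytes([247])    # a single raw byte, like a one-character Python 2 str

text_len = len(text)             # 1
utf8 = text.encode("utf-8")      # b'\xc3\xb7', two bytes on a UTF-8 stdout

# repr() escapes the raw byte but shows the printable character as-is.
raw_repr = repr(raw)             # b'\xf7'
text_repr = repr(text)           # '÷'
```

The raw byte carries no encoding of its own, which is why its on-screen appearance depends on the terminal, whereas the str is encoded by stdout before it is written.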

In the representation of a list, the __repr__() of the string is called, leading to what you see.
