Strip multiline string of non-ASCII chars

I am trying to get a string into the shell in IDLE. It contains some non-ASCII characters that I would like to remove. I can't just paste it into a multi-line string, e.g.

u'''✔uganda
✔zambia
✔zimbabwe
and none of these…
✕afghanistan
✕armenia
✕azerbaijan'''

because that would give me the following error:

Unsupported characters in input

and I can't use

string = [raw_]input()

because the string spans more than one line.


How can I get the string into the shell?

If you cannot define a string like this (on your machine), then you will need to input it. This means that you need some code to accept a multi-line input and, as you read each line, keep only the characters whose ord() is less than 128 (i.e. those in the ASCII set).

Here is the code:

lines = []
while True:
    raw = input()    # use raw_input() on Python 2
    if not raw:      # an empty line ends the input
        break
    # keep only ASCII characters (code points 0-127)
    lines.append(''.join(c for c in raw if ord(c) < 128))
inpt = '\n'.join(lines)

and this works:

✔uganda
✔zambia
✔zimbabwe
and none of these…
✕afghanistan
✕armenia
✕azerbaijan

>>> inpt
'uganda\nzambia\nzimbabwe\nand none of these\nafghanistan\narmenia\nazerbaijan'
>>> print(inpt)    
uganda
zambia
zimbabwe
and none of these
afghanistan
armenia
azerbaijan
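
One detail worth checking in the filter is the threshold. ASCII covers code points 0 to 127, so a bound of 256 would also let Latin-1 characters such as é through. The sketch below compares the two bounds; `keep_below` and the sample string are hypothetical names made up for this illustration, not part of the answer above.

```python
def keep_below(text, limit):
    """Return text without the characters whose code point is >= limit."""
    return ''.join(c for c in text if ord(c) < limit)

# Hypothetical sample: a check mark, 'caf', e-acute (code point 233), an ellipsis.
sample = '\u2714caf\u00e9\u2026'

latin1_result = keep_below(sample, 256)  # e-acute survives this bound
ascii_result = keep_below(sample, 128)   # only ASCII characters remain

# ascii() escapes non-ASCII characters so the output is safe on any console.
print(ascii(latin1_result))  # 'caf\xe9'
print(ascii(ascii_result))   # 'caf'
```

For the sample data in the question both bounds give the same result, because ✔, ✕ and … all have code points well above 255.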

If you have data stored in your session and can't just execute a script, then you could just run an input loop to copy in the full input.

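Code:

lines = []
# iter(input, "") keeps calling input() until it returns an empty line
for line in iter(input, ""):
    lines.append(line)
inp = "\n".join(lines)  # put the line breaks back between the lines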
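
The loop relies on the two-argument form of `iter`, which calls a function repeatedly until it returns the sentinel value. A minimal sketch of the same idea as a reusable function follows, assuming Python 3; `read_block` and `fake_lines` are hypothetical names, and the reader is passed in as a parameter only so the example can run without a keyboard.

```python
def read_block(read_line=input, sentinel=""):
    """Collect lines from read_line() until it returns the sentinel."""
    return "\n".join(iter(read_line, sentinel))

# Simulated input: the empty string plays the role of the blank line
# that ends the paste, so "never read" is not consumed.
fake_lines = iter(["uganda", "zambia", "", "never read"])
text = read_block(lambda: next(fake_lines))
print(text)
```

In the shell, calling `read_block()` with no arguments reads from `input()` and stops at the first empty line.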

Define it in a script. File, New, then:

# coding: utf-8
s = u'''✔uganda
✔zambia
✔zimbabwe
and none of these…
✕afghanistan
✕armenia
✕azerbaijan'''
print s

Save it, then press F5 to run it in the IDLE shell. Output:

✔uganda
✔zambia
✔zimbabwe
and none of these…
✕afghanistan
✕armenia
✕azerbaijan

Alternatively, switch to a newer Python. IDLE in 3.6 works fine:

>>> s='''\
✔uganda
✔zambia
✔zimbabwe
and none of these…
✕afghanistan
✕armenia
✕azerbaijan'''
>>> print(s)
✔uganda
✔zambia
✔zimbabwe
and none of these…
✕afghanistan
✕armenia
✕azerbaijan
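
Once the string is in the shell, the non-ASCII characters still have to be removed. One possible way, sketched here for Python 3, is to encode to ASCII while ignoring anything that does not fit and then decode back. `strip_non_ascii` is a hypothetical helper name, and the sample uses escape sequences for the check mark, cross and ellipsis so the snippet itself stays plain ASCII.

```python
def strip_non_ascii(text):
    """Drop every character that cannot be encoded as ASCII."""
    return text.encode('ascii', 'ignore').decode('ascii')

# Shortened version of the string from the question.
s = '\u2714uganda\n\u2714zambia\nand none of these\u2026\n\u2715afghanistan'
cleaned = strip_non_ascii(s)
print(cleaned)
```

Line breaks are ASCII characters, so the multi-line layout is preserved.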
