
Python: converting a string into a list, ignoring the special characters

I have a string like this:

'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'

I want to convert it into a list like this:

['Current Weather','12:36 AM','22°','C','RealFeel®','20°','Mostly clear']

Is there a Python module or function that can do this?

You can use re.split:

import re

s = 'Current Weather\n\t\n.....t\tMostly clear'
re.split(r'[\n\t]+', s)

Output:

['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']
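One caveat, offered as a hedged sketch rather than as part of the answer above: if the input starts or ends with a newline or tab, re.split returns empty strings at the edges. The helper name split_fields below is made up for illustration; it simply drops those empty pieces.

```python
import re

# Hypothetical helper (not from the original answer): split on runs of
# newlines/tabs and drop the empty strings that re.split produces when the
# input starts or ends with a separator.
_SEPARATORS = re.compile(r'[\n\t]+')


def split_fields(text):
    return [part for part in _SEPARATORS.split(text) if part]


s = 'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'

print(split_fields(s))
# ['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']
print(split_fields('\n\tleading and trailing\n'))
# ['leading and trailing']
```

For the string in the question the filter changes nothing, since it neither starts nor ends with a separator.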

Why is everybody using re? For a task this simple it is slower than plain string methods (see the benchmarks below). You can just use str.split. When you pass it an explicit separator it no longer discards whitespace-only pieces for you, so you have to filter them out with str.isspace yourself, but it is still pretty fast. This is the code:

>>> [i.strip() for i in s.split('\n\t') if not i.isspace()]
['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']
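A small caveat, shown as a sketch with a made-up input: `''.isspace()` is False, so the `not i.isspace()` filter lets empty pieces through. Empty pieces appear whenever the separator '\n\t' occurs twice in a row, which does not happen in the question's string but could in similar data. Filtering on `i.strip()` covers both cases.

```python
# Made-up input for illustration: the separator '\n\t' appears twice in a row,
# so split() produces an empty string between 'a' and 'b'.
tricky = 'a\n\t\n\tb'

# ''.isspace() is False, so the empty piece survives this filter.
with_isspace = [i.strip() for i in tricky.split('\n\t') if not i.isspace()]

# An empty or whitespace-only piece strips to '', which is falsy.
with_strip = [i.strip() for i in tricky.split('\n\t') if i.strip()]

print(with_isspace)  # ['a', '', 'b']
print(with_strip)    # ['a', 'b']
```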

Benchmarks:

>>> import timeit
>>> timeit.timeit(r"re.split(r'[\n\t]+', s)", r"""
... import re
... s = 'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'
... """)
2.8587728
>>> timeit.timeit(r"[i.strip() for i in s.split('\n\t') if not i.isspace()]", r"""
... import re
... s = 'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'
... """)
1.8853902
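If you want to repeat the measurement yourself, here is a minimal sketch as a standalone script. It times the re.split, str.split and splitlines approaches from this thread on the question's string and first checks that they agree. The dictionary layout and the call count of 100,000 are my own choices, and the lambdas add a little call overhead to every candidate, so treat the numbers as a rough comparison only.

```python
import re
import timeit

s = 'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'

# Each candidate is a zero-argument callable, which timeit accepts directly.
candidates = {
    're.split': lambda: re.split(r'[\n\t]+', s),
    'str.split': lambda: [i.strip() for i in s.split('\n\t') if not i.isspace()],
    'splitlines': lambda: [x.strip() for x in s.splitlines() if x.strip()],
}

expected = candidates['re.split']()
for name, func in candidates.items():
    # Make sure every approach produces the same list before timing it.
    assert func() == expected, name
    seconds = timeit.timeit(func, number=100_000)
    print(f'{name:>10}: {seconds:.3f} s per 100,000 calls')
```

Absolute timings depend on the machine and the Python version, so no expected output is shown here.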

Without regex:

# st is the input string from the question
[x.strip() for x in st.splitlines() if x.strip() != '']

Output:

['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']

You could use a Python regex. Here is an example:

import re

def sentence_to_list(sentence):
    # Split on a tab or newline, then swallow any whitespace that follows.
    return re.split(r'[\t\n]\s*', sentence)

strr = 'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'
newstrr = sentence_to_list(strr)
print(newstrr)

Output:

['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']
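A side note on character classes, as a small sketch with a made-up input: inside [...] the characters " and | are matched literally, not treated as quoting or alternation. A class written as ["\t|\n"] therefore also splits on double quotes and pipes. That makes no difference for the weather string, but it would for text that contains those characters.

```python
import re

# Inside [...] the characters " and | are ordinary literals.
loose = re.compile(r'["\t|\n"]\s*')   # also matches " and |
strict = re.compile(r'[\t\n]\s*')     # matches only tab and newline

sample = 'say "hi"\tA|B'  # made-up input containing quotes and a pipe

print(loose.split(sample))   # ['say ', 'hi', 'A', 'B']
print(strict.split(sample))  # ['say "hi"', 'A|B']
```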

You can read more about re at https://docs.python.org/3/library/re.html
