How to remove words containing only numbers in Python?
I have some text in Python which is composed of numbers and letters. Something like this:
s = "12 word word2"
From the string s, I want to remove all the words containing only numbers.
So I want the result to be:
s = "word word2"
This is the regex I have, but it also acts on the letters, i.e. it replaces each letter with a space.
re.sub('[\ 0-9\ ]+', ' ', line)
Can someone tell me what is wrong? Also, is there a more time-efficient way to do this than regex?
Thanks!
Using a regex is probably a bit overkill here, depending on whether you need to preserve whitespace:
s = "12 word word2"
s2 = ' '.join(word for word in s.split() if not word.isdigit())
# 'word word2'
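If the original spacing does matter, one possible variant is to split on whitespace with a capturing group, so the separators are kept and can be joined back unchanged. This is only a sketch, not part of the answer above, and the helper name drop_numeric_words is hypothetical.

```python
import re

def drop_numeric_words(text):
    """Remove digit-only words but keep every whitespace run as it was."""
    # The capturing group makes re.split return the separators as well.
    tokens = re.split(r'(\s+)', text)
    # Whitespace tokens are not digits, so they pass the filter untouched.
    return ''.join(tok for tok in tokens if not tok.isdigit())

print(repr(drop_numeric_words("12 word  word2\t34 end")))
```

Note that the whitespace surrounding a removed word stays in place, so the result can start with a space or contain two adjacent separators.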
You can use this regex:
>>> import re
>>> s = "12 word word2"
>>> print(re.sub(r'\b[0-9]+\b\s*', '', s))
word word2
\b matches a word boundary, and \s* removes zero or more whitespace characters after the number word.
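To see what goes wrong with the pattern from the question, it may help to run both patterns side by side. The sketch below writes the question's pattern as a raw string without the redundant backslashes (it is assumed to be equivalent: a character class of "space or digit"). The last pattern, which uses lookarounds instead of \b, is an assumption of mine for input where numbers touch punctuation, not something the answer proposes.

```python
import re

s = "12 word word2"

# Pattern from the question: any run of spaces and digits, anywhere,
# so the "2" inside "word2" is hit as well.
print(repr(re.sub(r'[ 0-9]+', ' ', s)))        # ' word word '

# Word boundaries restrict the match to whole digit-only tokens.
number_word = re.compile(r'\b[0-9]+\b\s*')
print(repr(number_word.sub('', s)))            # 'word word2'

# \b also sits between a digit and punctuation, so "a-12" or "3.5"
# would lose their digits. Requiring whitespace (or the string edge)
# on both sides avoids that.
standalone_number = re.compile(r'(?<!\S)[0-9]+(?!\S)\s*')
print(repr(standalone_number.sub('', "12 a-12 3.5 word2")))
```

Compiling the pattern once with re.compile is optional, but it keeps the intent readable when the same substitution is applied to many lines.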
Without using any external library you could do:
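stringToFormat = "12 word word2"
words = ""
for word in stringToFormat.split(" "):
    try:
        # int() succeeds only for numeric words, which are skipped
        int(word)
    except ValueError:
        # Not a number, so keep the word
        words += "{} ".format(word)
print(words.strip())  # strip() drops the trailing space left by the loop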
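On the time-efficiency part of the question, the simplest way to find out is to measure on your own data. Below is a rough sketch using timeit. The function names, the sample text and the repetition count are all assumptions chosen for illustration, and the timings will vary by machine and Python version, so no numbers are claimed here.

```python
import re
import timeit

NUMBER_WORD = re.compile(r'\b[0-9]+\b\s*')

def with_isdigit(text):
    # split/join approach
    return ' '.join(w for w in text.split() if not w.isdigit())

def with_regex(text):
    # word-boundary regex approach
    return NUMBER_WORD.sub('', text)

def with_int(text):
    # try/except approach, collecting kept words in a list
    kept = []
    for w in text.split():
        try:
            int(w)
        except ValueError:
            kept.append(w)
    return ' '.join(kept)

# Hypothetical workload: the sample string repeated many times.
sample = "12 word word2 " * 1000

for fn in (with_isdigit, with_regex, with_int):
    seconds = timeit.timeit(lambda: fn(sample), number=100)
    print(f"{fn.__name__}: {seconds:.4f} s")
```

The approaches are not strictly interchangeable: int() also accepts signed values, so with_int drops "-5" while with_isdigit keeps it. Pick the check that matches what "only numbers" means for your text before comparing speed.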