How to take a user sentence and create a list of words out of it?
I'm unsure what the user will enter, but I want to break their input sentence up into words in a list.
import re

User_input = raw_input("Please enter a search criterion: ")
User_Input_list = []
# input example: steve at the office
# compiling the regular expression:
keyword = re.compile(r"\b[A-Za-z]+\b")
for word in User_input:  # note: this iterates over characters, not words
    User_Input_list.append(word)  # what should be appended here?
# going by the example input I'd want
# User_Input_list == ["steve", "at", "the", "office"]
I'm unsure how to split the input up into separate words. I will give cookies for help!
User_Input_list = User_input.split()
The easiest solution is probably to use split:
>>> "steve at the office".split()
['steve', 'at', 'the', 'office']
But this won't remove punctuation, which may or may not be a problem for you:
>>> "steve at the office.".split()
['steve', 'at', 'the', 'office.']
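If trailing punctuation like this is a problem, one option is to trim punctuation from both ends of each token after splitting. The following is only a minimal sketch: the helper name strip_punctuation_words is hypothetical, and it takes a plain string rather than calling raw_input so that it runs the same way on Python 2 and Python 3.

```python
import string


def strip_punctuation_words(sentence):
    """Split on whitespace, then trim punctuation from both ends of each token."""
    tokens = (token.strip(string.punctuation) for token in sentence.split())
    # Drop tokens that consisted of nothing but punctuation, e.g. "..." or "-".
    return [token for token in tokens if token]


print(strip_punctuation_words("steve at the office."))
# ['steve', 'at', 'the', 'office']
```

Because only the ends of each token are trimmed, an apostrophe inside a word such as isn't is left alone.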
You could use re.split() to split on runs of non-word characters, so that only the letters (plus any digits and underscores) are left:
>>> re.split(r'\W+', 'steve at the office.')
['steve', 'at', 'the', 'office', '']
But as you can see above, you might end up with empty entries to deal with, and things get worse when you have more subtle punctuation:
>>> re.split(r"\W+", "steve isn't at the office.")
['steve', 'isn', 't', 'at', 'the', 'office', '']
So you could do some work here to pick a better regular expression, but you'll need to decide how you want to handle text like steve isn't at the 'the office'.
So to select the right solution for you, you'll have to think about what input you'll get and what output you want.
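As one possible answer to that question, the sketch below matches the words themselves with re.findall instead of splitting on the separators. It assumes that a word is a run of ASCII letters, optionally followed by one apostrophe and more letters; both that definition and the helper name find_words are assumptions made for illustration, not something the question specifies.

```python
import re

# A run of letters, optionally followed by an apostrophe and more letters.
WORD_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def find_words(sentence):
    """Return every substring of sentence that looks like a word."""
    return WORD_PATTERN.findall(sentence)


print(find_words("steve isn't at the 'the office'."))
# ['steve', "isn't", 'at', 'the', 'the', 'office']
```

Matching words rather than splitting on separators means there are no empty entries to filter out, but digits and non-ASCII letters are ignored under this definition.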
Basically, you should do this:
User_Input_list = User_input.split(' ')
and that's it...
User_input = raw_input("Please enter a search criterion: ")
User_Input_list = User_input.split(" ")
see:
http://docs.python.org/library/stdtypes.html
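One detail worth knowing: split(" ") and split() with no argument behave differently when the input contains repeated or trailing spaces. The short comparison below uses a made-up sample string to show the difference.

```python
# Sample input with a double space in the middle and a trailing space.
text = "steve  at the office "

# An explicit separator splits at every single space, keeping empty strings.
by_single_space = text.split(" ")

# With no argument, runs of whitespace count as one separator and
# leading/trailing whitespace is ignored.
by_whitespace = text.split()

print(by_single_space)  # ['steve', '', 'at', 'the', 'office', '']
print(by_whitespace)    # ['steve', 'at', 'the', 'office']
```

For free-form user input, the no-argument form is usually the safer choice, since users often type extra spaces.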
Do the following:
User_input = raw_input("Please enter a search criterion: ")
User_Input_list = User_input.split()
You've found re already; there is a nice example of splitting a string:
re.split(r'\W+', 'Words, words, words.')
This way you get all the words, with all punctuation removed.
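Note that when the string ends (or starts) with punctuation, re.split also returns an empty string at that end. A minimal sketch of filtering those out follows; the helper name split_on_non_word is hypothetical.

```python
import re


def split_on_non_word(text):
    """Split on runs of non-word characters and drop empty pieces."""
    # \W matches anything that is not a letter, digit, or underscore.
    return [piece for piece in re.split(r"\W+", text) if piece]


print(re.split(r"\W+", "Words, words, words."))
# ['Words', 'words', 'words', '']
print(split_on_non_word("Words, words, words."))
# ['Words', 'words', 'words']
```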