How to connect consecutive words of a text in a dictionary?
I have to connect consecutive words of a text in a dictionary.
The text is:
text = "Hello world I am Josh"
The dictionary would be:
dict = {"Hello": ["world"], "world": ["Hello", "I"], "I": ["am", "world"], "am": ["I", "Josh"], "Josh": ["am"]}
The keys are all the words in the text; the values are the words adjacent to each key. Does anyone have an idea how to obtain this?
Using the pairwise recipe from itertools:
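import collections
from itertools import tee

def pairwise(iterable):
    # s -> (s0, s1), (s1, s2), (s2, s3), ...
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)  # use itertools.izip on Python 2

adjacent = collections.defaultdict(list)
for left, right in pairwise(text.split()):
    adjacent[right].append(left)
    adjacent[left].append(right)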
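For a quick check against the example sentence, here is a minimal self-contained sketch of the same idea. It swaps the tee-based recipe for zip(words, words[1:]), which yields the same neighbouring pairs for a list; on Python 3.10 and later, itertools.pairwise from the standard library could be used instead. The function name build_adjacency is hypothetical, chosen only for illustration.

```python
import collections

def build_adjacency(text):
    """Map each word to the list of words directly before or after it."""
    words = text.split()
    adjacent = collections.defaultdict(list)
    # zip(words, words[1:]) pairs every word with its right-hand neighbour
    for left, right in zip(words, words[1:]):
        adjacent[right].append(left)
        adjacent[left].append(right)
    return dict(adjacent)

result = build_adjacency("Hello world I am Josh")
print(result)
```

Each key ends up with the neighbours listed in the question, although the order inside a list can differ (for example, "I" maps to ["world", "am"] rather than ["am", "world"]).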
Your question doesn't consider the possibility that a word appears in the sentence more than once. You might want a set rather than a list of adjacent words. Punctuation in the sentence could also ruin your day, so depending on your requirements you might need to do more than just split().
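Both caveats can be handled in one rough sketch, under two assumptions that may not match your requirements: a "word" is taken to be any run of letters, digits, underscores, or apostrophes, and matching stays case-sensitive. The function name build_adjacency_sets is hypothetical.

```python
import collections
import re

def build_adjacency_sets(text):
    """Map each word to the set of words that appear next to it."""
    # Assumption: words are runs of word characters or apostrophes,
    # so commas, periods, and other punctuation are dropped.
    words = re.findall(r"[\w']+", text)
    adjacent = collections.defaultdict(set)
    for left, right in zip(words, words[1:]):
        # Sets ignore repeated neighbours when a word occurs more than once
        adjacent[left].add(right)
        adjacent[right].add(left)
    return dict(adjacent)

result = build_adjacency_sets("Hello, world. Hello world!")
print(result)
```

Here "Hello" and "world" each occur twice, yet each key holds a single neighbour because the set discards duplicates. If "Hello" and "hello" should count as the same word, lowercasing the text before tokenizing would be one option.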