NLTK: index of a word that occurs multiple times

Question

I'm trying to use Python to find the indices of the word 'the' in the following text:

sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']

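If I call sent3.index('the'), I get 1, which is the index of the word's first occurrence. I'm not sure how to find the indices of the other occurrences of 'the'. Does anyone know how I can do this?

Thanks!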

Answer 1

[i for i, item in enumerate(sent3) if item == wanted_item]

Demo:

>>> sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']
>>> [i for i, item in enumerate(sent3) if item == 'the']
[1, 5, 8]

enumerate yields an (index, value) tuple for each element of an iterable. We can use it to check whether each value is the one we want and, if it is, keep its index.
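
If the lookup is needed in more than one place, the comprehension can be wrapped in a small helper. This is only a sketch: the function name indices_of and the variable names are made up for illustration and do not come from NLTK or the standard library.

```python
def indices_of(tokens, wanted_item):
    """Return every index at which wanted_item occurs in tokens."""
    # enumerate pairs each token with its position; keep the matching positions
    return [i for i, item in enumerate(tokens) if item == wanted_item]


sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the',
         'heaven', 'and', 'the', 'earth', '.']

the_indices = indices_of(sent3, 'the')      # [1, 5, 8]
absent_indices = indices_of(sent3, 'sky')   # [] because 'sky' never occurs
```

Unlike sent3.index('the'), which raises ValueError when the word is missing, this returns an empty list.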

Answer 2

>>> from collections import defaultdict
>>> sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']
>>> idx = defaultdict(list)
>>> for i,j in enumerate(sent3):
...     idx[j].append(i)
... 
>>> idx['the']
[1, 5, 8]
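
The loop above builds a mapping from every token to the list of its positions in a single pass, which pays off when several different words need to be looked up. One thing to watch for: indexing a defaultdict with a word that is not present silently adds an empty entry for it. A possible variation, with the hypothetical helper name build_position_index, converts the result to a plain dict and uses .get for lookups:

```python
from collections import defaultdict


def build_position_index(tokens):
    """Map each distinct token to the list of positions where it occurs."""
    index = defaultdict(list)
    for position, token in enumerate(tokens):
        index[token].append(position)
    # Return a plain dict so lookups of unknown words do not create entries
    return dict(index)


sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the',
         'heaven', 'and', 'the', 'earth', '.']

positions = build_position_index(sent3)
the_positions = positions.get('the', [])      # [1, 5, 8]
missing_positions = positions.get('sky', [])  # [] and 'sky' is not added
```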

Answer 1
1 accepted 2014-04-13 15:50:15

Answer 2
0 2014-04-14 10:18:55