Return, for each word in the elements of a list, the indices of the elements where the word is located

I have a list like this, where the number at the start of each string is exactly the index of that element:

list = [" ","1- make your choice", "2- put something and make", "3- make something happens", "4- giulio took his choice so make","5- make your choice", "6- put something and make", "7- make something happens", "8- giulio took his choice so make","9- make your choice", "10- put something and make", "11- make something happens", "12- giulio took his choice so make"]

For each word in the elements of the list, I want to return the indices of the elements where that word is located:

for x in list:
    ....

I mean something like this:

position_of_word_in_all_elements_list = set("make": 1,2,3,4,5,6,7,8,9,10,11,12)    

position_of_word_in_all_elements_list = set("your": 1,5,9)

position_of_word_in_all_elements_list = set("giulio":4,8,12)

Any suggestions?

This will find occurrences of every space-separated string in the input, including tokens such as "1-". Filtering the entries you do not want out of the result should not be a big deal, though:

# find the set of all words (sequences separated by a space) in input
s = set(" ".join(list).split(" "))

# for each word go through input and add index to the 
# list if word is in the element. output list into a dict with
# the word as a key
res = dict((key, [ i for i, value in enumerate(list) if key in value.split(" ")]) for key in s)

{'': [0], 'and': [2, 6, 10], '8-': [8], '11-': [11], '6-': [6], 'something': [2, 3, 6, 7, 10, 11], 'your': [1, 5, 9], 'happens': [3, 7, 11], 'giulio': [4, 8, 12], 'make': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], '4-': [4], '2-': [2], 'his': [4, 8, 12], '9-': [9], '10-': [10], '7-': [7], '12-': [12], 'took': [4, 8, 12], 'put': [2, 6, 10], 'choice': [1, 4, 5, 8, 9, 12], '5-': [5], 'so': [4, 8, 12], '3-': [3], '1-': [1]}
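If the numeric prefixes and the empty string are not wanted, one possible way to drop them is to filter the keys after the dictionary is built. This is only a sketch of what such filtering could look like, not part of the answer above: the names lst, is_word and filtered are made up for illustration, the input is shortened, and keeping only purely alphabetic tokens is an assumption about what counts as a word.

```python
# Shortened version of the input from the question.
lst = [" ", "1- make your choice", "2- put something and make",
       "3- make something happens", "4- giulio took his choice so make"]

# Same approach as above: collect every space-separated token,
# then record the indices of the elements that contain it.
words = set(" ".join(lst).split(" "))
res = {key: [i for i, value in enumerate(lst) if key in value.split(" ")]
       for key in words}

def is_word(token):
    # Hypothetical filter: keep purely alphabetic tokens only,
    # which drops '' and prefixes such as '1-'.
    return token.isalpha()

filtered = {key: idx for key, idx in res.items() if is_word(key)}
print(filtered["make"])
```

A different rule (for example, dropping only tokens that end in "-") would work just as well; the point is that the filter runs on the keys of the finished dictionary.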

First of all, rename your list so that it does not shadow the Python builtin list:

>>> from collections import defaultdict
>>> li = [" ","1- make your choice", "2- put something and make", "3- make something happens", "4- giulio took his choice so make","5- make your choice", "6- put something and make", "7- make something happens", "8- giulio took his choice so make","9- make your choice", "10- put something and make", "11- make something happens", "12- giulio took his choice so make"]
>>> dd = defaultdict(list)
>>> for l in li:
        try: # this is ugly hack to skip the " " value
            index,words = l.split('-')
        except ValueError:
            continue
        word_list = words.strip().split()
        for word in word_list:
            dd[word].append(index)
>>> dd['make']
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']

What defaultdict does: it works like a normal dictionary as long as the key (a word, in our case) is present in the dictionary. If the key doesn't exist, it creates it with a default value, in our case an empty list, as specified when you declare it with dd = defaultdict(list). I am not the best explainer, so I suggest reading up on defaultdict elsewhere if it is not clear :)
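A small sketch of that behaviour, separate from the answer's code (the key names and the plain dictionary are only for illustration):

```python
from collections import defaultdict

dd = defaultdict(list)
dd["make"].append(1)   # key missing: an empty list is created first, then 1 is appended
dd["make"].append(2)   # key present: behaves like a normal dictionary
print(dict(dd))

# Merely looking up a missing key also creates it with an empty list.
looked_up = dd["unknown"]
print(looked_up, "unknown" in dd)

# Roughly the same effect with a plain dict and setdefault.
plain = {}
plain.setdefault("make", []).append(1)
print(plain)
```

The lookup side effect is worth keeping in mind: testing with "key in dd" does not create the key, but dd[key] does.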

@Oleg wrote a great nerdy solution. I came up with the following simple method for this problem.

def findIndex(st, lis):
    positions = []
    # enumerate yields the index of every element, matched or not
    for j, x in enumerate(lis):
        if st in x:
            positions.append(j)
    return positions

>>> findIndex('your', list)

[1, 5, 9]
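One thing to be aware of: "st in x" is a substring test on the whole string, so a short word such as 'so' also matches inside 'something'. If only whole words should count, a variant along these lines might be closer to what the question asks for. This is a sketch, and find_word_index and sample are hypothetical names that do not appear in the answer:

```python
def find_word_index(word, items):
    # Compare against whole whitespace-separated tokens instead of substrings.
    return [i for i, item in enumerate(items) if word in item.split()]

# Shortened version of the input from the question.
sample = [" ", "1- make your choice", "2- put something and make",
          "3- make something happens", "4- giulio took his choice so make"]

whole_word = find_word_index("so", sample)
substring = [i for i, item in enumerate(sample) if "so" in item]
print(whole_word)  # only the element that contains the word 'so'
print(substring)   # also the elements that contain 'something'
```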

I need to use the number in the string as the ID, and for that I have a solution... but as you remember, I have to get all the IDs for each word in the elements.

lst = [" ","1- make your choice", "2- put something and make", "3- make something happens", 
"4- giulio took his choice so make","5- make your choice", "6- put something and make", 
"7- make something happens", "8- giulio took his choice so make","9- make your choice", 
"10- put something and make", "11- make something happens", "12- giulio took his choice so make"]

diczio = {} 
abc = " ".join(lst).split(" ")

for x in lst:
    element = x

    for t in abc:
        if len(element) > 0:
            if t in element:
                xs = element.find("-")
                aw = element[0:xs]
                aw = int(aw)
                wer = set()
                wer.add(aw)
                diczio[t] = [wer]
print diczio

The problem is that I get only one ID for each word. I put the IDs in a set (I mean wer = set()), but I need all the IDs for each word:

1 - For example, for the word 'your' I get only the ID of the last post where the word is located:

'your': [set(['9'])]

But I need:

'your': [set([1,5,9])]

2 - The ID 9 is a string in the set and I need it as an int, but I get an error if I try to convert aw to int:

aw = int(aw)

Error:

ValueError: invalid literal for int() with base 10: ''

Any suggestions?
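Both symptoms can be traced in the code above. The first element, " ", contains no "-", so find("-") returns -1 and the slice element[0:-1] is the empty string, which int() rejects. And because a fresh set is created and assigned to diczio[t] on every match, each word keeps only the last ID seen. A possible rework is sketched below under some assumptions: the ID is taken to be everything before the first "-", elements without a "-" are skipped, a shortened input is used, and post_id is an illustrative name.

```python
lst = [" ", "1- make your choice", "2- put something and make",
       "3- make something happens", "4- giulio took his choice so make",
       "5- make your choice"]

diczio = {}
for element in lst:
    prefix, sep, rest = element.partition("-")
    if not sep:
        # No "-" in this element (e.g. " "), so there is no ID to parse.
        continue
    post_id = int(prefix)
    for word in rest.split():
        # setdefault creates the set once; later IDs are added to the same set.
        diczio.setdefault(word, set()).add(post_id)

print(sorted(diczio["your"]))
```

Each word now maps directly to a set of ints rather than to a list wrapping a set; wrap it as [diczio[word]] if the nested form shown above is really required.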
