
For each word in a list of strings, return the indices of the list elements that contain it

I have a list like this, where the number at the start of each string is exactly that element's index in the list:

list = [" ","1- make your choice", "2- put something and make", "3- make something happens", "4- giulio took his choice so make","5- make your choice", "6- put something and make", "7- make something happens", "8- giulio took his choice so make","9- make your choice", "10- put something and make", "11- make something happens", "12- giulio took his choice so make"]

For each word that appears in the list's elements, I want to get the indices of the elements where that word occurs:

for x in list:
    ....

I mean something like this:

position_of_word_in_all_elements_list = {
    "make": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12},
    "your": {1, 5, 9},
    "giulio": {4, 8, 12},
}

Any suggestions?

This finds the occurrences of every token in the input, including ones such as "1-". Filtering the entries you do not want out of the result afterwards should be straightforward:

# find the set of all words (sequences separated by a space) in input
s = set(" ".join(list).split(" "))

# for each word go through input and add index to the 
# list if word is in the element. output list into a dict with
# the word as a key
res = dict((key, [ i for i, value in enumerate(list) if key in value.split(" ")]) for key in s)

{'': [0], 'and': [2, 6, 10], '8-': [8], '11-': [11], '6-': [6], 'something': [2, 3, 6, 7, 10, 11], 'your': [1, 5, 9], 'happens': [3, 7, 11], 'giulio': [4, 8, 12], 'make': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], '4-': [4], '2-': [2], 'his': [4, 8, 12], '9-': [9], '10-': [10], '7-': [7], '12-': [12], 'took': [4, 8, 12], 'put': [2, 6, 10], 'choice': [1, 4, 5, 8, 9, 12], '5-': [5], 'so': [4, 8, 12], '3-': [3], '1-': [1]}
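The filtering mentioned above could be done up front instead of on the result. The following is only a sketch under a few assumptions: the list is shortened and renamed to lst (so the built-in list is not shadowed), and the "N-" markers are recognised with a regular expression, which is not part of the original answer.

```python
import re

# Shortened sample of the input; the name lst is an assumption.
lst = [" ", "1- make your choice", "2- put something and make",
       "3- make something happens", "4- giulio took his choice so make"]

# Collect every whitespace-separated token and drop the "N-" markers.
# str.split() with no argument also discards empty strings.
words = {w for w in " ".join(lst).split() if not re.fullmatch(r"\d+-", w)}

# For each remaining word, list the indices of the elements containing it.
res = {w: [i for i, value in enumerate(lst) if w in value.split()]
       for w in words}
```

With this sample, res["make"] is [1, 2, 3, 4], and neither "1-" nor the empty string appears as a key.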

First of all, rename your list so that it does not shadow the Python built-in list (the code below needs the built-in for defaultdict(list)):

>>> from collections import defaultdict
>>> li = [" ","1- make your choice", "2- put something and make", "3- make something happens", "4- giulio took his choice so make","5- make your choice", "6- put something and make", "7- make something happens", "8- giulio took his choice so make","9- make your choice", "10- put something and make", "11- make something happens", "12- giulio took his choice so make"]
>>> dd = defaultdict(list)
>>> for l in li:
        try: # this is ugly hack to skip the " " value
            index,words = l.split('-')
        except ValueError:
            continue
        word_list = words.strip().split()
        for word in word_list:
            dd[word].append(index)
>>> dd['make']
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']

What defaultdict does: it works like a normal dictionary as long as the key (the word, in our case) is present. If the key doesn't exist, it creates it with a default value, in our case an empty list, as specified when you declare dd = defaultdict(list). I am not the best explainer, so I suggest reading up on defaultdict elsewhere if it is not clear :)
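To make the default-value behaviour concrete, here is a small sketch. It uses set rather than list as the default factory, which is a variation on the answer above, chosen only because the question asks for sets; the plain-dict setdefault line is shown for comparison.

```python
from collections import defaultdict

dd = defaultdict(set)   # missing keys start out as an empty set

dd["make"].add(1)       # key absent: an empty set is created, then 1 is added
dd["make"].add(2)       # key present: behaves like a normal dict
dd["make"].add(2)       # sets silently ignore duplicates

# The same idea with a plain dict and setdefault.
plain = {}
plain.setdefault("make", set()).add(1)
```

Note that merely reading a missing key, such as dd["other"], also creates it with an empty set.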

@Oleg wrote a great nerdy solution. I came up with the following simple method for this problem.

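def findIndex(st, lis):
    # Collect the index of every element that contains st.
    # Note: "in" on strings is a substring test, so 'make' also matches 'maker'.
    positions = []
    for j, x in enumerate(lis):
        if st in x:
            positions.append(j)
    return positions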

>>> findIndex('your', list)

[1, 5, 9]

I need to take the ID from the number at the start of each string, and I have a solution for that part. But, as you remember, I have to collect all the IDs for each word in the elements.

lst = [" ","1- make your choice", "2- put something and make", "3- make something happens", 
"4- giulio took his choice so make","5- make your choice", "6- put something and make", 
"7- make something happens", "8- giulio took his choice so make","9- make your choice", 
"10- put something and make", "11- make something happens", "12- giulio took his choice so make"]

diczio = {} 
abc = " ".join(lst).split(" ")

for x in lst:
    element = x

    for t in abc:
        if len(element) > 0:
            if t in element:
                xs = element.find("-")
                aw = element[0:xs]
                aw = int(aw)
                wer = set()
                wer.add(aw)
                diczio[t] = [wer]
print diczio

The problem is that I get only one ID for each word. I put the IDs in a set (wer = set()), but I need all the IDs for each word:

1 - For example, for the word 'your' I get only the ID of the last post where the word is located:

'your': [set(['9'])]

but I need:

'your': [set([1,5,9])]

2 - The ID 9 is stored in the set as a string and I need it as an int, but I get an error if I try to convert aw to int:

aw = int(aw)

Error:

ValueError: invalid literal for int() with base 10: ''

Any suggestions?
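Both symptoms can be traced to the loop above. diczio[t] = [wer] replaces the stored value with a fresh one-element set on every match, so only the last ID survives. The " " element contains no "-", so find returns -1, the slice element[0:-1] is an empty string, and int('') raises the ValueError. One possible fix is sketched below; it is not the only way to do it. The list is shortened, post_id is a name introduced here, and it assumes every element that contains "-" starts with a numeric ID.

```python
lst = [" ", "1- make your choice", "2- put something and make",
       "3- make something happens", "4- giulio took his choice so make",
       "5- make your choice"]

diczio = {}
for element in lst:
    # Split at the first "-": head is the ID text, rest holds the words.
    head, sep, rest = element.partition("-")
    if not sep:            # no "-" in this element (e.g. " "), so no ID to parse
        continue
    post_id = int(head)    # safe now: head is the text before the first "-"
    for word in rest.split():
        # Reuse the existing set for this word, or start a new one.
        diczio.setdefault(word, set()).add(post_id)
```

With this sample, diczio['your'] is a set holding 1 and 5, and every ID is an int.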
