comparing lists and storing index values if lists match

Question

I have two lists:

wordsindict

list2

wordsindict = ['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why', 'double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size', 'whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']
list2 = [['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why'], ['double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size'], ['whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']]

I am taking the words in wordsindict (with duplicates removed) and checking whether they are contained in list2. If they are, I want to record the index of the list2 sublist in which each word was found. Below is the code that I currently have:

listindex = {}
for word in wordsindict:
    listindex[word] = []
    for splittedLines_list in list2:
        index_list = []
        for i,j in enumerate(splittedLines_list):
            if j == word:
                index_list.append(i)
        listindex[word].append(index_list)

This code produces the following output:

{'fly': [[4, 6], [], []], 'rainbow': [[2, 8], [], [2, 5, 7]], 'full': [[], [], [1]], 'bluebirds': [[3], [], []], 'takes': [[], [4], []], 'somewhere': [[0], [], []], 'double': [[], [0, 6], [4, 6]], 'over': [[1, 7], [], []], 'long': [[], [3], []], 'why': [[9, 10], [], []], 'whoa': [[], [], [0]], 'way': [[], [], [3, 8]], 'time': [[], [1], []], 'size': [[], [7], []], 'birds': [[5], [], []], 'population': [[], [2, 5], []]}

It takes the words from wordsindict and stores the position of each word within every sublist. This is incorrect: there are only 3 sublists in list2, so the only index values I expect are 0 to 2. Instead, each sublist gets its own list of positions:

e.g. 'population': [[], [2, 5], []]

                    ^     ^     ^
                    0     1     2

Here you can see that population appears in the sublist at index 1, but the word's positions within that sublist (2 and 5) are recorded instead of simply 'population': [1, 1].

Put simply, I want the sublist index from list2 (0-2) to be appended, and if a word from wordsindict appears more than once in list2, the index of the sublist where it was found should be appended again for each occurrence.

wordsindict contains the keys, and list2 should be searched for the occurrences.

If you need any more information, please do not hesitate to ask!

Answer 1

If I understand the question correctly, I think this is what you are looking for:

wordsindict = ['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why', 'double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size', 'whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']

list2 = [['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why'], ['double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size'], ['whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']]
d = {}
for word in set(wordsindict):
    d[word] = []
    for i, l in enumerate(list2):
        for wordy_word in l:
            if wordy_word == word:
                d[word].append(i)
print(d)

Output:

{'why': [0, 0], 'way': [2, 2], 'whoa': [2], 'full': [2], 'birds': [0], 'size': [1], 'time': [1], 'long': [1], 'population': [1, 1], 'fly': [0, 0], 'somewhere': [0], 'takes': [1], 'rainbow': [0, 0, 2, 2, 2], 'bluebirds': [0], 'double': [1, 1, 2, 2], 'over': [0, 0]}
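
The loop above rescans all of list2 once per word. If that becomes slow on larger inputs, a single pass over list2 should give the same kind of result. This is only a sketch: the function name sublist_indices and the shortened sample lists are made up for illustration and are not from the question.

```python
from collections import defaultdict

def sublist_indices(keys, nested):
    """Map each key to the index of the sublist for every occurrence found."""
    wanted = set(keys)               # removes duplicate keys, gives fast lookups
    found = defaultdict(list)
    for i, sublist in enumerate(nested):
        for item in sublist:
            if item in wanted:
                found[item].append(i)    # one entry per occurrence
    # Keys that never occur still get an empty list, as in the loop above.
    return {key: found.get(key, []) for key in wanted}

# Shortened, hypothetical sample data (same names as in the question).
wordsindict = ['double', 'population', 'rainbow', 'double']
list2 = [['over', 'rainbow'],
         ['double', 'population', 'population'],
         ['double', 'rainbow']]

result = sublist_indices(wordsindict, list2)
print(result['population'])  # [1, 1]
```

The result is the same mapping of word to sublist indices; only the order in which the work is done changes.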

If you want the list index together with the position within that list:

wordsindict = ['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why', 'double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size', 'whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']

list2 = [['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why'], ['double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size'], ['whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']]
d = {}
for word in set(wordsindict):
    d[word] = []
    for i, l in enumerate(list2):
        for j, wordy_word in enumerate(l):
            if wordy_word == word:
                # store (sublist index, position in sublist);
                # a tuple is probably better here than a dict like {i: j}
                d[word].append((i, j))
print(d)
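
To make the shape of the (i, j) pairs concrete, here is a small sketch on made-up data. The function name word_positions and the sample lists are assumptions for illustration only, not part of the question.

```python
def word_positions(keys, nested):
    """Map each key to (sublist index, position in sublist) pairs."""
    positions = {key: [] for key in set(keys)}
    for i, sublist in enumerate(nested):
        for j, item in enumerate(sublist):
            if item in positions:
                positions[item].append((i, j))
    return positions

# Hypothetical, shortened sample data.
sample = [['over', 'rainbow'],
          ['double', 'population', 'population'],
          ['double', 'rainbow']]

pairs = word_positions(['rainbow', 'population'], sample)
print(pairs['population'])  # [(1, 1), (1, 2)]
print(pairs['rainbow'])     # [(0, 1), (2, 1)]

# Each pair points back at the word it came from.
assert all(sample[i][j] == 'rainbow' for i, j in pairs['rainbow'])
```

Taking only the first element of each pair gives back the plain sublist indices from the first version.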

solution1
1 ACCEPTED 2016-04-03 20:00:01
