
Comparing lists and storing index values if lists match

I have two lists:

  • wordsindict
  • list2

     wordsindict = ['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why', 'double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size', 'whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']
     list2 = [['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why'], ['double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size'], ['whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']]

I am taking the words (with duplicates removed) from wordsindict and checking whether they are contained within list2. If they are, I wish to store an index value for that word. Below is the code that I currently have:

listindex = {}
for word in wordsindict:
    listindex[word] = []
    for splittedLines_list in list2:
        index_list = []
        for i,j in enumerate(splittedLines_list):
            if j == word:
                index_list.append(i)
        listindex[word].append(index_list)

This code produces this output:

{'fly': [[4, 6], [], []], 'rainbow': [[2, 8], [], [2, 5, 7]], 'full': [[], [], [1]], 'bluebirds': [[3], [], []], 'takes': [[], [4], []], 'somewhere': [[0], [], []], 'double': [[], [0, 6], [4, 6]], 'over': [[1, 7], [], []], 'long': [[], [3], []], 'why': [[9, 10], [], []], 'whoa': [[], [], [0]], 'way': [[], [], [3, 8]], 'time': [[], [1], []], 'size': [[], [7], []], 'birds': [[5], [], []], 'population': [[], [2, 5], []]}

It takes the words from wordsindict and stores their index values. This is not what I want: there are only 3 sublists within list2, yet for every word the output holds one list per sublist, filled with the word's positions inside that sublist:

e.g. 'population': [[], [2, 5], []]
                    ^     ^     ^
                    0     1     2

Here you can see that population appears in the sublist at index 1, but the word's positions within that sublist ([2, 5]) are recorded instead of simply 'population': [1, 1] .

Put simply, I want the index of the sublist within list2 (0-2) to be appended, and if a word from wordsindict appears more than once in list2, the index of the sublist where it was found should be appended again for each occurrence.

wordsindict contains the keys, and list2 should be searched for the occurrences.

If you need any more information, please do not hesitate to ask!

If I understand the question correctly, I think this is what you are looking for:

wordsindict = ['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why', 'double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size', 'whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']

list2 = [['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why'], ['double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size'], ['whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']]
d = {}
for word in set(wordsindict):
    d[word] = []
    for i, l in enumerate(list2):
        for wordy_word in l:
            if wordy_word == word:
                d[word].append(i)
print(d)

Output:

{'why': [0, 0], 'way': [2, 2], 'whoa': [2], 'full': [2], 'birds': [0], 'size': [1], 'time': [1], 'long': [1], 'population': [1, 1], 'fly': [0, 0], 'somewhere': [0], 'takes': [1], 'rainbow': [0, 0, 2, 2, 2], 'bluebirds': [0], 'double': [1, 1, 2, 2], 'over': [0, 0]}
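
As a possible alternative, the same kind of result can be built in a single pass over list2 instead of rescanning it once per word. This is only a sketch: the helper name sublist_indices and the shortened sample_words / sample_lists inputs are made up for illustration, and it assumes the keys should be limited to the words in the first list.

```python
from collections import defaultdict


def sublist_indices(words, sublists):
    """Map each unique word to the indices of the sublists that contain it.

    An index is repeated once for every occurrence of the word in that sublist.
    """
    wanted = set(words)          # drops duplicates and gives fast membership tests
    found = defaultdict(list)
    for i, sublist in enumerate(sublists):
        for token in sublist:
            if token in wanted:
                found[token].append(i)
    # Words that never occur still get an empty list, as in the loop version.
    return {word: found.get(word, []) for word in wanted}


# Shortened, hypothetical input in the same shape as wordsindict / list2.
sample_words = ['double', 'rainbow', 'population', 'double']
sample_lists = [
    ['over', 'rainbow'],
    ['double', 'population', 'population', 'double'],
    ['double', 'rainbow'],
]
result = sublist_indices(sample_words, sample_lists)
print(result['population'])  # [1, 1]
print(result['double'])      # [1, 1, 2]
print(result['rainbow'])     # [0, 2]
```

The order of the keys in the resulting dict is arbitrary because they come from a set, just as in the output above.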

If you want the list index together with the position within that list:

wordsindict = ['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why', 'double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size', 'whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']

list2 = [['somewhere', 'over', 'rainbow', 'bluebirds', 'fly', 'birds', 'fly', 'over', 'rainbow', 'why', 'why'], ['double', 'time', 'population', 'long', 'takes', 'population', 'double', 'size'], ['whoa', 'full', 'rainbow', 'way', 'double', 'rainbow', 'double', 'rainbow', 'way']]
d = {}
for word in set(wordsindict):
    d[word] = []
    for i, l in enumerate(list2):
        for j, wordy_word in enumerate(l):
            if wordy_word == word:
                # store (sublist index, position within sublist) as a tuple
                d[word].append((i, j))
print(d)
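
When both values are stored as tuples, the sublist-only view from the first snippet can still be recovered from them. A small sketch, assuming a hypothetical helper named positions and a shortened sample input rather than the full lists:

```python
def positions(words, sublists):
    """Map each unique word to (sublist index, position in sublist) tuples."""
    return {
        word: [
            (i, j)
            for i, sublist in enumerate(sublists)
            for j, token in enumerate(sublist)
            if token == word
        ]
        for word in set(words)
    }


# Shortened, hypothetical input in the same shape as wordsindict / list2.
sample_words = ['population', 'double', 'population']
sample_lists = [
    ['why', 'why'],
    ['double', 'time', 'population', 'long', 'takes', 'population'],
]
pairs = positions(sample_words, sample_lists)
print(pairs['population'])                  # [(1, 2), (1, 5)]
print([i for i, _ in pairs['population']])  # [1, 1]
```

Keeping the tuples means nothing is lost: the first element of each tuple gives the sublist index the question asks for, and the second gives the position the original code was recording.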
