
Python3 - Iterate over dictionary, find specific dynamic value

I have the dictionary below:


wordPos = {}
words = [...] #Removed for simplicity

for i, word in enumerate(words):
    wordPos[i] = {word[5]: word[4]}

Which ultimately becomes:

>>> wordPos
{0: {1: 'Kontakt'},
 1: {2: 'email@domain.com'}, 
 2: {3: 'domain.com'}}

Now I am trying to search the dictionary above: if the string/expression exists, it should return the "key" for the matching value.

So, for example:

string = "@domain.com"

if string in wordPos.values():
   print("The string: {}, exists in the dictionary. The key for this is: {}".format(string, key))

However I am not sure how to search within a dictionary, and return the key (of the value).

Furthermore, I am a bit unsure if I need to use RegEx to do the actual matching?

Edit

I can see that I need to be more specific about what I am trying to do.

So basically, I am reading an entire file word by word and adding each word to a dictionary (as well as the line number of the specific word) - thus giving me the following structure:

lineNumber:word 

eg. 1:'Kontakt'

Now what I am trying to do with this information is to open another file and get the first word of that file (in my example, the first word is @domain.com).

With this first word, I want to check if it exists in my dictionary (first occurrence). If it does, I want to return the line number. So in my example, for the word @domain.com, the line number that should be returned would be 2.

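You can write a function like the one below. It returns the outer key of the first entry whose word matches the regular expression. Note that re.match only matches at the start of the word.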

import re

input_dict = {
    0: {1: 'Kontakt'},
    1: {2: 'email@domain.com'},
    2: {3: 'domain.com'}
}

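def search_word(regex):
    # Return the outer key of the first entry whose word matches regex
    for k, v in input_dict.items():
        for _, v1 in v.items():
            # re.match anchors the pattern at the start of the word
            if re.match(regex, v1):
                return k

# Raw strings with escaped dots, so '.' matches only a literal dot
print(search_word(r'domain\.com')) # 2 (domain.com)
print(search_word(r'\w+@domain\.com')) # 1 (email@domain.com)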



Output:

2
1
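
Because re.match only succeeds when the pattern matches at the start of a word, a literal such as @domain.com never matches email@domain.com. Below is a minimal sketch of a variant, assuming a substring match is what is wanted and that the line number (the inner key) should be returned rather than the outer key. The name first_line_containing is hypothetical and not part of the answer above.

```python
import re

# Same nested structure as in the question: index -> {line_number: word}
word_pos = {
    0: {1: 'Kontakt'},
    1: {2: 'email@domain.com'},
    2: {3: 'domain.com'},
}

def first_line_containing(literal, entries):
    """Return the line number of the first word containing `literal`, or None."""
    # re.escape makes every character literal, so '.' matches only a dot
    pattern = re.compile(re.escape(literal))
    for inner in entries.values():
        for line_number, word in inner.items():
            # search() scans the whole word, unlike match()
            if pattern.search(word):
                return line_number
    return None

print(first_line_containing('@domain.com', word_pos))  # 2
print(first_line_containing('missing', word_pos))      # None
```

For a plain literal like this, the `in` operator would do the same job without a regular expression; re.escape is mainly useful when the literal has to be combined with other pattern syntax.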

If you really want to look up a key by its value, you have to iterate over the items, compare each value with the string, and return the key of the first match. Dictionaries have no built-in reverse lookup.

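def find_key(wordPos, string):
    for key, value in wordPos.items():
        for inner_key, inner_value in value.items():
            # Exact comparison; use `string in inner_value` for a substring match
            if inner_value == string:
                return key
    return None  # no match found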
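
The same scan can be written more compactly with a generator expression and next(). This is only a sketch, assuming a substring match is wanted (so that "@domain.com" finds 'email@domain.com') and that the outer key of the first hit is the desired result; the sample data is copied from the question.

```python
# Same nested structure as in the question: index -> {line_number: word}
word_pos = {
    0: {1: 'Kontakt'},
    1: {2: 'email@domain.com'},
    2: {3: 'domain.com'},
}
string = "@domain.com"

# Lazily yields the outer key of every entry whose word contains `string`
matches = (
    key
    for key, inner in word_pos.items()
    for word in inner.values()
    if string in word
)

# next() stops at the first hit; the second argument is the fallback value
first_key = next(matches, None)
print(first_key)  # 1
```

Since dictionaries preserve insertion order in Python 3.7 and later, "first" here means the first entry that was added.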

If you already have a list of words, why don't you just use the index method?

if string in words:
    print(f"The string: {string}, exists. The key for this is: {words.index(string)}")

If the string doesn't exist, index raises a ValueError, so you could avoid the if via:

try:
    print(f"The string: {string}, exists. The key for this is: {words.index(string)}")
except ValueError:
    pass

One possibility is to use Python's built-in sqlite3 module with an FTS5 full-text index:

import sqlite3

in_memory = sqlite3.connect(':memory:')
c = in_memory.cursor()
c.execute('CREATE VIRTUAL TABLE "ftsentry" USING FTS5 (line_no UNINDEXED, data, tokenize="unicode61 tokenchars \'.\'")')

c.execute("INSERT INTO ftsentry VALUES (?, ?)", (1, 'Kontakt'))
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (2, 'email@domain.com'))
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (3, 'domain.com'))
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (4, 'domain@sample.com'))

l = [*c.execute('SELECT line_no, data FROM ftsentry WHERE data MATCH ? ORDER BY line_no ASC LIMIT 1', ('"@domain.com"', ))]
print(l)

l = [*c.execute('SELECT line_no, data FROM ftsentry WHERE data MATCH ?', ('"kontakt"', ))]
print(l)

Prints:

[(2, 'email@domain.com')]
[(1, 'Kontakt')]

You need to iterate over the values of the inner dictionaries (nesting like this is rarely a good idea):

string = "@domain.com"
for key, word in wordPos.items():
    # Convert the dict_values view to a list and take its first (only) element
    if string in list(word.values())[0]:
        print("The string: {}, exists in the dictionary. The key for this is: {}".format(string, key))

This is a terrible way of doing it. There are probably far better ways if you can explain what your data looks like. :)
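
Going by the description in the question's edit (one word per entry, keyed by line number), one option is to skip the nested dictionary and scan the file directly. This is a rough sketch under that assumption. io.StringIO stands in for the two real files, the file contents are made-up sample data, and first_line_with is a hypothetical helper name.

```python
import io

def first_line_with(needle, lines):
    """Return the 1-based number of the first line with a word containing `needle`."""
    for line_number, line in enumerate(lines, start=1):
        if any(needle in word for word in line.split()):
            return line_number
    return None

# Stand-in for the file that was read into the dictionary (sample data)
source = io.StringIO("Kontakt\nemail@domain.com\ndomain.com\n")

# Stand-in for the second file; only its first word is needed
other = io.StringIO("@domain.com some other words\n")
first_word = other.readline().split()[0]

print(first_line_with(first_word, source))  # 2
```

With real files, the io.StringIO objects would be replaced by open(...) calls inside a with block, since file objects can be iterated line by line in the same way.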


 