Python3 - Iterate over dictionary, find specific dynamic value
I have the dict below:
wordPos = {}
words = [...] #Removed for simplicity
for i, word in enumerate(words):
    wordPos[i] = ({word[5]: word[4]})
Which ultimately becomes:
>>> wordPos
{0: {1: 'Kontakt'},
1: {2: 'email@domain.com'},
2: {3: 'domain.com'}}
Now I am trying to search the above dictionary, and if the string/expression exists, it should return the "key" for the value.
So, for example:
string = "@domain.com"
if string in wordPos.values():
    print("The string: {}, exists in the dictionary. The key for this is: {}".format(string, key))
However, I am not sure how to search within a dictionary and return the key (of the value).
Furthermore, I am a bit unsure whether I need to use RegEx to do the actual matching.
I can see that I need to be more specific about what I am trying to do.
So basically, I am reading an entire file word by word and adding each word to a dictionary (together with the line number of that word), which gives me the following structure:
lineNumber:word
e.g.
1:'Kontakt'
Now what I am trying to do with this information is to open another file and get the first word of that file (in my example, the first word is @domain.com).
With this first word, I want to check if it exists in my dictionary (first occurrence). If it does, I want to return the line number. So in my example, for the word @domain.com, the line number that should be returned would be 2.
You can create a function like the one below. It returns the key of the first matching entry.
import re

input_dict = {
    0: {1: 'Kontakt'},
    1: {2: 'email@domain.com'},
    2: {3: 'domain.com'}
}

def search_word(regex):
    # Return the outer key of the first entry whose word matches the regex
    for k, v in input_dict.items():
        for _, v1 in v.items():
            # re.match only matches at the beginning of the word
            if re.match(regex, v1):
                return k

print(search_word('domain.com'))  # 2 (domain.com)
print(search_word(r'\w+@domain.com'))  # 1 (email@domain.com)
Output:
2
1
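Since the word in the question (@domain.com) is a literal string rather than a pattern, a variation that may fit better is to escape it and use re.search, which scans the whole word instead of only its start. The following is only a sketch under that assumption; search_literal is a hypothetical name, and it returns the inner key (the line number) rather than the outer key.

```python
import re

# Same sample data as in the answer above.
input_dict = {
    0: {1: 'Kontakt'},
    1: {2: 'email@domain.com'},
    2: {3: 'domain.com'},
}

def search_literal(text, data):
    """Return the first line number whose word contains `text`, else None."""
    pattern = re.compile(re.escape(text))  # escape '.' and other metacharacters
    for inner in data.values():
        for line_no, word in inner.items():
            if pattern.search(word):  # search() looks anywhere in the word
                return line_no
    return None

print(search_literal('@domain.com', input_dict))  # 2
print(search_literal('domainXcom', input_dict))   # None
```

Without re.escape, the unescaped dot in 'domain.com' would also match 'domainXcom'.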
If you really want to search a dictionary for a dynamic value, you need to iterate through the items, check to see if the values match, and return the key. There's no way to do it in a more pythonic way.
def find_key(wordPos, string):
    for key, value in wordPos.items():
        for inner_key, inner_value in value.items():
            # Compare the inner value (the word), not the inner dict
            if inner_value == string:
                return key
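The loop above compares for equality, which would not find "@domain.com" inside 'email@domain.com'. If a substring match is what is wanted, the same iteration can be written with the in operator. This is a minimal sketch; first_key_containing is a hypothetical helper name and the data is copied from the question.

```python
wordPos = {0: {1: 'Kontakt'}, 1: {2: 'email@domain.com'}, 2: {3: 'domain.com'}}

def first_key_containing(needle, mapping):
    """Return the first outer key whose inner value contains `needle`, else None."""
    return next(
        (key
         for key, inner in mapping.items()
         for value in inner.values()
         if needle in value),
        None,  # default when nothing matches
    )

print(first_key_containing('@domain.com', wordPos))  # 1
print(first_key_containing('missing', wordPos))      # None
```

next() stops at the first hit, so the rest of the dictionary is not scanned once a match is found.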
If you already have an array of words, why don't you just use the index method?
if string in words:
    print(f"The string: {string}, exists. The key for this is: {words.index(string)}")
If the string doesn't exist, it raises a ValueError, so you could avoid the if via:
try:
    print(f"The string: {string}, exists. The key for this is: {words.index(string)}")
except ValueError:
    pass
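One caveat: list.index and the in operator on a list both test whole elements for equality, so they only help when the search string is an entire word. A rough sketch of an index-like helper that matches substrings instead is shown here; index_containing and the sample list are assumptions for illustration.

```python
words = ['Kontakt', 'email@domain.com', 'domain.com']  # hypothetical sample list

def index_containing(needle, items):
    """Return the index of the first item containing `needle`; raise ValueError if none does."""
    for i, item in enumerate(items):
        if needle in item:
            return i
    raise ValueError(f"{needle!r} is not contained in any item")

print('@domain.com' in words)                  # False: no element equals the string
print(index_containing('@domain.com', words))  # 1
```

Raising ValueError mirrors the behaviour of list.index, so the try/except pattern above still applies.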
One possibility is to use Python's built-in sqlite3 module and an FTS5 full-text index:
import sqlite3
in_memory = sqlite3.connect(':memory:')
c = in_memory.cursor()
c.execute('CREATE VIRTUAL TABLE "ftsentry" USING FTS5 (line_no UNINDEXED, data, tokenize="unicode61 tokenchars \'.\'")')
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (1, 'Kontakt'))
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (2, 'email@domain.com'))
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (3, 'domain.com'))
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (4, 'domain@sample.com'))
l = [*c.execute('SELECT line_no, data FROM ftsentry WHERE data MATCH ? ORDER BY line_no ASC LIMIT 1', ('"@domain.com"', ))]
print(l)
l = [*c.execute('SELECT line_no, data FROM ftsentry WHERE data MATCH ?', ('"kontakt"', ))]
print(l)
Prints:
[(2, 'email@domain.com')]
[(1, 'Kontakt')]
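FTS5 is an optional SQLite extension, so it may not be available in every build. If it is missing, a plain table combined with SQLite's instr() function gives simple substring matching instead of full-text search. This is a sketch of that swapped-in approach; the table name entry and the helper first_line_containing are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE entry (line_no INTEGER, data TEXT)')
conn.executemany(
    'INSERT INTO entry VALUES (?, ?)',
    [(1, 'Kontakt'), (2, 'email@domain.com'), (3, 'domain.com'), (4, 'domain@sample.com')],
)

def first_line_containing(term):
    """Return (line_no, data) for the first row whose data contains `term`, or None."""
    return conn.execute(
        'SELECT line_no, data FROM entry WHERE instr(data, ?) > 0 '
        'ORDER BY line_no ASC LIMIT 1',
        (term,),
    ).fetchone()

print(first_line_containing('@domain.com'))  # (2, 'email@domain.com')
print(first_line_containing('kontakt'))      # None, because instr() is case-sensitive
```

Unlike the FTS5 query, this does a literal, case-sensitive scan of every row, so it trades indexing for simplicity.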
You need to iterate through the value of the value (which is rarely a good idea):
string = "@domain.com"
for key, word in wordPos.items():
    # dict_values is not indexable, so convert it to a list and take the first element
    if string in list(word.values())[0]:
        print("The string: {}, exists in the dictionary. The key for this is: {}".format(string, key))
Which is a terrible way of doing this. There are probably far better ways if you can just explain what your data looks like. :)
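For instance, if the data really is "line number: word" as the question describes, a flat mapping built straight from the file text avoids the nested dictionaries altogether. The sketch below assumes the file content is already in a string; text, words_by_line and first_line_with are hypothetical names.

```python
# Hypothetical file contents, one word per line for simplicity.
text = "Kontakt\nemail@domain.com\ndomain.com\n"

# Flat mapping: line number (starting at 1) -> list of words on that line.
words_by_line = {
    line_no: line.split()
    for line_no, line in enumerate(text.splitlines(), start=1)
}

def first_line_with(needle, mapping):
    """Return the lowest line number having a word that contains `needle`, else None."""
    for line_no in sorted(mapping):
        if any(needle in word for word in mapping[line_no]):
            return line_no
    return None

print(first_line_with('@domain.com', words_by_line))  # 2
```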