Python 3: iterate over a dictionary to find a specific dynamic value

Question

I have the following dictionary:


wordPos = {}
words = [...] #Removed for simplicity

for i, word in enumerate(words):
    wordPos[i] = {word[5]: word[4]}

It ends up as:

>>> wordPos
{0: {1: 'Kontakt'},
 1: {2: 'email@domain.com'}, 
 2: {3: 'domain.com'}}

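Now I am trying to search the dictionary above: if a string/expression exists in it, the corresponding "key" should be returned as the result.

So, for example: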

string = "@domain.com"

if string in wordPos.values():
   print("The string: {}, exists in the dictionary. The key for this is: {}".format(string, key))

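But I am not sure how to search the dictionary and return the key.

Also, I am not sure whether I need to use a RegEx for the actual matching.

Edit

I can see that I need to be more specific about what I am trying to do.

So basically, I read an entire file word by word and add each word to a dictionary (together with the line number of that word), which gives the following structure: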

lineNumber:word

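For example, 1:'Kontakt'

Now I am trying to use this information to open another file and get the first word of that file (in my example, the first word is @domain.com).

With this first word, I want to check whether it exists in my dictionary (first occurrence). If it does, I want to return the line number. So in my example, for the word @domain.com, the line number returned should be 2.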

Answer 1

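You can create a function like the one below. It returns the outer key of the first entry whose word matches. Note that re.match only matches at the start of the string.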

import re

input_dict = {
    0: {1: 'Kontakt'},
    1: {2: 'email@domain.com'},
    2: {3: 'domain.com'}
}

def search_word(regex):
    for k, v in input_dict.items():
        for _, v1 in v.items():
            if re.match(regex, v1):
                return k

print(search_word('domain.com')) # 2 (domain.com)
print(search_word(r'\w+@domain.com')) # 1 (email@domain.com)

Output:

2
1
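The question asks for the line number when the word merely contains "@domain.com", which re.match will not find because it anchors at the start. A minimal sketch of a variant, assuming a substring match is wanted and that the inner key is the line number; find_line_number is a hypothetical helper name, not part of the original answer:

```python
import re

input_dict = {
    0: {1: 'Kontakt'},
    1: {2: 'email@domain.com'},
    2: {3: 'domain.com'},
}

def find_line_number(data, text):
    """Return the inner key (line number) of the first word containing text."""
    # re.escape treats the text literally, so '.' matches only a dot
    pattern = re.compile(re.escape(text))
    for entry in data.values():
        for line_no, word in entry.items():
            # re.search scans the whole word, unlike re.match
            if pattern.search(word):
                return line_no
    return None  # no word contains the text

print(find_line_number(input_dict, '@domain.com'))  # 2
print(find_line_number(input_dict, 'missing'))      # None
```

This relies on dictionaries preserving insertion order (Python 3.7+), so "first" means the first entry that was added.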

Answer 2

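If you really want to search a dictionary for a dynamic value, you need to iterate over all the items, check whether the value matches, and then return the key. There is no more Pythonic way to do it.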

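def find_key(wordPos, string):
    # Return the outer key of the first entry whose inner value equals string
    for key, value in wordPos.items():
        for inner_key, inner_value in value.items():
            if inner_value == string:
                return key
    return None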
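If many lookups run against the same data, the loop only needs to happen once. A sketch of a reverse index, assuming exact matches are enough and that the inner key is the line number; build_first_occurrence is a hypothetical name used for illustration:

```python
wordPos = {
    0: {1: 'Kontakt'},
    1: {2: 'email@domain.com'},
    2: {3: 'domain.com'},
}

def build_first_occurrence(word_pos):
    """Map each word to the line number where it first appears."""
    first_line = {}
    for entry in word_pos.values():
        for line_no, word in entry.items():
            # setdefault keeps the earliest line number for repeated words
            first_line.setdefault(word, line_no)
    return first_line

first_line = build_first_occurrence(wordPos)
print(first_line.get('domain.com'))   # 3
print(first_line.get('@domain.com'))  # None (exact matches only)
```

The trade-off is that this only answers exact-match questions; a substring or regex search still has to scan the values.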

If you already have the list of words, why not just use the index method?

if string in words:
    print(f"The string: {string}, exists. The key for this is: {words.index(string)}")

If the string does not exist, index raises a ValueError, so you can avoid the if via:

try:
    print(f"The string: {string}, exists. The key for this is: {words.index(string)}")
except ValueError:
    pass

Answer 3

One possibility is to use Python's built-in sqlite3 module with an FTS5 full-text index:

import sqlite3

in_memory = sqlite3.connect(':memory:')
c = in_memory.cursor()
c.execute('CREATE VIRTUAL TABLE "ftsentry" USING FTS5 (line_no UNINDEXED, data, tokenize="unicode61 tokenchars \'.\'")')

c.execute("INSERT INTO ftsentry VALUES (?, ?)", (1, 'Kontakt'))
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (2, 'email@domain.com'))
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (3, 'domain.com'))
c.execute("INSERT INTO ftsentry VALUES (?, ?)", (4, 'domain@sample.com'))

l = [*c.execute('SELECT line_no, data FROM ftsentry WHERE data MATCH ? ORDER BY line_no ASC LIMIT 1', ('"@domain.com"', ))]
print(l)

l = [*c.execute('SELECT line_no, data FROM ftsentry WHERE data MATCH ?', ('"kontakt"', ))]
print(l)

Prints:

[(2, 'email@domain.com')]
[(1, 'Kontakt')]
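FTS5 is an optional compile-time feature of SQLite, so it may be missing from some builds. A sketch of a fallback, assuming a plain substring search is acceptable: it uses an ordinary table and SQLite's instr() function instead of a full-text index. The table name entry and the helper first_line are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE entry (line_no INTEGER, data TEXT)')
conn.executemany(
    'INSERT INTO entry VALUES (?, ?)',
    [(1, 'Kontakt'), (2, 'email@domain.com'),
     (3, 'domain.com'), (4, 'domain@sample.com')],
)

def first_line(conn, text):
    """Return the lowest line_no whose data contains text, or None."""
    row = conn.execute(
        # instr() returns the 1-based position of text in data, 0 if absent
        'SELECT line_no FROM entry WHERE instr(data, ?) > 0 '
        'ORDER BY line_no ASC LIMIT 1',
        (text,),
    ).fetchone()
    return row[0] if row else None

print(first_line(conn, '@domain.com'))  # 2
print(first_line(conn, 'kontakt'))      # None (instr is case-sensitive)
```

Unlike the FTS5 query, this scans every row and is case-sensitive, so it suits small inputs rather than large ones.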

Answer 4

You need to iterate over the values of the values (which is rarely a good idea):

string = "@domain.com"
for key, word in wordPos.items():
    # Convert the dict_values view to a list and take its first element
    if string in list(word.values())[0]:
        print("The string: {}, exists in the dictionary. The key for this is: {}".format(string, key))

This is a poor approach. If you can explain what the data you are getting looks like, there may be a better way. :)
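Going by the edit to the question (words read from a file, each with its line number), one alternative is to skip the nested dictionary and scan the lines directly. A sketch under those assumptions: words are separated by whitespace, line numbers start at 1, and first_line_containing is a hypothetical helper. io.StringIO stands in for the real file so the example runs on its own.

```python
import io

def first_line_containing(lines, text):
    """Return the 1-based number of the first line with a word containing text."""
    for line_no, line in enumerate(lines, start=1):
        # split() breaks the line into whitespace-separated words
        if any(text in word for word in line.split()):
            return line_no
    return None  # text was not found on any line

# Stand-in for open('some_file.txt'); any iterable of lines works
sample = io.StringIO("Kontakt\nemail@domain.com\ndomain.com\n")
print(first_line_containing(sample, '@domain.com'))  # 2
```

Because the function stops at the first hit, it never reads more of the file than it needs to.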

4 answers

Answer 1
1 accepted 2019-06-25 15:31:11

Answer 2
0 2019-06-25 15:30:06

Answer 3
0 2019-06-25 15:44:58

Answer 4
0 2019-06-25 15:46:43
