Pythonic way to walk a self-referential dictionary
I have a dictionary whose values can reference other entries by key. Each chain of references ends either when the current value is not a key in the dictionary or when "-" is encountered. The goal of this data structure is to find the ultimate parent of each entry and to transform "-" into None. For instance, take:
d = {'1': '-', '0': '6', '3': '1', '2': '3', '4': '5', '6': '9'}
My verbose solution is as follows:
d = {'1': '-', '0': '6', '3': '1', '2': '3', '4': '5', '6': '9'}
print(d)
for dis, rep in d.items():
    if rep == "-":
        d[dis] = None
        continue
    while rep in d:
        rep = d[rep]
        if rep == "-":
            d[dis] = None
            break
    else:
        d[dis] = rep
print(d)
The output is:
{'1': '-', '0': '6', '3': '1', '2': '3', '4': '5', '6': '9'}
{'1': None, '0': '9', '3': None, '2': None, '4': '5', '6': '9'}
The result is correct. The "1" element has no parent, and the "2" and "3" elements point back to "1", so they should have no parent either.
Is there a terser, more Pythonic way to accomplish this using Python 3+?
To "walk" the dictionary, just do the lookups in a loop until the current value is no longer a key:
>>> def walk(d, val):
...     while val in d:
...         val = d[val]
...     return None if val == '-' else val
...
>>> d = {'1': '-', '0': '6', '3': '1', '2': '3', '4': '5', '6': '9'}
>>> print({k: walk(d, k) for k in d})
{'1': None, '0': '9', '3': None, '2': None, '4': '5', '6': '9'}
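The loop above assumes every chain eventually leaves the dictionary. If the data could contain a cycle (say 'a' pointing to 'b' and 'b' back to 'a'), it would never terminate. Below is a minimal sketch of a guarded variant. The name walk_safe and the choice to report a cycle as None are assumptions made for illustration, not part of the answer above.

```python
def walk_safe(d, val):
    """Follow val through d until it is no longer a key.

    Returns None for '-' and, by assumption, also for a cycle.
    """
    seen = set()
    while val in d:
        if val in seen:
            # Assumption: an entry caught in a cycle has no parent.
            return None
        seen.add(val)
        val = d[val]
    return None if val == '-' else val


d = {'1': '-', '0': '6', '3': '1', '2': '3', '4': '5', '6': '9'}
resolved = {k: walk_safe(d, k) for k in d}

# Hypothetical input with a cycle between 'a' and 'b'.
cyclic = {'a': 'b', 'b': 'a', 'c': '-'}
resolved_cyclic = {k: walk_safe(cyclic, k) for k in cyclic}
```

The seen set costs extra memory proportional to the longest chain, so it is only worth adding when cycles are actually possible in the input.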
You can define a function like this:
def recursive_get(d, k):
    v = d[k]
    if v == '-':
        v = d[k] = None
    elif v in d:
        v = d[k] = recursive_get(d, v)
    return v
When you use recursive_get to access a key, it modifies the values as it traverses the chain. This means you don't waste time collapsing branches that are never needed:
>>> d = {'1': '-', '3': '1', '2': '3'}
>>> recursive_get(d, '3')
>>> d
{'1': None, '3': None, '2': '3'} # didn't need to visit '2'
>>> d = {'1': '-', '3': '1', '2': '3'}
>>> recursive_get(d, '2')
>>> d
{'1': None, '3': None, '2': None}
If you just want to force d into its final state, simply loop through all the keys:
for k in d:
    recursive_get(d, k)
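One caveat with the recursive version: a reference chain longer than Python's recursion limit (1000 frames by default) raises RecursionError. The same write-back idea can be sketched without recursion by recording the keys visited on the way to the end of the chain and then updating them all. The name resolve_in_place is an assumption for illustration, and the sketch assumes chains contain no cycles.

```python
def resolve_in_place(d, sentinel='-'):
    """Resolve every entry of d in place, without recursion.

    Assumes every chain terminates (no cycles).
    """
    for start in list(d):
        # Follow the chain from start, remembering each key visited.
        path = []
        val = start
        while val in d:
            path.append(val)
            val = d[val]
        # val is now the end of the chain: the sentinel, an already
        # resolved None, or a value that is not a key.
        root = None if val == sentinel else val
        # Point every key on the path straight at the result.
        for key in path:
            d[key] = root
    return d


d = {'1': '-', '0': '6', '3': '1', '2': '3', '4': '5', '6': '9'}
resolve_in_place(d)
```

Because each visited key is rewritten to its final value, later walks that reach it stop after one lookup, which is the same saving the recursive version gets.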
I wanted to post some profiling statistics on the three approaches so far:
Running original procedural solution.
         5 function calls in 0.221 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.221    0.221 <string>:1(<module>)
        1    0.221    0.221    0.221    0.221 test.py:12(verbose)
        1    0.000    0.000    0.221    0.221 {built-in method exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}

885213

Running recursive solution.
         994022 function calls in 1.252 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.252    1.252 <string>:1(<module>)
   994018    0.632    0.000    0.632    0.000 test.py:27(recursive)
        1    0.620    0.620    1.252    1.252 test.py:35(do_recursive)
        1    0.000    0.000    1.252    1.252 {built-in method exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

885213

Running dict comprehension solution.
         994023 function calls in 1.665 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.059    0.059    1.665    1.665 <string>:1(<module>)
   994018    0.683    0.000    0.683    0.000 test.py:40(walk)
        1    0.000    0.000    1.606    1.606 test.py:45(dict_comprehension)
        1    0.923    0.923    1.606    1.606 test.py:46(<dictcomp>)
        1    0.000    0.000    1.665    1.665 {built-in method exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

885213
Below is the code to run the three approaches:
import cProfile
import csv
import gzip


def gzip_to_text(gzip_file, encoding="ascii"):
    # Yield the decoded lines of a gzip-compressed text file.
    with gzip.open(gzip_file) as gzf:
        for line in gzf:
            yield str(line, encoding)


def verbose(d):
    for dis, rep in d.items():
        if rep == "-":
            d[dis] = None
            continue
        while rep in d:
            rep = d[rep]
            if rep == "-":
                d[dis] = None
                break
        else:
            d[dis] = rep
    return d


def recursive(d, k):
    v = d[k]
    if v == '-':
        v = d[k] = None
    elif v in d:
        v = d[k] = recursive(d, v)
    return v


def do_recursive(d):
    for k in d:
        recursive(d, k)
    return d


def walk(d, val):
    while val in d:
        val = d[val]
    return None if val == '-' else val


def dict_comprehension(d):
    return {k : walk(d, k) for k in d}


# public dataset pulled from url: ftp://ftp.ncbi.nih.gov/gene/DATA/gene_history.gz
csvr = csv.reader(gzip_to_text("gene_history.gz"), delimiter="\t", quotechar="\"")
# Skip the header row; map column 2 to column 1.
d = {rec[2].strip() : rec[1].strip() for rec in csvr if csvr.line_num > 1}

print("Running original procedural solution.")
cProfile.run('d = verbose(d)')
# Count the entries that resolved to None.
c = 0
for k, v in d.items():
    c += (1 if v is None else 0)
print(c)

print("Running recursive solution.")
cProfile.run('d = do_recursive(d)')
c = 0
for k, v in d.items():
    c += (1 if v is None else 0)
print(c)

print("Running dict comprehension solution.")
cProfile.run('d = dict_comprehension(d)')
c = 0
for k, v in d.items():
    c += (1 if v is None else 0)
print(c)
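Note that the script above rebinds d to the result of each run, and verbose resolves the dictionary in place, so the recursive and dict comprehension runs receive data that is already resolved. cProfile also adds overhead to every function call, which weighs more on the approaches that make one call per key. A rough sketch of a comparison that gives each approach its own unresolved copy and uses time.perf_counter is shown below. The helpers make_sample and best_time and the synthetic data are assumptions for illustration, not the gene_history dataset, and no timings are claimed here.

```python
import time


def verbose(d):
    for dis, rep in d.items():
        if rep == "-":
            d[dis] = None
            continue
        while rep in d:
            rep = d[rep]
            if rep == "-":
                d[dis] = None
                break
        else:
            d[dis] = rep
    return d


def recursive(d, k):
    v = d[k]
    if v == '-':
        v = d[k] = None
    elif v in d:
        v = d[k] = recursive(d, v)
    return v


def do_recursive(d):
    for k in d:
        recursive(d, k)
    return d


def walk(d, val):
    while val in d:
        val = d[val]
    return None if val == '-' else val


def dict_comprehension(d):
    return {k: walk(d, k) for k in d}


def make_sample(n_chains=1000, chain_len=20):
    """Hypothetical synthetic input: independent chains of references."""
    d = {}
    for c in range(n_chains):
        # Even-numbered chains end in "-", odd ones in an id that is not a key.
        end = "-" if c % 2 == 0 else f"root{c}"
        for i in range(chain_len):
            d[f"{c}:{i}"] = end if i == 0 else f"{c}:{i - 1}"
    return d


def best_time(func, data, repeat=3):
    """Best wall-clock time of func over fresh, unresolved copies of data."""
    best = float("inf")
    for _ in range(repeat):
        fresh = dict(data)  # copied outside the timed region
        start = time.perf_counter()
        func(fresh)
        best = min(best, time.perf_counter() - start)
    return best


sample = make_sample()
for func in (verbose, do_recursive, dict_comprehension):
    print(f"{func.__name__}: {best_time(func, sample):.4f} s")
```

The shape of the data (chain length, share of chains ending in "-") affects the ranking, so the numbers are only meaningful on input resembling the real dataset.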