
Python: Build a dictionary from a file's contents

Say that I have a file of names and values with entries like this:

lasker:22,45,77,101
kramnik:45,22,15,105

What's the most Pythonic way to get them into a dictionary with the name as the key and the values as a list like this:

{ 'lasker': (22,45,77,101), 'kramnik': (45,22,15,105) }

EDIT

Also, is there any way to iterate through them in the order I read them from the file, or would this require a different data structure?

I think it is pretty clear how this code works:

def get_entries(infile):
    with open(infile, 'rt') as f:
        for line in f:
            # split only on the first ':' so the name and the numbers separate cleanly
            name, nums = line.split(':', 1)
            yield name, tuple(int(x) for x in nums.split(','))

# dict takes a sequence of `(key, value)` pairs and turns it into a dict
print(dict(get_entries(infile)))

Writing a generator that yields pairs and passing it to dict is an extremely useful pattern.
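As a minimal, self-contained sketch of the same pattern (written for Python 3; the helper name `parse_entries` and the blank-line handling are illustrative assumptions, not part of the code above), the generator can accept any iterable of lines, so it can be tried without a file on disk:

```python
import io


def parse_entries(lines):
    """Yield (name, values) pairs from lines like 'lasker:22,45,77,101'."""
    for line in lines:
        line = line.strip()
        if not line:
            # Assumption: blank lines are skipped rather than treated as errors.
            continue
        name, nums = line.split(':', 1)
        yield name, tuple(int(x) for x in nums.split(','))


# io.StringIO stands in for an open file here.
sample = io.StringIO("lasker:22,45,77,101\nkramnik:45,22,15,105\n")
entries = dict(parse_entries(sample))
print(entries)
```

Because the generator only needs an iterable of strings, the same function works unchanged with an open file object, a list of lines, or `sys.stdin`.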

If you just want to iterate over the pairs, you can do this directly:

for name, nums in get_entries(infile):
    print(name, nums)

If you need dict access later as well as ordering, you can simply replace the dict with an OrderedDict:

from collections import OrderedDict
print(OrderedDict(get_entries(infile)))
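For what it's worth, since Python 3.7 the built-in dict is also guaranteed to preserve insertion order, so OrderedDict is mostly needed on older versions or for its order-aware extras. A small sketch, using hard-coded pairs in place of the file (the `pairs` list is an assumption for illustration):

```python
from collections import OrderedDict

# Hard-coded pairs stand in for what the generator would yield from a file.
pairs = [('lasker', (22, 45, 77, 101)), ('kramnik', (45, 22, 15, 105))]

ordered = OrderedDict(pairs)
plain = dict(pairs)

# Both iterate in insertion order on Python 3.7+.
print(list(ordered))
print(list(plain))

# OrderedDict additionally offers order-aware operations such as move_to_end.
ordered.move_to_end('lasker')
print(list(ordered))
```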

With a regex, there is no need to care about line boundaries:

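import re

# raw string, so the backslashes reach the regex engine unchanged
pat = re.compile(r'([a-z]+)\s*:\s*(\d+(?:\s*,\s*\d+)*)')

with open('rara.txt') as f:
    # int() tolerates the whitespace left around each number by split(',')
    dic = dict((ma.group(1), [int(n) for n in ma.group(2).split(',')])
               for ma in pat.finditer(f.read()))

print(dic)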

Tested with the following text in the 'rara.txt' file:

lasker :  22,45,  77,101 kramnik:888 ,22,15,105  kramniu :45,22,    3433,105 6765433 laskooo:22,45, 77 , 101  kooni:
45, 78 45kramndde:45,334 ,15,105 tasku: 22,45  ,7,101 krammma:  1105oberon glomo:22, 3478,77 ,101 draumnik:45,105 
toyku:22,45,7,101 solo
   ytrmmma:1105oberon radabidadada lftyker:22,3478,7,101

Result

{'laskooo': [22, 45, 77, 101], 'tasku': [22, 45, 7, 101], 'krammma': [1105], 'glomo': [22, 3478, 77, 101], 'kramniu': [45, 22, 3433, 105], 'kooni': [45, 78], 'lftyker': [22, 3478, 7, 101], 'toyku': [22, 45, 7, 101], 'kramnik': [888, 22, 15, 105], 'draumnik': [45, 105], 'ytrmmma': [1105], 'lasker': [22, 45, 77, 101], 'kramndde': [45, 334, 15, 105]}

EDIT: I modified the regex pattern (added \s*) and the text of the 'rara.txt' file.
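To see how the pattern copes with stray whitespace and line breaks without creating a file, here is a short sketch for Python 3 that runs it on an in-memory string. The sample `text` is made up for illustration and is much shorter than the test file above:

```python
import re

# Same pattern as above, written as a raw string.
pat = re.compile(r'([a-z]+)\s*:\s*(\d+(?:\s*,\s*\d+)*)')

# A short, made-up sample; the newline after 'kooni:' is absorbed by \s*.
text = "lasker :  22,45,  77,101 kramnik:888 ,22\nkooni:\n45, 78 solo"

# int() ignores the whitespace that split(',') leaves around each number.
result = {m.group(1): [int(n) for n in m.group(2).split(',')]
          for m in pat.finditer(text)}
print(result)
```

Tokens that do not match the `name:numbers` shape, such as the trailing `solo`, are simply skipped rather than raising an error, which is the main trade-off of this approach compared with splitting each line.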
