
Parse key value pairs in a text file

I am new to Python and I am looking for a way to parse a .txt file. My .txt file is a namelist with computation information such as:

myfile.txt

var0 = 16
var1 = 1.12434E10
var2 = -1.923E-3
var3 = 920

How can I read the values and put them in myvar0, myvar1, myvar2, myvar3 in Python?

I suggest storing the values in a dictionary instead of in separate local variables:

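myvars = {}
with open("namelist.txt") as myfile:
    for line in myfile:
        if "=" not in line:
            continue  # skip blank lines, which float() would reject
        # partition returns (name, "=", value); [::2] keeps name and value
        name, var = line.partition("=")[::2]
        myvars[name.strip()] = float(var)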

Now access them as myvars["var1"]. If the names are all valid Python variable names, you can put this below:

names = type("Names", (object,), myvars)  # the bases argument must be a tuple

and access the values as e.g. names.var1.
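If attribute access is all that is needed, `types.SimpleNamespace` from the standard library is an alternative to building a class with `type`. This is a minimal sketch, and the literal dictionary below is only a stand-in for the `myvars` built from the file:

```python
from types import SimpleNamespace

# Stand-in for the dictionary parsed from the file (values are illustrative).
myvars = {"var0": 16.0, "var1": 1.12434e10, "var2": -1.923e-3, "var3": 920.0}

# Each key becomes an attribute; keys must be valid Python identifiers.
names = SimpleNamespace(**myvars)
print(names.var1)
```

Unlike the class created with `type`, the namespace is a plain instance, so its attributes can be listed with `vars(names)`.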

I personally solved this by creating a .py file that just contains all the parameters as variables, then did:

from PARAMETERS import *  # Python has no "include"; a star import is the closest equivalent

in the program modules that need the parameters.

It's a bit ugly, but VERY simple and easy to work with.

As @kev suggests, the configparser module is the way to go.
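A rough sketch of the configparser route follows. Note that configparser expects a section header, which the file in the question does not have, so this example prepends a dummy one. The section name `namelist` and the inline sample text are assumptions made for illustration:

```python
import configparser

# Sample content matching the question's file; in practice read it from disk.
text = """var0 = 16
var1 = 1.12434E10
var2 = -1.923E-3
var3 = 920
"""

parser = configparser.ConfigParser()
# configparser requires a section header, so a dummy one is prepended.
# The section name "namelist" is an arbitrary choice for this sketch.
parser.read_string("[namelist]\n" + text)

# getfloat converts each value; option names are lower-cased by default.
myvars = {key: parser.getfloat("namelist", key) for key in parser["namelist"]}
print(myvars["var1"])
```

The benefit over hand-written splitting is that comments and whitespace handling come for free.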

However, in some scenarios a somewhat ugly (I admit) but very simple and effective way to do this is to rename myfile.txt to myfile.py and do a from myfile import * (after you fix the typo var 0 -> var0).

However, this is very insecure, so if the file comes from an external source or can be written by a malicious attacker, use something that validates the data instead of executing it blindly.
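One possible way to validate rather than execute is `ast.literal_eval`, which accepts only Python literals (numbers, strings, and so on) and rejects function calls. This is a sketch under that assumption; the helper names `parse_value` and `parse_lines` and the sample lines are hypothetical:

```python
import ast

def parse_value(text):
    """Return a Python literal (int, float, str, ...) without executing code."""
    try:
        return ast.literal_eval(text.strip())
    except (ValueError, SyntaxError):
        # Not a literal: keep the raw string rather than evaluating it.
        return text.strip()

def parse_lines(lines):
    result = {}
    for line in lines:
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines and lines without "="
        result[name.strip()] = parse_value(value)
    return result

# The last line is a deliberately hostile value; it is stored as text, not run.
sample = ["var0 = 16", "var1 = 1.12434E10", "var2 = -1.923E-3",
          "danger = __import__('os')"]
parsed = parse_lines(sample)
print(parsed)
```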

If there are multiple comma-separated values on a single line, here's code to parse that out:

    # args is the input string, e.g. "var0=16, var1=1.12434E10"
    res = {}

    pairs = args.split(", ")
    for p in pairs:
        var, val = p.split("=")
        res[var] = val
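The snippet above assumes that items are separated by exactly `", "` and that there are no spaces around `=`. A slightly more tolerant sketch wraps the same idea in a function and strips whitespace. The function name `parse_pairs` and the sample string are made up for illustration:

```python
def parse_pairs(args):
    """Parse a string like "a=1, b = 2" into a dict of stripped strings."""
    res = {}
    for pair in args.split(","):
        if not pair.strip():
            continue  # ignore empty items such as a trailing comma
        # partition splits on the first "=" only, so values may contain "="
        var, _, val = pair.partition("=")
        res[var.strip()] = val.strip()
    return res

print(parse_pairs("var0=16, var1 = 1.12434E10,var2=-1.923E-3"))
```

The values stay strings here; convert them with `float` if numbers are needed.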

Use pandas.read_csv when the file format becomes more fancy (like comments).

import pandas
from io import StringIO

val = '''var0 = 16
var1 = 1.12434E10
var2 = -1.923E-3
var3 = 920'''
print(pandas.read_csv(StringIO(val), # or read_csv('myfile.txt',
            sep=r'\s*=\s*',          # regex separator: "=" with optional spaces
            engine='python',         # regex separators need the Python engine
            header=None,
            names=['key','value'],
            dtype=dict(key=object,value=object), # or float for the value column
            index_col='key').to_dict()['value'])
# prints {'var0': '16', 'var1': '1.12434E10', 'var2': '-1.923E-3', 'var3': '920'}

Dict comprehensions (PEP 274) can be used for a shorter expression (60 characters):

d = {k:float(v) for k, v in (l.split('=') for l in open(f))}

EDIT: shortened from 72 to 60 characters thanks to @jmb's suggestion (avoid .readlines()).
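The one-liner keeps whatever whitespace surrounds the key (so the key is `'var0 '` for a line like `var0 = 16`), fails on blank lines, and never closes the file. A slightly longer sketch that addresses these points is shown here. The function name `load_pairs` is hypothetical, and `io.StringIO` stands in for a real file:

```python
import io

def load_pairs(lines):
    # Split on the first "=" only, strip the key, and skip lines without "=".
    return {k.strip(): float(v)
            for k, v in (l.split("=", 1) for l in lines if "=" in l)}

# io.StringIO stands in for an open file; with a real file, use
# `with open(path) as fh: d = load_pairs(fh)` so that the file gets closed.
d = load_pairs(io.StringIO("var0 = 16\nvar1 = 1.12434E10\n\nvar3 = 920\n"))
print(d)
```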

Similar to @lauritz-v-thaulow's answer, but reading line by line into a variable.

Here is a simple copy-paste example so you can understand a bit more. Note that the config file has to be in a specific format.

import os

# Example creating a valid temp test file to get a better result.
MY_CONFIG = os.path.expanduser('~/.test_credentials')
with open(MY_CONFIG, "w") as f:
    f.write("API_SECRET_KEY=123456789")
    f.write("\n")  # text mode translates "\n" to the platform line ending
    f.write("API_SECRET_CONTENT=12345678")

myvars = {}
with open(MY_CONFIG, "r") as myfile:
    for line in myfile:
        line = line.strip()
        name, var = line.partition("=")[::2]
        myvars[name.strip()] = str(var)

# Iterate thru all the items created.
for k, v in myvars.items():
    print("{} | {}".format(k, v))

# API_SECRET_KEY | 123456789
# API_SECRET_CONTENT | 12345678

# Access the API_SECRET_KEY item directly
print("{}".format(myvars['API_SECRET_KEY']))

# 123456789

# Access the API_SECRET_CONTENT item directly
print("{}".format(myvars['API_SECRET_CONTENT']))

# 12345678
