Read a text file in Python and extract a specific value from each line?
I have a text file in which each line looks like this:
n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf
n:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf
I need to read each line and extract psnr_y and its value into a matrix. Does Python have any functions for reading a text file? I have MATLAB code for this, but I need Python code and I am not familiar with Python's functions. Could you please help me with this issue? This is the MATLAB code:
opt = {'Delimiter',{':',' '}};
fid = fopen('data.txt','rt');
nmc = nnz(fgetl(fid)==':');
frewind(fid);
fmt = repmat('%s%f',1,nmc);
tmp = textscan(fid,fmt,opt{:});
fclose(fid);
fnm = [tmp{:,1:2:end}];
out = cell2struct(tmp(:,2:2:end),fnm(1,:),2)
Use the regular expression
r'psnr_y:([\d.]+)'
on each line you read, and extract
match.group(1)
from the result. If needed, convert it to a float:
float(match.group(1))
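Put together, those steps might look like the following minimal sketch. The helper name `extract_psnr_y` and the in-memory `sample` list are illustrative assumptions, not part of the question; for a real file, an open file object can be passed in place of the list, since iterating over it also yields lines.

```python
import re

# Capture the number that follows "psnr_y:".
PSNR_Y = re.compile(r'psnr_y:([\d.]+)')

def extract_psnr_y(lines):
    """Return the psnr_y value of each line as a float (hypothetical helper)."""
    values = []
    for line in lines:
        match = PSNR_Y.search(line)
        if match:  # skip lines that have no numeric psnr_y field
            values.append(float(match.group(1)))
    return values

# The two sample lines from the question.
sample = [
    "n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf",
    "n:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf",
]
print(extract_psnr_y(sample))  # [37.1, 35.51]
```

Note that `[\d.]+` only matches digits and dots, so a field whose value is `inf` (as psnr_u is here) would not be captured by this pattern.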
Since I hate regex, I would suggest:
s = 'n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf \nn:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf'
lst = s.split('\n')
out = []
for line in lst:
    # Position of the key, then of the first space after it
    psnr_y_pos = line.index('psnr_y:')
    next_key = line[psnr_y_pos:].index(' ')
    # Skip the 7 characters of 'psnr_y:' to keep only the value
    psnr_y = line[psnr_y_pos+7:psnr_y_pos+next_key]
    out.append(psnr_y)
print(out)
out is a list of the psnr_y values (as strings), one per line.
You can use a regex like the one below:
import re
with open('textfile.txt') as f:
    a = f.readlines()
pattern = r'psnr_y:([\d.]+)'
for line in a:
    print(re.search(pattern, line)[1])
This code prints only the value of psnr_y. You can replace [1] with [0] to get the full match, such as "psnr_y:37.10". If you want to collect the values in a list, the code would look like this:
import re
a_list = []
with open('textfile.txt') as f:
    a = f.readlines()
pattern = r'psnr_y:([\d.]+)'
for line in a:
    a_list.append(re.search(pattern, line)[1])
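One caveat with indexing the result directly: `re.search` returns `None` when a line has no match (a blank trailing line, for example), and `None[1]` raises a `TypeError`. The sketch below is one possible way to guard against that while also returning floats. The helper name `extract_field` and the choice of `\S+` as the value pattern are assumptions made for illustration; `\S+` is used so that values such as `inf` are captured too.

```python
import re

def extract_field(lines, key):
    """Collect the float value of `key` from every line that has it (hypothetical helper)."""
    # \b keeps "y" from matching inside "psnr_y"; \S+ takes everything up to the next space.
    pattern = re.compile(rf'\b{re.escape(key)}:(\S+)')
    values = []
    for line in lines:
        match = pattern.search(line)
        if match is None:
            continue  # no such field on this line; indexing None would raise TypeError
        values.append(float(match[1]))  # float() also accepts the string "inf"
    return values

sample = [
    "n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf",
    "n:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf",
    "",  # a blank line, as often found at the end of a file
]
print(extract_field(sample, "psnr_y"))  # [37.1, 35.51]
print(extract_field(sample, "psnr_u"))  # [inf, inf]
```

If a value is not a valid number, `float()` raises a `ValueError`; whether to skip or report such lines depends on the data.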
For a simple answer with no need to import additional modules, you could try:
rows = []
with open("my_file", "r") as f:
    for row in f.readlines():
        value_pairs = row.strip().split(" ")
        print(value_pairs)
        values = {pair.split(":")[0]: pair.split(":")[1] for pair in value_pairs}
        print(values["psnr_y"])
        rows.append(values)
print(rows)
This gives you a list of dictionaries (basically a JSON structure, but with Python objects). This probably won't be the fastest solution, but the structure is nice and you don't have to use regex.
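The dictionaries above hold every value as a string. If numbers are needed, a variation along the following lines could convert them while parsing. This is only a sketch: `parse_line` is a hypothetical helper name, and the sample lines are held in a list here rather than read from a file.

```python
def parse_line(line):
    """Turn 'key:value key:value ...' into a dict of floats (hypothetical helper)."""
    row = {}
    for pair in line.split():  # split() with no argument tolerates repeated spaces
        key, _, value = pair.partition(":")
        row[key] = float(value)  # float("inf") is valid, so psnr_u and psnr_v work too
    return row

lines = [
    "n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf",
    "n:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf",
]
rows = [parse_line(line) for line in lines if line.strip()]  # ignore blank lines
psnr_y = [row["psnr_y"] for row in rows]  # one column, as a plain list
print(psnr_y)  # [37.1, 35.51]
```

With every field stored as a float, any other column (for example `mse_avg`) can be pulled out the same way.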
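import fileinput
import re
# fileinput reads from the files named on the command line, or from stdin
for line in fileinput.input():
    # Raw string so that \S is not treated as a string escape
    row = dict([s.split(':') for s in re.findall(r'[\S]+:[\S]+', line)])
    print(row['psnr_y'])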
To verify:
python script_name.py < /path/to/your/dataset.txt