Read a text file in Python and extract a specific value from each line

Question

I have a text file in which each line looks like this:

 n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf 
 n:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf

I need to read each line and extract psnr_y and its value into a matrix. Does Python have any other functions for reading a text file? I have MATLAB code for this, but I need Python code and I am not familiar with Python's functions. Could you please help me with this? This is the MATLAB code:

opt = {'Delimiter',{':',' '}};
fid = fopen('data.txt','rt');
nmc = nnz(fgetl(fid)==':');
frewind(fid);
fmt = repmat('%s%f',1,nmc);
tmp = textscan(fid,fmt,opt{:});
fclose(fid);
fnm = [tmp{:,1:2:end}];
out = cell2struct(tmp(:,2:2:end),fnm(1,:),2)

Answer 1

Use the regular expression

r'psnr_y:([\d.]+)'

on each line you read, and extract match.group(1) from the result.

If needed, convert it to a float: float(match.group(1))
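
The steps above can be put together roughly as follows. This is only a sketch: the helper name extract_psnr_y and the shortened sample lines are made up for illustration, and skipping lines without a numeric psnr_y is an assumption about the desired behavior. Note that this pattern does not match a value of inf.

```python
import re

# "psnr_y:" followed by digits and dots; the number is captured in group 1.
PATTERN = re.compile(r'psnr_y:([\d.]+)')

def extract_psnr_y(lines):
    """Return the psnr_y value of each line as a float (hypothetical helper)."""
    values = []
    for line in lines:
        match = PATTERN.search(line)
        if match:  # assumption: lines without a numeric psnr_y are skipped
            values.append(float(match.group(1)))
    return values

# Shortened sample lines in the format from the question.
sample = [
    ' n:1 mse_avg:8.46 mse_y:12.69 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf ',
    ' n:2 mse_avg:12.20 mse_y:18.30 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf',
]
print(extract_psnr_y(sample))  # [37.1, 35.51]
```

To use it on a file, pass the open file object instead of the list, since iterating over a file yields its lines.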

Answer 2

Since I hate regex, I would suggest:

s = 'n:1 mse_avg:8.46 mse_y:12.69 mse_u:0.00 mse_v:0.00 psnr_avg:38.86 psnr_y:37.10 psnr_u:inf psnr_v:inf \nn:2 mse_avg:12.20 mse_y:18.30 mse_u:0.00 mse_v:0.00 psnr_avg:37.27 psnr_y:35.51 psnr_u:inf psnr_v:inf' 
lst = s.split('\n')
out = []
for line in lst:
  psnr_y_pos = line.index('psnr_y:')
  next_key = line[psnr_y_pos:].index(' ')
  psnr_y = line[psnr_y_pos+7:psnr_y_pos+next_key]
  out.append(psnr_y)
print(out)

out is a list of the psnr_y values from each line, as strings.
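
One caveat with the loop above: it relies on a space following the value, so str.index raises ValueError if psnr_y is the last field on a line or is missing entirely. A possible regex-free variant that avoids this is sketched below; the helper name psnr_y_of and the shortened sample string are assumptions for illustration.

```python
def psnr_y_of(line, key='psnr_y:'):
    """Return the text after `key` in the first matching token, or None."""
    for token in line.split():       # split on any run of whitespace
        if token.startswith(key):
            return token[len(key):]  # everything after "psnr_y:"
    return None                      # key not present on this line

# Shortened sample where psnr_y is the last field on each line.
s = ('n:1 mse_avg:8.46 psnr_avg:38.86 psnr_y:37.10\n'
     'n:2 mse_avg:12.20 psnr_avg:37.27 psnr_y:35.51')
out = [psnr_y_of(line) for line in s.split('\n')]
print(out)  # ['37.10', '35.51']
```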

Answer 3

You can use regex like below:

import re

with open('textfile.txt') as f:
    a = f.readlines()
    pattern = r'psnr_y:([\d.]+)'
    for line in a:
        print(re.search(pattern, line)[1])

This code prints only the value of psnr_y. You can replace [1] with [0] to get the full matched string, such as "psnr_y:37.10". If you want to collect the values in a list, the code would look like this:

import re

a_list = []

with open('textfile.txt') as f:
    a = f.readlines()
    pattern = r'psnr_y:([\d.]+)'
    for line in a:
        a_list.append(re.search(pattern, line)[1])

Answer 4

For a simple answer with no need to import additional modules, you could try:

rows = []
with open("my_file", "r") as f:
    for row in f.readlines():
        value_pairs = row.strip().split(" ")
        print(value_pairs)
        values = {pair.split(":")[0]: pair.split(":")[1] for pair in value_pairs}
        print(values["psnr_y"])
        rows.append(values)

print(rows)

This gives you a list of dictionaries (basically a JSON-like structure, but with Python objects). It probably won't be the fastest solution, but the structure is nice and you don't have to use regex.
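
The values in these dictionaries are still strings. If numbers are needed, a variant along the following lines converts them while parsing. It is a sketch under a few assumptions: the helper name parse_line is hypothetical, every value is assumed to be something float() accepts (which includes "inf"), and blank lines are skipped.

```python
def parse_line(line):
    """Turn 'key:value key:value ...' into a dict of floats (hypothetical helper)."""
    row = {}
    for pair in line.split():        # split() ignores leading/trailing spaces
        key, _, value = pair.partition(':')
        row[key] = float(value)      # float('inf') is valid in Python
    return row

# Shortened sample lines, including a blank one.
lines = [
    ' n:1 mse_avg:8.46 mse_y:12.69 psnr_y:37.10 psnr_u:inf ',
    '',
    ' n:2 mse_avg:12.20 mse_y:18.30 psnr_y:35.51 psnr_u:inf',
]
rows = [parse_line(line) for line in lines if line.strip()]  # skip blank lines
psnr_y = [row['psnr_y'] for row in rows]
print(psnr_y)  # [37.1, 35.51]
```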

Answer 5

import fileinput
import re

for line in fileinput.input():
    # Raw string so that \S is not treated as an invalid string escape.
    row = dict(s.split(':') for s in re.findall(r'\S+:\S+', line))
    print(row['psnr_y'])

To verify, run:

python script_name.py < /path/to/your/dataset.txt

5 answers

solution1
1 2021-05-04 15:37:06

solution2
1 2021-05-04 15:39:23

solution3
1 ACCEPTED 2021-05-04 15:43:31

solution4
1 2021-05-04 15:59:21

solution5
0 2021-05-04 15:38:36
