Read contents of a text file into a Dictionary in Python

I have a text file (img.txt) and data in it is like:

0 0.288281 0.618056 0.080729 0.473148
5 0.229427 0.604167 0.030729 0.039815
0 0.554427 0.024537 0.020313 0.041667
0 0.547135 0.018981 0.020313 0.034259

So I want to create a dictionary with the .txt file name as the key and all the rows as values, something like:

my_dict = {'img.txt': [{'class': 0, 'x': 0.288281, 'y': 0.618056, 'height': 0.080729, 'width': 0.473148},
                       {'class': 5, 'x': 0.229427, 'y': 0.604167, 'height': 0.030729, 'width': 0.039815}]}

Is there a way to add keys for the values (like class, x, y, etc.)? Also, for some reason, my code seems to ignore the class values (0, 5, etc.) while reading the file. Here is my code:

import os
list_of_files = os.listdir('C:/Users/Lenovo/annotation/')
count =0
my_dict = {}
for file in list_of_files:
    if count < 20:
        with open(file) as f:
            items = [i.strip() for i in f.read().split(" ")]
            my_dict[file.replace(".txt", " ")] = items
    else:
        break
    count = count+1
print(my_dict)

Here is my output:

{'img_ano (1) ': ['0', '0.288281', '0.618056', '0.080729', '0.473148\n5', '0.229427', '0.604167', '0.030729', '0.039815\n0', '0.554427', '0.024537', '0.020313', '0.041667\n0', '0.547135', '0.018981', '0.020313', '0.034259\n4', '0.533073', '0.488889', '0.022396', '0.077778\n4', '0.630469', '0.375926', '0.017188', '0.075926\n4', '0.132031', '0.431944', '0.019271', '0.065741\n4', '0.802083', '0.191204', '0.013542', '0.037963\n4', '0.823958', '0.175000', '0.012500', '0.038889\n4', '0.702083', '0.192130', '0.013542', '0.036111'],.......}
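A note on the output above: the class values are not dropped. They are fused onto the last value of the previous row (for example '0.473148\n5'), because split(" ") splits on single spaces only and leaves the line breaks inside the tokens. A minimal sketch of the difference, using an in-memory string in place of the file:

```python
# Two rows with a line break between them, as in the annotation file.
text = (
    "0 0.288281 0.618056 0.080729 0.473148\n"
    "5 0.229427 0.604167 0.030729 0.039815\n"
)

# split(" ") splits on single spaces only, so "\n" stays inside a token.
by_space = text.split(" ")

# split() with no argument splits on any run of whitespace, newlines included.
by_whitespace = text.split()

# Splitting into lines first keeps the row structure.
rows = [line.split() for line in text.splitlines()]

print(repr(by_space[4]))    # the fused token
print(by_whitespace[4:6])   # the same two values, now separate
print(rows[1][0])           # class value of the second row
```

Splitting line by line is what lets each row become its own dictionary later on.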

You could actually do this by reading the file as a CSV: it is a space-separated-value file, and Python offers a very good CSV-parsing module (csv).

I'm defining the field names and the delimiter outside the function as a format definition, since they are static.

As you can see, you can combine a list comprehension and a dict comprehension to get the desired result in just a couple of lines, without any intermediate variables.

Then, to process just your '.txt' files, you could use globbing. With Python's pathlib, Path().glob() returns Path objects, which have two advantages:

  • An open() method (equivalent to open(filename))
  • A stem attribute, which gives you the file name without its extension
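As a small illustration of these two points, here is a sketch that runs against a temporary folder instead of the real annotation directory. The file names img_ano.txt and notes.md are made up for the demonstration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Hypothetical sample files, created only for this demonstration.
    (folder / "img_ano.txt").write_text("0 0.288281 0.618056 0.080729 0.473148\n")
    (folder / "notes.md").write_text("not an annotation file\n")

    found = {}
    for path in folder.glob("*.txt"):   # only names matching the pattern
        with path.open() as f:          # same as open(path)
            # path.stem is the file name without its suffix.
            found[path.stem] = f.readline().split()

print(found)
```

Only the .txt file is picked up, and its key is 'img_ano' rather than 'img_ano.txt'.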

Finally, you can use csv's DictReader class to directly return dictionaries in the form you want. Just specify fieldnames (which will be your dict's keys) and ' ' (a space) as the delimiter, so the csv module knows how to read the file.

For convenience, I've put it into a function you can call with any path and glob pattern you deem necessary.
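To see what DictReader yields on its own, here is a minimal sketch that reads two rows from an in-memory string rather than a file. Note that every value comes back as a string, so the conversion to int and float at the end is an extra step you would add if you need numbers. The csv module also treats each space as a separator, so a doubled space would produce an empty field.

```python
import csv
import io

FIELDNAMES = ["class", "x", "y", "height", "width"]

# In-memory stand-in for one annotation file.
sample = io.StringIO(
    "0 0.288281 0.618056 0.080729 0.473148\n"
    "5 0.229427 0.604167 0.030729 0.039815\n"
)

reader = csv.DictReader(sample, fieldnames=FIELDNAMES, delimiter=" ")
rows = [dict(row) for row in reader]
print(rows[0])   # all values are still strings here

# Optional: convert "class" to int and the remaining fields to float.
typed = [
    {"class": int(r["class"]), **{k: float(r[k]) for k in FIELDNAMES[1:]}}
    for r in rows
]
print(typed[1])
```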

import csv
from pathlib import Path
CSVFMT = dict(fieldnames=['class', 'x', 'y', 'height', 'width'], delimiter=' ')


def process_path(path, pattern):
    return {
        fop.stem: [dict(a) for a in csv.DictReader(fop.open(), **CSVFMT)]
        for fop in Path(path).glob(pattern)
    }


process_path('C:/Users/Lenovo/annotation/', '*.txt')

Say you only have the file img_ano.txt, with the following contents, in the folder C:/Users/Lenovo/annotation/:

0 0.288281 0.618056 0.080729 0.473148
5 0.229427 0.604167 0.030729 0.039815
0 0.554427 0.024537 0.020313 0.041667
0 0.547135 0.018981 0.020313 0.034259

You could create a dictionary my_dict with your desired structure using a for loop, collections.defaultdict, str.strip, str.split, and pathlib.PurePath.stem:

import json
import pathlib
from collections import defaultdict

my_dict = defaultdict(list)
for txt_file_path in pathlib.Path("C:/Users/Lenovo/annotation/").glob("*.txt"):
    with open(txt_file_path, "r") as f:
        for line in f:
            class_val, x_val, y_val, height_val, width_val = line.strip().split()
            my_dict[txt_file_path.stem].append({
                "class": int(class_val),
                "x": float(x_val),
                "y": float(y_val),
                "height": float(height_val),
                "width": float(width_val)
            })

print(json.dumps(my_dict, indent=4))

Output:

{
    "img_ano": [
        {
            "class": 0,
            "x": 0.288281,
            "y": 0.618056,
            "height": 0.080729,
            "width": 0.473148
        },
        {
            "class": 5,
            "x": 0.229427,
            "y": 0.604167,
            "height": 0.030729,
            "width": 0.039815
        },
        {
            "class": 0,
            "x": 0.554427,
            "y": 0.024537,
            "height": 0.020313,
            "width": 0.041667
        },
        {
            "class": 0,
            "x": 0.547135,
            "y": 0.018981,
            "height": 0.020313,
            "width": 0.034259
        }
    ]
}
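One thing to watch for with the five-way unpacking: a blank line (for instance a trailing empty line at the end of a file) splits into an empty list, and the unpacking then raises a ValueError. A possible way to guard against that is sketched below. parse_line is a hypothetical helper name, not part of the code above.

```python
def parse_line(line):
    """Return one annotation dict, or None for a blank line (hypothetical helper)."""
    parts = line.split()
    if not parts:
        return None  # blank or whitespace-only line: nothing to parse
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_val, x_val, y_val, height_val, width_val = parts
    return {
        "class": int(class_val),
        "x": float(x_val),
        "y": float(y_val),
        "height": float(height_val),
        "width": float(width_val),
    }


# Sample lines, including a blank one in the middle.
lines = [
    "0 0.288281 0.618056 0.080729 0.473148\n",
    "\n",
    "5 0.229427 0.604167 0.030729 0.039815\n",
]
records = [rec for rec in map(parse_line, lines) if rec is not None]
print(records)
```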

Someone answered and solved my question correctly, but the answer was deleted for some reason. So here is the code from that solution (my only change is a loop that adds files from a list of text files):

    import os
    import json
    from collections import defaultdict

    folder = 'C:/Users/Lenovo/annotation/'
    list_of_files = os.listdir(folder)
    count = 0

    my_dict = defaultdict(list)
    for file in list_of_files:
        if count < 20:
            # os.listdir returns bare file names, so join them with the folder.
            with open(os.path.join(folder, file)) as f:
                for line in f:
                    class_val, x_val, y_val, height_val, width_val = line.strip().split()
                    my_dict[file].append({"class": class_val, "x": x_val, "y": y_val,
                                          "height": height_val, "width": width_val})
        else:
            break
        count = count + 1
    print(json.dumps(my_dict, indent=4))
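The manual counter that stops after 20 files could also be expressed with itertools.islice. The sketch below is one possible variant, not the code from the deleted answer: first_n_txt_files is a hypothetical helper, and it runs against a temporary folder with made-up file names. Sorting is included because os.listdir and glob return entries in arbitrary order, so "the first 20" is otherwise not well defined.

```python
import tempfile
from itertools import islice
from pathlib import Path


def first_n_txt_files(folder, n):
    """Return up to n .txt paths from folder, sorted by name (hypothetical helper)."""
    return list(islice(sorted(Path(folder).glob("*.txt")), n))


with tempfile.TemporaryDirectory() as tmp:
    # Five made-up annotation files, created only for this demonstration.
    for i in range(5):
        (Path(tmp) / f"img_{i}.txt").write_text("0 0.5 0.5 0.1 0.1\n")
    names = [p.name for p in first_n_txt_files(tmp, 3)]

print(names)
```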

    
# Assumes img.txt is in the current working directory.
with open('img.txt') as f:
    lines = f.read().splitlines()

fields = ['class', 'x', 'y', 'height', 'width']
dictt = {'img.txt': []}
for line in lines:
    # Pair each field name with its value, e.g. 'class:0'.
    dictt['img.txt'] += [f'{name}:{value}' for name, value in zip(fields, line.split())]

print(dictt)

>>> {'img.txt': ['class:0', 'x:0.288281', 'y:0.618056', 'height:0.080729', 'width:0.473148', 'class:5', 'x:0.229427', 'y:0.604167', 'height:0.030729', 'width:0.039815', 'class:0', 'x:0.554427', 'y:0.024537', 'height:0.020313', 'width:0.041667', 'class:0', 'x:0.547135', 'y:0.018981', 'height:0.020313', 'width:0.034259']}
 
