
Read contents of a text file into a Dictionary in Python

I have a text file (img.txt) whose data looks like this:

0 0.288281 0.618056 0.080729 0.473148
5 0.229427 0.604167 0.030729 0.039815
0 0.554427 0.024537 0.020313 0.041667
0 0.547135 0.018981 0.020313 0.034259

So I wanted to create a dictionary with the .txt file name as the key and all the rows as values, somewhat like:

my_dict = {'img.txt': [{'class': 0, 'x': 0.288281, 'y': 0.618056, 'height': 0.080729, 'width': 0.473148},
                       {'class': 5, 'x': 0.229427, 'y': 0.604167, 'height': 0.030729, 'width': 0.039815}]}

Is there a way to add keys for the values (like class, x, y, etc.)? Also, for some reason, while reading the file my code is ignoring the class values (like 0, 5, etc.). Here is my code:

import os
list_of_files = os.listdir('C:/Users/Lenovo/annotation/')
count =0
my_dict = {}
for file in list_of_files:
    if count < 20:
        with open(file) as f:
            items = [i.strip() for i in f.read().split(" ")]
            my_dict[file.replace(".txt", " ")] = items
    else:
        break
    count = count+1
print(my_dict)

Here is my output:

{'img_ano (1) ': ['0', '0.288281', '0.618056', '0.080729', '0.473148\n5', '0.229427', '0.604167', '0.030729', '0.039815\n0', '0.554427', '0.024537', '0.020313', '0.041667\n0', '0.547135', '0.018981', '0.020313', '0.034259\n4', '0.533073', '0.488889', '0.022396', '0.077778\n4', '0.630469', '0.375926', '0.017188', '0.075926\n4', '0.132031', '0.431944', '0.019271', '0.065741\n4', '0.802083', '0.191204', '0.013542', '0.037963\n4', '0.823958', '0.175000', '0.012500', '0.038889\n4', '0.702083', '0.192130', '0.013542', '0.036111'],.......}
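
A small sketch of why the class values in the output above look as if they are ignored: splitting only on a single space leaves the newline in place, so the last number of one row and the class of the next row end up in one item (for example '0.473148\n5'). The sample text and variable names below are illustrative, not taken from the actual files.

```python
# Two rows in the same layout as img.txt (illustrative sample).
text = (
    "0 0.288281 0.618056 0.080729 0.473148\n"
    "5 0.229427 0.604167 0.030729 0.039815\n"
)

# Splitting on a single space keeps the newline inside an item,
# fusing the end of row 1 with the class value of row 2.
by_space = text.split(" ")
print(repr(by_space[4]))

# Splitting into lines first, then on whitespace, keeps each row separate.
rows = [line.split() for line in text.splitlines()]
print(rows[1][0])  # class value of the second row
```

So the class values are not lost; they are attached to the previous row's last value.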

You could actually do this by reading it as a CSV. It's a space-separated-value file, and Python offers a very good CSV-parsing module (csv).

I'm setting the field names and the delimiter outside the function as a format definition, which will be static.

You can combine a list comprehension and a dict comprehension to get your desired result in just a couple of lines, without any intermediate variable.

Then, to process just your '.txt' files, you could use globbing. With Python's pathlib, Path().glob() returns Path objects, which have two advantages:

  • An open() method (equivalent to open(filename))
  • A stem attribute, which gives you the file name without its extension
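
The two points above can be tried out with a minimal sketch. It uses a throwaway temporary folder and a hypothetical file name (img_ano.txt) so that it runs anywhere; swap in your own folder as needed.

```python
import tempfile
from pathlib import Path

# Create a throwaway folder with one annotation-style file (hypothetical name).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "img_ano.txt"
    sample.write_text("0 0.288281 0.618056 0.080729 0.473148\n")

    found = sorted(Path(tmp).glob("*.txt"))  # only the .txt files
    stems = [p.stem for p in found]          # file names without ".txt"
    with found[0].open() as f:               # same as open(found[0])
        first_line = f.readline().strip()

print(stems)
print(first_line)
```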

Finally, you can use csv's DictReader class to directly return dictionaries in the form you want. Just specify fieldnames (which will be your dict's keys) and ' ' (a space) as the delimiter; that way the csv module will know how to read the file.
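
As a rough illustration of DictReader with explicit fieldnames and a space delimiter, here is a sketch that reads from an in-memory io.StringIO standing in for one file (an assumption made so the example is self-contained). The int/float conversion step is an optional addition, since DictReader returns every value as a string.

```python
import csv
import io

# Illustrative in-memory stand-in for one annotation file.
sample = io.StringIO(
    "0 0.288281 0.618056 0.080729 0.473148\n"
    "5 0.229427 0.604167 0.030729 0.039815\n"
)

fieldnames = ["class", "x", "y", "height", "width"]
reader = csv.DictReader(sample, fieldnames=fieldnames, delimiter=" ")
rows = [dict(row) for row in reader]

# Optional: convert the string values to numbers.
typed = [
    {"class": int(r["class"]), **{k: float(r[k]) for k in fieldnames[1:]}}
    for r in rows
]

print(rows[0])
print(typed[1])
```

Note that with a single-space delimiter, rows containing repeated or trailing spaces would produce empty fields, so this assumes the files are cleanly formatted.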

For convenience, I've put it into a function you can call with any path and glob pattern you deem necessary.

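import csv
from pathlib import Path

# Static format definition: column names and the space delimiter.
CSVFMT = dict(fieldnames=['class', 'x', 'y', 'height', 'width'], delimiter=' ')


def process_path(path, pattern):
    # Map each file's stem to a list with one dict per row.
    # Note: DictReader yields every value as a string.
    return {
        fop.stem: [dict(a) for a in csv.DictReader(fop.open(), **CSVFMT)]
        for fop in Path(path).glob(pattern)
    }


process_path('C:/Users/Lenovo/annotation/', '*.txt')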

Say you only have the file img_ano.txt with the following contents in the folder C:/Users/Lenovo/annotation/:

0 0.288281 0.618056 0.080729 0.473148
5 0.229427 0.604167 0.030729 0.039815
0 0.554427 0.024537 0.020313 0.041667
0 0.547135 0.018981 0.020313 0.034259

You could create a dictionary my_dict with your desired structure using a for loop, collections.defaultdict, str.strip, str.split, and pathlib.PurePath.stem:

import json
import pathlib
from collections import defaultdict

my_dict = defaultdict(list)
for txt_file_path in pathlib.Path("C:/Users/Lenovo/annotation/").glob("*.txt"):
    with open(txt_file_path, "r") as f:
        for line in f:
            class_val, x_val, y_val, height_val, width_val = line.strip().split()
            my_dict[txt_file_path.stem].append({
                "class": int(class_val),
                "x": float(x_val),
                "y": float(y_val),
                "height": float(height_val),
                "width": float(width_val)
            })

print(json.dumps(my_dict, indent=4))

Output:

{
    "img_ano": [
        {
            "class": 0,
            "x": 0.288281,
            "y": 0.618056,
            "height": 0.080729,
            "width": 0.473148
        },
        {
            "class": 5,
            "x": 0.229427,
            "y": 0.604167,
            "height": 0.030729,
            "width": 0.039815
        },
        {
            "class": 0,
            "x": 0.554427,
            "y": 0.024537,
            "height": 0.020313,
            "width": 0.041667
        },
        {
            "class": 0,
            "x": 0.547135,
            "y": 0.018981,
            "height": 0.020313,
            "width": 0.034259
        }
    ]
}

Someone answered and solved my question correctly, but the answer was deleted for some reason. Here is the code from that solution (I only modified it to run a loop that adds files from a list of text files):

    import os
    import json
    from collections import defaultdict

    folder = 'C:/Users/Lenovo/annotation/'
    list_of_files = os.listdir(folder)
    count = 0

    my_dict = defaultdict(list)
    for file in list_of_files:
        if count < 20:
            # os.listdir returns bare file names, so join them with the folder.
            with open(os.path.join(folder, file)) as f:
                for line in f:
                    class_val, x_val, y_val, height_val, width_val = line.strip().split()
                    my_dict[file].append({"class": class_val, "x": x_val, "y": y_val,
                                          "height": height_val, "width": width_val})
        else:
            break
        count = count + 1
    print(json.dumps(my_dict, indent=4))
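
The manual counter that stops after 20 files could also be written with itertools.islice. The following is only a sketch of that alternative: the helper name first_txt_files and the demo file names are hypothetical, and it sorts the paths first so that "the first 20" is well defined.

```python
import tempfile
from itertools import islice
from pathlib import Path


def first_txt_files(folder, limit=20):
    """Return at most `limit` .txt paths from `folder`, in sorted order."""
    return list(islice(sorted(Path(folder).glob("*.txt")), limit))


# Demo on a throwaway folder holding 25 empty files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    for i in range(25):
        (Path(tmp) / f"img_{i:02d}.txt").touch()
    picked = [p.name for p in first_txt_files(tmp)]

print(len(picked), picked[0], picked[-1])
```

Because the helper returns full Path objects, each one can be opened directly without joining it to the folder name.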

    
# Assumption: the text of img.txt is read into a string, then split into lines.
with open('img.txt') as f:
    lines = f.read().strip().split('\n')

dictt = {}
dictt['img.txt'] = []
for line in lines:
    parts = line.split(' ')
    dictt['img.txt'] = dictt['img.txt'] + ['class:' + parts[0], 'x:' + parts[1], 'y:' + parts[2], 'height:' + parts[3], 'width:' + parts[4]]

print(dictt)

>>> {'img.txt': ['class:0', 'x:0.288281', 'y:0.618056', 'height:0.080729', 'width:0.473148', 'class:5', 'x:0.229427', 'y:0.604167', 'height:0.030729', 'width:0.039815', 'class:0', 'x:0.554427', 'y:0.024537', 'height:0.020313', 'width:0.041667', 'class:0', 'x:0.547135', 'y:0.018981', 'height:0.020313', 'width:0.034259']}
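
The result above is a flat list of 'key:value' strings rather than one dictionary per row. If the nested structure from the question is wanted instead, one possible sketch pairs the field names with each row's values using zip. The names FIELDS and parse_line are hypothetical, and the values are left as strings.

```python
FIELDS = ["class", "x", "y", "height", "width"]  # hypothetical constant


def parse_line(line):
    """Turn one 'class x y height width' line into a dict of strings."""
    return dict(zip(FIELDS, line.split()))


lines = [
    "0 0.288281 0.618056 0.080729 0.473148",
    "5 0.229427 0.604167 0.030729 0.039815",
]
result = {"img.txt": [parse_line(line) for line in lines]}
print(result["img.txt"][1])
```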
 
