Reading from a text file and making it into a dictionary

I have a sample.txt file:

1 7 14
3 3 10
1 1 3

What I wrote (it is wrong, but at least it is something; I had better attempts, but I couldn't recreate them):

content = []
with open ("sample.txt","r",encoding="utf-8") as file:
    for line in file:
        line = file.readline().strip().split(" ")
        mydict = {"A":int(file.readline().strip().split(" ")[0]),
                  "B":int(file.readline().strip().split(" ")[1]),
                  "C":int(file.readline().strip().split(" ")[2])}
        content.append(mydict)

print (content)

Expected output:

[ {"A":1,"B":7,"C":14},{"A":3,"B":3,"C":10},{"A":1,"B":1,"C":3} ]

Sadly, none of them worked. There were typically three problems:

  • When I try to convert the values with int()
  • The first or last line is missing (a problem with indexing)
  • For the values I can't simply write line[0], because that way it doesn't work (why?)

I am a real newbie, so maybe this will also help someone else who is doing the same thing as me.

You are very close. The line variable is what you need:

content = []
with open ("sample.txt","r",encoding="utf-8") as file:
    for line in file:
        line = line.strip().split(" ")
        mydict = {"A":int(line[0]),
                  "B":int(line[1]),
                  "C":int(line[2])}
        content.append(mydict)

print (content)
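The same loop can also be written more compactly. The following is only a sketch: it assumes every row has exactly three integer columns, and it uses io.StringIO as a stand-in for sample.txt so that the snippet runs on its own (the names keys and sample are illustrative, not from the question).

```python
import io

keys = ("A", "B", "C")

# Stand-in for open("sample.txt"); a real file object iterates the same way
sample = io.StringIO("1 7 14\n3 3 10\n1 1 3\n")

# zip pairs each key with the matching converted value on the line
content = [dict(zip(keys, map(int, line.split()))) for line in sample]
print(content)
```

Using split() without an argument also drops the trailing newline, so no separate strip() call is needed here.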

There is a built-in csv reader that takes care of this for you very easily:

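import csv

# fieldnames supplies the keys, since the file has no header row
with open("sample.txt", newline="") as f:
    result = list(csv.DictReader(f, fieldnames=["A", "B", "C"], delimiter=" "))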
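Note that csv.DictReader leaves every value as a string, so the result would not yet match the expected output of integers. A minimal sketch of one way to convert them, assuming all columns hold integers and again using io.StringIO in place of the real file:

```python
import csv
import io

# Stand-in for the opened file; the data is the sample from the question
sample = io.StringIO("1 7 14\n3 3 10\n1 1 3\n")

reader = csv.DictReader(sample, fieldnames=["A", "B", "C"], delimiter=" ")

# Each row maps column name to a string, so convert every value explicitly
result = [{key: int(value) for key, value in row.items()} for row in reader]
print(result)
```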

The thing you are doing wrong is on this line:

line = file.readline().strip().split(" ")

Here you are inside a for loop that already goes through each line. Check the output of the following:

with open("sample.txt") as file:
    for line in file:
        print(line)

This prints each line of the file: 1 7 14 on the first iteration, 3 3 10 on the second, and 1 1 3 on the third.

Now, you are looping through the lines, but while doing that you are also reading the next line yourself. Just check this out:

with open("sample.txt") as file:
    count = 0
    for line in file:
        print(count, "This is current line", line)
        print(count, "This is the next line",file.readline())
        count += 1 # count = count + 1

This simply outputs:

0 This is current line 1 7 14
0 This is the next line 3 3 10
1 This is current line 1 1 3
1 This is the next line 

So you are going through two lines at a time, and once the file runs out readline() returns an empty string, which is why you get an index error.
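To make the failure concrete, here is a small sketch of what happens when readline() is called after the last line has been consumed. It uses io.StringIO with a single line as a stand-in for the file; the variable names are illustrative only.

```python
import io

file = io.StringIO("1 7 14\n")  # stand-in for a file with a single line
first = file.readline()         # consumes the only line
leftover = file.readline()      # nothing left: returns "" instead of raising

parts = leftover.strip().split(" ")
print(repr(leftover), parts)

errors = []
for index in (0, 1):
    try:
        int(parts[index])
    except (ValueError, IndexError) as exc:
        # index 0 exists but is an empty string; index 1 does not exist
        errors.append(type(exc).__name__)
print(errors)
```

Splitting an empty string on " " gives a list with one empty string, so int() fails on index 0 and any higher index is out of range.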

Instead, just remove all the readline calls and read one line at a time. This is almost the same as @quamrana's snippet.

content = []
with open ("sample.txt","r",encoding="utf-8") as file:
    for line in file: # Only 1 line at a time
        line = line.strip().split(" ") # .strip() removes all whitespace, including newlines, at the beginning and end
        mydict = {"A":int(line[0]),
                  "B":int(line[1]),
                  "C":int(line[2])}
        content.append(mydict)
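One caveat with split(" "): it splits on every single space, so a line with two consecutive spaces produces an empty string that int() cannot convert. The sketch below compares it with split() called without an argument, using a made-up line that is not in the sample file.

```python
line = "1  7 14\n"  # hypothetical input: note the two spaces after the first value

by_single_space = line.strip().split(" ")
by_whitespace = line.split()

print(by_single_space)  # ['1', '', '7', '14']
print(by_whitespace)    # ['1', '7', '14']
```

If the input file might contain irregular spacing, split() without an argument is probably the safer choice.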

In case you are already using pandas:

import pandas as pd

df = pd.read_csv("sample.txt",sep=' ', header=None)
df.columns = ['A','B','C']
print(df.to_dict('records'))
