
Reading a text file into a list of dictionaries

I have a sample.txt file:

1 7 14
3 3 10
1 1 3

Here is what I wrote. It is wrong, but it is a start (I had better attempts, but I could not reconstruct them):

content = []
with open ("sample.txt","r",encoding="utf-8") as file:
    for line in file:
        line = file.readline().strip().split(" ")
        mydict = {"A":int(file.readline().strip().split(" ")[0]),
                  "B":int(file.readline().strip().split(" ")[1]),
                  "C":int(file.readline().strip().split(" ")[2])}
        content.append(mydict)

print (content)

Expected output:

[ {"A":1,"B":7,"C":14},{"A":3,"B":3,"C":10},{"A":1,"B":1,"C":3} ]

Sadly, none of my attempts worked. I typically ran into three problems:

  • Converting the values with int() fails
  • The first or last line is missing (an indexing problem)
  • For the values, simply writing line[0] does not work (why?)

I am a complete beginner, so maybe this will also help someone else who is trying to do the same thing.

You are very close. The line variable already holds the current line, so use it instead of calling file.readline():

content = []
with open ("sample.txt","r",encoding="utf-8") as file:
    for line in file:
        line = line.strip().split(" ")
        mydict = {"A":int(line[0]),
                  "B":int(line[1]),
                  "C":int(line[2])}
        content.append(mydict)

print (content)

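There is a built-in csv module that takes care of the splitting for you. Note that csv.DictReader returns every value as a string, so the values still need to be converted with int():

import csv

# The file has no header row, so the field names are passed explicitly.
with open("sample.txt", newline="", encoding="utf-8") as file:
    reader = csv.DictReader(file, fieldnames=["A", "B", "C"], delimiter=" ")
    result = [{key: int(value) for key, value in row.items()} for row in reader]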

The mistake is in this line:

line = file.readline().strip().split(" ")

That line sits inside a for loop that already iterates over the file one line at a time. Check the output of the following:

with open("sample.txt") as file:
    for line in file:
        print(line)

This prints each line of the file: 1 7 14 on the first iteration, 3 3 10 on the second, and 1 1 3 on the third.

So the loop already hands you each line, but inside the loop you then read the next line as well. Try this:

with open("sample.txt") as file:
    count = 0
    for line in file:
        print(count, "This is current line", line)
        print(count, "This is the next line",file.readline())
        count += 1 # count = count + 1

This outputs (blank lines omitted):

0 This is current line 1 7 14
0 This is the next line 3 3 10
1 This is current line 1 1 3
1 This is the next line 

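So this loop goes through two lines at a time. Your code consumes lines even faster, because every iteration calls readline() four times. Once the file is exhausted, readline() returns an empty string, and indexing the result of splitting it raises an IndexError, which is the correct behaviour.

Instead, remove all the readline() calls and process one line at a time. This is almost the same as @quamrana's snippet.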
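The IndexError itself comes from how readline() behaves at the end of a file: it returns an empty string, and splitting an empty string on " " gives a list with a single empty element. Here is a minimal sketch of that, using io.StringIO as a stand-in for the opened sample.txt (an assumption made only so the snippet runs without a file on disk):

```python
import io

# io.StringIO stands in for the opened sample.txt (assumption for illustration).
file = io.StringIO("1 7 14\n3 3 10\n1 1 3\n")

for line in file:
    pass  # consume every line, as the loop plus the extra readline() calls do

leftover = file.readline()           # at end of file this is an empty string
parts = leftover.strip().split(" ")  # splitting "" on " " gives ['']

print(repr(leftover), parts)
try:
    parts[1]
except IndexError as error:
    print("IndexError:", error)
```

Because parts has only one element, parts[1] and parts[2] cannot exist, and int(parts[0]) would fail too, since int("") raises a ValueError.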

content = []
with open ("sample.txt","r",encoding="utf-8") as file:
    for line in file: # Only 1 line at a time
        line = line.strip().split(" ") # .strip() removes whitespace, including the newline, from the beginning and end
        mydict = {"A":int(line[0]),
                  "B":int(line[1]),
                  "C":int(line[2])}
        content.append(mydict)
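A slightly more compact variant of the same idea, as a sketch only: read_records and its keys parameter are hypothetical names introduced here, and the demo writes its own temporary copy of the sample data (with an extra blank line) so that it can run anywhere. Calling split() with no argument splits on any run of whitespace, so repeated spaces and the trailing newline need no special handling.

```python
import os
import tempfile

def read_records(path, keys=("A", "B", "C")):
    """Return one dict per non-empty line, mapping keys to integer values."""
    records = []
    with open(path, "r", encoding="utf-8") as file:
        for line in file:
            values = line.split()  # no argument: split on any whitespace run
            if not values:
                continue  # skip blank lines
            records.append(dict(zip(keys, map(int, values))))
    return records

# Demo on a temporary copy of the sample data, with one blank line added
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")
    with open(path, "w", encoding="utf-8") as file:
        file.write("1 7 14\n3 3 10\n\n1 1 3\n")
    content = read_records(path)

print(content)
```

Note that zip() stops at the shorter of its inputs, so a line with fewer than three values would silently produce a dictionary with missing keys. Add a length check if that matters for your data.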

If you are already using pandas:

import pandas as pd

df = pd.read_csv("sample.txt",sep=' ', header=None)
df.columns = ['A','B','C']
print(df.to_dict('records'))

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 