
Reading in data from a text file and storing it in an array in Python

I'm trying to read data from a text file line by line and store it in a 2D array so that I can process it further at a later stage.

Every time the string 'EOE' is found I would like to move over to a new row and continue reading in entries line by line from the text file.

I can't seem to declare a 2D string array or read in the values successfully. I'm new to Python, coming from C, so my syntax and general Python understanding aren't great.

rf = open('data_small.txt', 'r')
lines = rf.readlines()
rf.close()
i = 0
j = 0

line_array = np.array((200, 200))

for line in lines:
    line=line.strip()
    print(line)
    line_array[i][j] = line
    if line == 'EOE':
        i+=1
    j+=1

rf.close()

line_array

The text file looks something like this:

-----
Entry1=50
Entry2=SomeText
Entry3=Instance.Test.ID=67
EOE
-----
Entry1=Processing
Entry2=50.87.78
Entry3=Instance.Test.ID=91
EOE
-----
Entry1=50
Entry2=SomeText
Entry3=Instance.Test.ID=67
EOE
-----

I would like the string array to look something like this. The rows and columns can be transposed, but the overall idea is that each row (or each column) represents one EOE entry:

array = [
['-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67', 'EOE'],
['-----', 'Entry1=Processing', 'Entry2=50.87.78', 'Entry3=Instance.Test.ID=91', 'EOE'],
['-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67', 'EOE']
]

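This is one approach: start a new sub-list every time 'EOE' is read. Note that the 'EOE' lines themselves are not stored.

Ex: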

res = [[]]
with open('data_small.txt') as infile:
    for line in infile:            # iterate over each line
        line = line.strip()        # strip the trailing newline
        if line == 'EOE':          # 'EOE' closes the current entry
            res.append([])         # start a new sub-list
        else:
            res[-1].append(line)   # append to the current sub-list

print(res)

Output:

[['-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67'],
 ['-----',
  'Entry1=Processing',
  'Entry2=50.87.78',
  'Entry3=Instance.Test.ID=91'],
 ['-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67'],
 ['-----']]
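If the 'EOE' marker should stay in each row, as in the question's desired output, the loop above can be adapted. This is a minimal sketch, not the answerer's code: the helper name `group_entries` and its `marker` parameter are made up for illustration, and skipping blank lines is an assumption about the input.

```python
def group_entries(lines, marker='EOE'):
    """Group stripped lines into rows, closing a row at each marker line."""
    rows = []
    current = []
    for line in lines:
        line = line.strip()
        if not line:
            continue              # assumption: blank lines carry no data
        current.append(line)      # the marker itself is kept in the row
        if line == marker:
            rows.append(current)  # marker reached: the row is complete
            current = []
    if current:
        rows.append(current)      # lines after the last marker, if any
    return rows


# Sample data copied from the question, one list item per file line.
sample = [
    '-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67', 'EOE',
    '-----', 'Entry1=Processing', 'Entry2=50.87.78', 'Entry3=Instance.Test.ID=91', 'EOE',
    '-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67', 'EOE',
    '-----',
]
rows = group_entries(sample)
```

Because a file object yields its lines when iterated, `group_entries(infile)` should work the same way inside a `with open(...)` block. The trailing '-----' still ends up in a final row of its own; drop it with `rows[:-1]` if it is only a separator.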

Here is a "pythonic" approach:

>>> with open('data_small.txt') as input_file:
...     contents = input_file.read()
...

>>> contents

'-----\nEntry1=50\nEntry2=SomeText\nEntry3=Instance.Test.ID=67\nEOE\n-----\nEntry1=Processing\nEntry2=50.87.78\nEntry3=Instance.Test.ID=91\nEOE\n-----\nEntry1=50\nEntry2=SomeText\nEntry3=Instance.Test.ID=67\nEOE\n-----'

The first step is to split on '\nEOE\n':

>>> contents = contents.split('\nEOE\n')
>>> contents

['-----\nEntry1=50\nEntry2=SomeText\nEntry3=Instance.Test.ID=67',
 '-----\nEntry1=Processing\nEntry2=50.87.78\nEntry3=Instance.Test.ID=91',
 '-----\nEntry1=50\nEntry2=SomeText\nEntry3=Instance.Test.ID=67',
 '-----']

Next, split each element of the list on '\n':

>>> contents = [content.split('\n') for content in contents]
>>> contents

[['-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67'],
 ['-----',
  'Entry1=Processing',
  'Entry2=50.87.78',
  'Entry3=Instance.Test.ID=91'],
 ['-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67'],
 ['-----']]

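This gives you the desired structure, except that the 'EOE' markers are gone, because split() removes the separator it splits on. The last element is only the trailing '-----' line; if you don't want it, just do: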

>>> contents = contents[:-1]
>>> contents

[['-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67'],
 ['-----',
  'Entry1=Processing',
  'Entry2=50.87.78',
  'Entry3=Instance.Test.ID=91'],
 ['-----', 'Entry1=50', 'Entry2=SomeText', 'Entry3=Instance.Test.ID=67']]
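If the rows should end with 'EOE', as in the question's desired output, the marker can be appended again after splitting. This is a rough sketch under one assumption: whatever follows the last 'EOE' is only a trailing separator and can be discarded. The names `chunks` and `rows` are illustrative.

```python
# Same text as the `contents` string read from data_small.txt above.
contents = (
    '-----\nEntry1=50\nEntry2=SomeText\nEntry3=Instance.Test.ID=67\nEOE\n'
    '-----\nEntry1=Processing\nEntry2=50.87.78\nEntry3=Instance.Test.ID=91\nEOE\n'
    '-----\nEntry1=50\nEntry2=SomeText\nEntry3=Instance.Test.ID=67\nEOE\n'
    '-----'
)

chunks = contents.split('\nEOE\n')

# chunks[:-1] drops the text after the final 'EOE' (assumed to be a separator);
# each remaining chunk gets its 'EOE' marker back as the last column.
rows = [chunk.split('\n') + ['EOE'] for chunk in chunks[:-1]]
```

Each of the three rows then has five items, matching the layout shown in the question.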

PS: Use the with statement only to open and read the file, then do your processing outside the with block, so the file is closed as soon as it has been read.
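To illustrate the point about the with statement, here is a small self-contained sketch. It writes a shortened sample to a temporary file so that it runs on its own; the temporary file and the names `sample_text`, `path` and `entries` are assumptions for the demo, not part of the original question.

```python
import os
import tempfile

sample_text = '-----\nEntry1=50\nEOE\n-----\nEntry1=Processing\nEOE\n-----'

# Write a small sample file so the sketch does not depend on data_small.txt.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(sample_text)
    path = tmp.name

# Keep the with block short: only read the file inside it.
with open(path) as input_file:
    contents = input_file.read()

# The file is closed at this point; processing works on the in-memory string.
entries = [chunk.split('\n') for chunk in contents.split('\nEOE\n')]

os.remove(path)  # clean up the temporary file
```

After the with block, `input_file.closed` is True, and `entries` is built without holding the file open.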
