
Reading a text file into a matrix in Python

I have the contents of a text file in a Python string variable, containing:

A 
-1 2 -3
4 5 6

B 
4 5 6
3 23 5

How do I create NumPy matrices from this text in a compact way? I have solved it, but my solution is long and ugly:

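import numpy as np

idx_A = read_data.find('A')
idx_B = read_data.find('B')
# Lines from the 'A' label up to the 'B' label, each split into string tokens
matrix = [item.split() for item in read_data[idx_A:(idx_B+1)].split('\n')[:-1]]
A = np.array(list(map(float, matrix[1])))
for i in range(2, len(matrix)-1):
    A = np.vstack([A, list(map(float, matrix[i]))])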

And so forth for B.

AFAIK, Python doesn't have a standard human-readable, flat-file serialisation format for multiple named arrays. In the future, consider NumPy's npz format and the savez function, which store several named arrays in a single file. Note that npz is a binary format, so you give up human-readability; pickle is another option with the same trade-off.
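As a rough sketch of the savez route: the example below assumes two arrays named A and B and a temporary file path invented purely for illustration. The keyword names passed to np.savez become the keys used to look the arrays up after np.load.

```python
import os
import tempfile

import numpy as np

A = np.array([[-1.0, 2.0, -3.0], [4.0, 5.0, 6.0]])
B = np.array([[4.0, 5.0, 6.0], [3.0, 23.0, 5.0]])

# Hypothetical file location, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "matrices.npz")

# Each keyword name becomes the key of one stored array.
np.savez(path, A=A, B=B)

# np.load returns an archive object; copy the arrays out before closing it.
with np.load(path) as archive:
    loaded = {name: archive[name] for name in archive.files}
```

After this runs, loaded["A"] and loaded["B"] hold the same values as the original arrays.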

So to recover the data in the format you have, you'll have to do a bit of manual file reading. This is what I came up with for a first pass attempt, which I don't think is too messy:

from io import StringIO
import numpy as np

stateName, stateData = range(2)

state = stateName

allData = {}

with open('data') as fp:
    for line in fp:
        #print(line.strip())
        if state == stateName:
            currentName = line.strip()
            currentData = ""
            state = stateData
        else: # stateData
            if line.strip(): # there is some data on this line
                currentData += line
            else: # no data, so process what we have
                dataAsFile = StringIO(currentData)
                allData[currentName] = np.loadtxt(dataAsFile)
                state = stateName

# Process the last variable (the file may not end with a blank line)
dataAsFile = StringIO(currentData)
allData[currentName] = np.loadtxt(dataAsFile)

Running it with the data from your question in a file called 'data', I get this:

>>> allData
{'B': array([[  4.,   5.,   6.],
       [  3.,  23.,   5.]]), 'A': array([[-1.,  2., -3.],
       [ 4.,  5.,  6.]])}
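A more compact variant of the same idea, offered only as a sketch: it assumes the whole file fits in memory as one string, that blocks are separated by blank lines, and that the first line of each block is the variable name. The helper name parse_named_matrices is made up for this example.

```python
import re
from io import StringIO

import numpy as np

text = """A
-1 2 -3
4 5 6

B
4 5 6
3 23 5
"""


def parse_named_matrices(text):
    """Return a dict mapping each block's name to a 2-D float array."""
    matrices = {}
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        # First line is the name, everything after it is numeric data.
        name, _, body = block.partition("\n")
        matrices[name.strip()] = np.loadtxt(StringIO(body), ndmin=2)
    return matrices


allData = parse_named_matrices(text)
```

Passing ndmin=2 keeps a single-row block two-dimensional, so every entry has the same shape convention.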

Use NumPy's genfromtxt to read delimited text files into arrays:

import numpy as np
csv = np.genfromtxt('file.csv')

Note from the numpy.genfromtxt documentation that delimiter defaults to None, which means any run of whitespace separates the values.

One way to parse the specific format you have is to use these optional parameters:

skip_header : int, optional
    The number of lines to skip at the beginning of the file.

skip_footer : int, optional
    The number of lines to skip at the end of the file.
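A minimal sketch of this approach for the data in the question. It assumes the layout is known in advance (one name line, two data rows, a blank line, then the next block), and it uses skip_header together with max_rows, another genfromtxt parameter not listed above, instead of skip_footer. The helper name read_block is invented for this example.

```python
from io import StringIO

import numpy as np

text = """A
-1 2 -3
4 5 6

B
4 5 6
3 23 5
"""


def read_block(text, skip_header, max_rows):
    """Read max_rows data rows after skipping skip_header raw lines."""
    return np.genfromtxt(StringIO(text), skip_header=skip_header, max_rows=max_rows)


# Skip the "A" label, then read two rows.
A = read_block(text, skip_header=1, max_rows=2)
# Skip "A", its two rows, the blank line and the "B" label, then read two rows.
B = read_block(text, skip_header=5, max_rows=2)
```

This only works when the line offsets are fixed; for files with a varying number of rows per block, splitting on blank lines is likely the more robust choice.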
