
In Python, how do I use a loop to append a row to a NumPy array without losing the previous rows?

Given a sample of data such as this

3,12.2,3.03,2.32,19,96,1.25,.49,.4,.73,5.5,.66,1.83,510
3,12.77,2.39,2.28,19.5,86,1.39,.51,.48,.64,9.899999,.57,1.63,470
3,14.16,2.51,2.48,20,91,1.68,.7,.44,1.24,9.7,.62,1.71,660
3,13.71,5.65,2.45,20.5,95,1.68,.61,.52,1.06,7.7,.64,1.74,740
3,13.4,3.91,2.48,23,102,1.8,.75,.43,1.41,7.3,.7,1.56,750
3,13.27,4.28,2.26,20,120,1.59,.69,.43,1.35,10.2,.59,1.56,835
3,13.17,2.59,2.37,20,120,1.65,.68,.53,1.46,9.3,.6,1.62,840
3,14.13,4.1,2.74,24.5,96,2.05,.76,.56,1.35,9.2,.61,1.6,560

and my code

import numpy as np
with open("wine.txt","r") as f:
    stuff=f.readlines()
#np.genfromtxt("wine.txt", delimiter=",")
z=np.empty((0,14),float)
for hello in stuff:
    firstbook=hello.strip().split(",")
    x=[float(i) for i in firstbook]
    y=np.array(x)
    b=np.append(b,y)
print b[1:2]

I'm having trouble building a NumPy array from the entire data set; I only get the last row of the set as the array. I want an array that gives me an entire column of elements when I print it, as in the last line of the code. Instead, I only get [14.13] when I reach the last line.

Why not use np.loadtxt with a comma as the delimiter? Its documentation says:

Load data from a text file. Each row in the text file must have the same number of values.

Your data meets that requirement:

import numpy as np

with open("wine.txt","r") as f:
    b = np.loadtxt(f, delimiter=',')
print(b[1:2])
# prints the second row as a 1x14 array: 3, 12.77, 2.39, 2.28, 19.5, 86, 1.39, .51, .48, .64, 9.899999, .57, 1.63, 470
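
To try this without a wine.txt file on disk, here is a minimal sketch that feeds two rows of the question's sample to np.loadtxt through io.StringIO. The StringIO stand-in and the variable name sample are assumptions made for illustration, not part of the original answer.

```python
import io
import numpy as np

# Two rows copied from the question's sample; io.StringIO stands in for wine.txt.
sample = io.StringIO(
    "3,12.2,3.03,2.32,19,96,1.25,.49,.4,.73,5.5,.66,1.83,510\n"
    "3,12.77,2.39,2.28,19.5,86,1.39,.51,.48,.64,9.899999,.57,1.63,470\n"
)

# loadtxt parses every line and returns one 2D float array.
b = np.loadtxt(sample, delimiter=",")

print(b.shape)        # (2, 14): one row per line, one column per value
print(b[1:2].shape)   # (1, 14): a slice of rows, here only the second row
print(b[:, 1])        # the second column across all rows
```

Note that b[1:2] slices rows, not columns; selecting a column needs an index on the second axis, as in b[:, 1].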

You can use np.vstack() to stack each new row onto the array:

import numpy as np

data = '''3,12.2,3.03,2.32,19,96,1.25,.49,.4,.73,5.5,.66,1.83,510
3,12.77,2.39,2.28,19.5,86,1.39,.51,.48,.64,9.899999,.57,1.63,470
3,14.16,2.51,2.48,20,91,1.68,.7,.44,1.24,9.7,.62,1.71,660
3,13.71,5.65,2.45,20.5,95,1.68,.61,.52,1.06,7.7,.64,1.74,740
3,13.4,3.91,2.48,23,102,1.8,.75,.43,1.41,7.3,.7,1.56,750
3,13.27,4.28,2.26,20,120,1.59,.69,.43,1.35,10.2,.59,1.56,835
3,13.17,2.59,2.37,20,120,1.65,.68,.53,1.46,9.3,.6,1.62,840
3,14.13,4.1,2.74,24.5,96,2.05,.76,.56,1.35,9.2,.61,1.6,560'''

stuff = data.split('\n')

z = np.empty((0,14), float)

for hello in stuff:
    firstbook = hello.strip().split(",")
    x = [float(i) for i in firstbook]

    z = np.vstack([z, x])

print(z[1:2])
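
For comparison, np.append can also build a 2D array, but only when it is given an axis; without one it flattens both arguments into a 1D array. The sketch below uses a small made-up data set (the rows list is hypothetical) to show the difference.

```python
import numpy as np

# Hypothetical 2x3 data set, standing in for the parsed lines of the file.
rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

flat = np.empty((0, 3), float)
stacked = np.empty((0, 3), float)

for row in rows:
    # No axis: both arguments are flattened, so the result is 1D.
    flat = np.append(flat, row)
    # axis=0 with the row wrapped in a list: the row is added as a new 2D row.
    stacked = np.append(stacked, [row], axis=0)

print(flat.shape)     # (6,)
print(stacked.shape)  # (2, 3)
print(flat[1:2])      # a single element, because flat is 1D
print(stacked[1:2])   # the whole second row
```

With a flattened array, a slice such as [1:2] returns one number rather than one row, which is one way to end up with a single value where a row or column was expected.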

It is better to accumulate line values in a list, and make an array once.

alist = []
for hello in stuff:
    firstbook=hello.strip().split(",")
    x=[float(i) for i in firstbook]
    alist.append(x)
b = np.array(alist)

Assuming x has the same number of terms for each line, alist will be a list of equal length lists. np.array turns that into a 2d array, just as it does in the prototypical array construction expression:

np.array([[1,2],[3,4]])

Repeated list append is much faster than repeated array stacks/appends.
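
The reason is that each np.vstack or np.append call allocates a new array and copies all existing rows into it, while list.append only adds a reference to the new row. As a small sketch, the two approaches produce the same array; the lines list here is made-up sample input, not the wine data.

```python
import numpy as np

# Hypothetical input lines with three values each.
lines = ["1,2,3", "4,5,6", "7,8,9"]

alist = []                       # list accumulation: converted to an array once
z = np.empty((0, 3), float)      # array accumulation: rebuilt on every iteration

for line in lines:
    x = [float(i) for i in line.strip().split(",")]
    alist.append(x)              # cheap: appends to the list in place
    z = np.vstack([z, x])        # costly: allocates a new array and copies z

b = np.array(alist)              # single conversion to a 2D array

print(b.shape)                   # (3, 3)
print(np.array_equal(b, z))      # True
```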

With your file sample (as a list of lines named txt), np.genfromtxt reads everything in one call:

In [1826]: data=np.genfromtxt(txt, dtype=float, delimiter=',')
In [1827]: data
Out[1827]: 
array([[  3.00000000e+00,   1.22000000e+01,   3.03000000e+00,
          2.32000000e+00,   1.90000000e+01,   9.60000000e+01,
          1.25000000e+00,   4.90000000e-01,   4.00000000e-01,
          7.30000000e-01,   5.50000000e+00,   6.60000000e-01,
          1.83000000e+00,   5.10000000e+02],
       [  3.00000000e+00,   1.27700000e+01,   2.39000000e+00,
          ...
          1.35000000e+00,   9.20000000e+00,   6.10000000e-01,
          1.60000000e+00,   5.60000000e+02]])
In [1828]: data.shape
Out[1828]: (8, 14)

2nd column, first as a 1d array, then as a 2d column array:

In [1829]: data[:,1]
Out[1829]: array([ 12.2 ,  12.77,  14.16,  13.71,  13.4 ,  13.27,  13.17,  14.13])

In [1830]: data[:,1:2]
Out[1830]: 
array([[ 12.2 ],
       [ 12.77],
       [ 14.16],
       [ 13.71],
       [ 13.4 ],
       [ 13.27],
       [ 13.17],
       [ 14.13]])
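
The difference between these index forms is easiest to see in the resulting shapes. The following sketch uses a small made-up array (the 4x3 data below is an assumption for illustration, not the wine data).

```python
import numpy as np

# Hypothetical 4x3 array: values 0..11 laid out row by row.
data = np.arange(12, dtype=float).reshape(4, 3)

row_slice = data[1:2]    # rows 1 up to (not including) 2: still 2D
col_1d = data[:, 1]      # all rows, column 1: 1D
col_2d = data[:, 1:2]    # all rows, columns 1 up to 2: 2D column

print(row_slice.shape)   # (1, 3)
print(col_1d.shape)      # (4,)
print(col_2d.shape)      # (4, 1)
print(col_1d)            # [ 1.  4.  7. 10.]
```

An integer index drops that dimension, while a slice keeps it, so data[:, 1] is 1D and data[:, 1:2] stays 2D.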
