How to remove 'None' from an appended multidimensional array using numpy

Question

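I need to take a csv file and import its data into a multidimensional array in Python, but I'm not sure how to strip the 'None' values out of the array after I've appended my data to an empty array.

I first created a structure like this: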

storecoeffs = numpy.empty((5,11), dtype='object')

This returns a 5-row by 11-column array filled with 'None'.

Next, I opened my csv file and converted it to an array:

coeffsarray = list(csv.reader(open("file.csv")))

coeffsarray = numpy.array(coeffsarray, dtype='object')

Then I appended the two arrays:

newmatrix = numpy.append(storecoeffs, coeffsarray, axis=1)

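The result is an array filled with 'None' values followed by the data I want (first two rows shown to give you an idea of the nature of my data):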

array([[None, None, None, None, None, None, None, None, None, None, None,
    workers, constant, hhsize, inc1, inc2, inc3, inc4, age1, age2,
    age3, age4],
   [None, None, None, None, None, None, None, None, None, None, None,
    w0, 7.334, -1.406, 2.823, 2.025, 0.5145, 0, -4.936, -5.054, -2.8, 0],
   ...], dtype=object)

How do I remove those 'None' objects from each row, so that what I'm left with is the 5 x 11 multidimensional array containing my data?

Answer 1

Why are you allocating an entire array of None and appending to it? Isn't coeffsarray already the array you want?

Edit

Oh. Use numpy.reshape.

import numpy
coeffsarray = numpy.reshape( coeffsarray, ( 5, 11 ) )
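
For what it's worth, if the None-padded array already exists, a plain slice drops the padding. This is only a minimal sketch, assuming the placeholder columns all come first (as in the question); the coeffsarray contents below are made-up stand-ins for the csv data:

```python
import numpy as np

# Made-up stand-in for the 5 x 11 array read from the csv file.
coeffsarray = np.array(
    [[str(r * 11 + c) for c in range(11)] for r in range(5)], dtype=object
)

# Reproduce the situation from the question: 11 columns of None, then the data.
storecoeffs = np.empty((5, 11), dtype=object)
newmatrix = np.append(storecoeffs, coeffsarray, axis=1)

# Keep only the columns that come after the placeholder block.
trimmed = newmatrix[:, storecoeffs.shape[1]:]

print(trimmed.shape)
print((trimmed == coeffsarray).all())
```

The trimmed array is element-for-element the same as coeffsarray, which is why the append step looks unnecessary in the first place.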

Answer 2

Start with an empty array?

storecoeffs = numpy.empty((5,0), dtype='object')
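
A small sketch of why this helps, assuming the csv data has 5 rows (the values below are made-up placeholders): an array with zero columns contributes nothing to the append, so no None values ever appear.

```python
import numpy as np

# Made-up stand-in for the csv contents (5 rows x 11 columns).
coeffsarray = np.array(
    [[f"r{r}c{c}" for c in range(11)] for r in range(5)], dtype=object
)

# Five rows, zero columns: there are no None placeholders to remove later.
storecoeffs = np.empty((5, 0), dtype=object)

newmatrix = np.append(storecoeffs, coeffsarray, axis=1)
print(newmatrix.shape)
```

Note that the row count of the empty array still has to match the data, otherwise the append along axis=1 fails.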

Answer 3

Why not simply use numpy.loadtxt():

newmatrix = numpy.loadtxt("file.csv", dtype='object', delimiter=',')

That should do the job, if I understand your question correctly.

Answer 4

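@Gnibbler's answer is technically correct, but there's no reason to create the initial storecoeffs array in the first place. Just load your values and then create an array from them. As @Mermoz pointed out, though, your use case looks simple enough for numpy.loadtxt().

Beyond that, why are you using an object array? It's probably not what you want... Right now, you're storing the numeric values as strings, not floats!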
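
A quick way to see the difference, using two made-up rows in place of file.csv (a sketch, not your actual data):

```python
import csv
import io

import numpy as np

# Two made-up rows standing in for the numeric part of file.csv.
rows = list(csv.reader(io.StringIO("7.334,-1.406\n2.823,2.025\n")))

as_objects = np.array(rows, dtype=object)  # each element is still a Python str
as_floats = np.array(rows, dtype=float)    # the strings are parsed into float64

print(type(as_objects[0, 0]))
print(as_objects[0, 0] + as_objects[0, 1])  # string concatenation, not addition
print(as_floats[0, 0] + as_floats[0, 1])    # real floating-point addition
```

With the object array, "adding" two cells just glues the text together, which is rarely what you want for coefficients.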

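You basically have two ways of handling data in numpy. If you want easy access to named columns, use a structured array (or a record array). If you want a "normal" multidimensional array, just use an array of floats, ints, etc. Object arrays have a specific purpose, but it's probably not what you're doing.

For example: to load the data as a normal 2D numpy array (assuming all of your data can easily be represented as floats):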

import numpy as np
# Note that this ignores your column names, and attempts to 
# convert all values to a float...
data = np.loadtxt('input_filename.txt', delimiter=',', skiprows=1)

# Access the first column 
workers = data[:,0]

To load the data as a structured array, you might do something like this:

import numpy as np
infile = open('input_filename.txt')

# Read in the names of the columns from the first row...
names = next(infile).strip().split(',')

# Make a dtype from these names...
dtype = {'names': names, 'formats': len(names) * [float]}

# Read the data in...
data = np.loadtxt(infile, dtype=dtype, delimiter=',')

# Note that data is now effectively 1-dimensional. To access a column,
# index it by name
workers = data['workers']

# Note that this is now one-dimensional... You can't treat it like a 2D array
# data[1:10, 3:5]  # <-- Raises an IndexError!

data[1:10][['inc1', 'inc2']]  # <-- Effectively the same thing, but works..

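If your data contains non-numeric values and you want to handle them as strings, you'll need to use a structured array, specify which fields should be strings, and set a maximum string length for those fields.

From your sample data, it looks like the first column, "workers", is a non-numeric value that you probably want to store as a string, while everything else looks like floats. In that case, you'd do something like this: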

import numpy as np
infile = open('input_filename.txt')
names = next(infile).strip().split(',')

# Create the dtype... The 'U10' indicates a string field with a length of 10
# (use 'S10' instead if you want byte strings)
dtype = {'names': names, 'formats': ['U10'] + (len(names) - 1) * [float]}
data = np.loadtxt(infile, dtype=dtype, delimiter=',')

# The "workers" field is now a string array
print(data['workers'])

# Compare this to the other fields
print(data['constant'])

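If you really do need the flexibility of the csv module (e.g. text fields that contain commas), you can use it to read the data and then convert the result into a structured array with the appropriate dtype.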
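
A rough sketch of that csv route, assuming the first column is text and the rest are floats. The in-memory text, the second data row, and the 'U20' width are made up for illustration; only the header names and the w0 values come from the question:

```python
import csv
import io

import numpy as np

# Hypothetical stand-in for the file. The quoted field contains a comma,
# which is the kind of input the csv module handles correctly.
text = io.StringIO(
    "workers,constant,hhsize,inc1\n"
    "w0,7.334,-1.406,2.823\n"
    '"w1, part-time",1.5,2.5,3.5\n'
)

reader = csv.reader(text)
names = next(reader)  # first row holds the column names

# First field is a string of up to 20 characters, the rest are floats.
dtype = [(names[0], "U20")] + [(name, float) for name in names[1:]]

# A structured array is built from a list of tuples, one tuple per row.
rows = [(row[0], *map(float, row[1:])) for row in reader]
data = np.array(rows, dtype=dtype)

print(data["workers"])
print(data["constant"])
```

Each row has to be a tuple (not a list) for numpy to treat it as one record of the structured dtype.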

Hope that makes things a bit clearer...
