Read data into a numpy array

I have the file below:

label,feature
0,70 80 90 50 33 58 ...
2,53 56 84 56 25 12 ...
1,32 56 84 89 65 87 ...
...
2,56 48 57 56 99 22 ...
4,25 65 84 54 54 15 ...

I want the data to look like this:

Ytrain = [0,2,1,...2,4]  (int, ndarray)
Xtrain = [[70 80 90 50 33 58...],
          [53 56 84 56 25 12...],
          ...
          [25 65 84 54 54 15...]] (int, ndarray)

Here is my code:

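import numpy as np
import pandas as pd

data = pd.read_csv('train.csv')
# First column holds the integer labels
Ytrain = np.array(data.iloc[:, 0]).astype(int)
# Second column holds the space-separated features as one string per row
train = np.array(data.iloc[:, 1:]).astype(str)

Xtrain = []
for i in range(len(train)):
    # Split the feature string on whitespace and convert each token to int
    tmp = [int(x) for x in train[i][0].split()]
    Xtrain.append(tmp)
Xtrain = np.array(Xtrain)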

Is there a better way to do this?

Pass a regular-expression separator that matches both delimiters to read_csv, together with header=None and skiprows=1 so that the CSV header row is not read:

data = pd.read_csv('train.csv', sep=r"[,\s]+", header=None, skiprows=1, engine='python')
print (data)
   0   1   2   3   4   5   6
0  0  70  80  90  50  33  58
1  2  53  56  84  56  25  12
2  1  32  56  84  89  65  87
3  2  56  48  57  56  99  22
4  4  25  65  84  54  54  15

Finally, select the columns with iloc:

Ytrain = data.iloc[:,0].values
Xtrain = data.iloc[:,1:].values
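
To try this approach without a file on disk, the same call can be run against an in-memory buffer. This is only a minimal sketch: `sample` is a hypothetical stand-in for train.csv that assumes just the five rows and six features shown in the output above, and the separator is written as the raw string r"[,\s]+" (one or more commas or whitespace characters), which is an assumption about the intended pattern.

```python
import io

import pandas as pd

# Hypothetical stand-in for train.csv (five rows, six features each)
sample = """label,feature
0,70 80 90 50 33 58
2,53 56 84 56 25 12
1,32 56 84 89 65 87
2,56 48 57 56 99 22
4,25 65 84 54 54 15
"""

# A regex separator requires the python parsing engine
data = pd.read_csv(
    io.StringIO(sample),
    sep=r"[,\s]+",
    header=None,   # do not treat any row as column names
    skiprows=1,    # skip the "label,feature" header line
    engine="python",
)

Ytrain = data.iloc[:, 0].to_numpy()   # labels, one per row
Xtrain = data.iloc[:, 1:].to_numpy()  # features, one row per sample
print(Ytrain.shape, Xtrain.shape)
```

Note that this only works when every row has the same number of features; rows of unequal length would need separate handling.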

Or use str.split with expand=True to get a DataFrame:

data = pd.read_csv('train.csv')
Ytrain = data.iloc[:,0].values.astype(int)
Xtrain = data.iloc[:,1].str.split(expand=True).values.astype(int)

print (Ytrain)
[0 2 1 2 4]

print (Xtrain)
[[70 80 90 50 33 58]
 [53 56 84 56 25 12]
 [32 56 84 89 65 87]
 [56 48 57 56 99 22]
 [25 65 84 54 54 15]]

You can use numpy for this. Since you have multiple delimiters, a little more work is required.

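import io
import numpy as np

# Replace commas with spaces so only one (whitespace) delimiter remains
with open('train.csv', 'r') as f:
    s = f.read().replace(',', ' ')

# genfromtxt needs a file-like object, not the text itself; skip the header row
arr = np.genfromtxt(io.StringIO(s), skip_header=1, dtype=int)

Ytrain = arr[:, 0]
Xtrain = arr[:, 1:]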
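
A possible variation on the numpy approach is to replace the commas line by line and hand the resulting generator to np.genfromtxt, which avoids building one large string first. This is a sketch under assumptions: `sample` is a hypothetical stand-in for train.csv with five rows of six features, and the io.StringIO object plays the role of the opened file.

```python
import io

import numpy as np

# Hypothetical stand-in for train.csv (five rows, six features each)
sample = """label,feature
0,70 80 90 50 33 58
2,53 56 84 56 25 12
1,32 56 84 89 65 87
2,56 48 57 56 99 22
4,25 65 84 54 54 15
"""

# With a real file this would be: with open('train.csv') as f:
with io.StringIO(sample) as f:
    # Lazily turn "0,70 80 ..." into "0 70 80 ..." one line at a time
    lines = (line.replace(",", " ") for line in f)
    # skip_header=1 drops the "label feature" line; dtype=int keeps integers
    arr = np.genfromtxt(lines, skip_header=1, dtype=int)

Ytrain = arr[:, 0]   # first column: labels
Xtrain = arr[:, 1:]  # remaining columns: features
print(Ytrain.shape, Xtrain.shape)
```

The array has to be built inside the with block, because the generator reads from the file object only when genfromtxt consumes it.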
