![](/img/trans.png)
[英]How to read specific column under a string from csv file using python?
[英]How to read a specific column of csv file using python
我是Scikit-Learn的新手,我想将已经标记的数据集转换为数据集。 我已将数据的.csv文件转换为NumPy数组,但是遇到的一个问题是根据第二列中标志的存在将数据分类为训练集。 我想知道如何使用Pandas Utility Module访问.csv文件的特定行,列。 以下是我的代码:
import numpy as np
import pandas as pd
import csv
import nltk
import pickle
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.naive_bayes import MultinomialNB,BernoulliNB
from nltk.classify import ClassifierI
from statistics import mode
def numpyfy(fileid):
data = pd.read_csv(fileid,encoding = 'latin1')
#pd.readline(data)
target = data["String"]
data1 = data.ix[1:,:-1]
#print(data)
return data1
def learn(fileid):
trainingsetpos = []
trainingsetneg = []
datanew = numpyfy(fileid)
if(datanew.ix['Status']==1):
trainingsetpos.append(datanew.ix['String'])
if(datanew.ix['Status']==0):
trainingsetneg.append(datanew.ix['String'])
print(list(trainingsetpos))
您可以使用布尔索引来拆分数据。 就像是
import pandas as pd
def numpyfy(fileid):
df = pd.read_csv(fileid, encoding='latin1')
target = df.pop('String')
data = df.ix[1:,:-1]
return target, data
def learn(fileid):
target, data = numpyfy(fileid)
trainingsetpos = data[data['Status'] == 1]
trainingsetneg = data[data['Status'] == 0]
print(trainingsetpos)
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.