Selecting rows based on certain column values returns empty dataframe
I want to select rows from a dataframe based on the different values of one column, and then plot histograms from them.
import numpy as np
import pandas as pd
import csv
import matplotlib.pyplot as plt
df_train=pd.read_csv(r'C:\users\visha\downloads\1994_census\adult.data')
df_train.columns = ["age", "workclass", "fnlwgt", "education",
"educationnum", "maritalstatus", "occupation",
"relationship", "race", "sex", "capitalgain",
"capitalloss", "hoursperweek", "nativecountry",
"incomelevel"]
# dropna() and .loc return new objects, so assign the results back
df_train = df_train.dropna(how='any')
df_train = df_train.loc[(df_train != 0).any(axis=1)]
#df_train.incomelevel = pd.to_numeric(df_train.incomelevel, errors='coerce').fillna(0).astype('Int64')
df_train.drop(columns='fnlwgt', inplace = True)
#df_test=pd.read_csv(r'C:\users\visha\downloads\1994_census\adult.test')
#df_train.boxplot(column = 'age', by = 'incomelevel', grid = False)
df_train.loc[df_train['incomelevel'] == '<=50K']
#df_train.loc[df_train['incomelevel'] == '>50K']
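Output:
Empty DataFrame
Columns: [age, workclass, fnlwgt, education, educationnum, maritalstatus, occupation, relationship, race, sex, capitalgain, capitalloss, hoursperweek, nativecountry, incomelevel]
Index: []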
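As the lines above show, I am trying to select the rows where the income level is "<=50K". The "incomelevel" column has the object dtype. When I print the result, it returns only the column names and reports the dataframe as "Empty". When I run it in a Jupyter notebook without the print function, it shows the dataframe with all the column names but nothing under them.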
You should call read_csv with skipinitialspace=True, because every value in the file is preceded by a space. With that option the filter works:
df = pd.read_csv('adult.data', header=None, skipinitialspace=True)
df.columns = ["age", "workclass", "fnlwgt", "education",
"educationnum", "maritalstatus", "occupation",
"relationship", "race", "sex", "capitalgain",
"capitalloss", "hoursperweek", "nativecountry",
"incomelevel"]
df = df[df['incomelevel']=='<=50K']
print(df.head())
   age         workclass  fnlwgt  education  educationnum       maritalstatus  ...     sex  capitalgain  capitalloss  hoursperweek  nativecountry  incomelevel
0   39         State-gov   77516  Bachelors            13       Never-married  ...    Male         2174            0            40  United-States        <=50K
1   50  Self-emp-not-inc   83311  Bachelors            13  Married-civ-spouse  ...    Male            0            0            13  United-States        <=50K
2   38           Private  215646    HS-grad             9            Divorced  ...    Male            0            0            40  United-States        <=50K
3   53           Private  234721       11th             7  Married-civ-spouse  ...    Male            0            0            40  United-States        <=50K
4   28           Private  338409  Bachelors            13  Married-civ-spouse  ...  Female            0            0            40           Cuba        <=50K
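To see why the original comparison matched nothing, it can help to look at the raw values. The sketch below is only an illustration: it uses a made-up two-row sample with three columns (not the real adult.data file) that follows the same "comma plus space" layout. It shows that the default parser keeps the leading space, and that stripping the string columns after loading is an alternative to skipinitialspace=True. Note that header=None, as used in the answer, also keeps the first record from being consumed as a header row.

```python
import io
import pandas as pd

# Hypothetical sample in the same layout as adult.data:
# no header row, and a space after every comma.
sample = (
    "39, State-gov, <=50K\n"
    "52, Self-emp-inc, >50K\n"
)
cols = ["age", "workclass", "incomelevel"]

# Default parsing keeps the leading space in each string value,
# so comparing against '<=50K' (no space) selects no rows.
raw = pd.read_csv(io.StringIO(sample), header=None, names=cols)
print(raw["incomelevel"].unique().tolist())
print(len(raw[raw["incomelevel"] == "<=50K"]))

# Option 1: drop the space while parsing.
clean = pd.read_csv(
    io.StringIO(sample), header=None, names=cols, skipinitialspace=True
)

# Option 2: strip the string columns after loading.
stripped = raw.copy()
for col in ["workclass", "incomelevel"]:
    stripped[col] = stripped[col].str.strip()

print(clean[clean["incomelevel"] == "<=50K"])
print(stripped[stripped["incomelevel"] == "<=50K"])
```

Checking df['incomelevel'].unique() is a quick first step whenever an equality filter on a text column unexpectedly returns an empty dataframe.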