How can I drop rows with certain values from a dataframe?
I'm taking two different datasets and merging them into a single data frame, but I need to take one of the columns ('Presunto Responsable') of the resulting data frame and remove the rows with the value 'Desconocido' in it.
This is my code so far:
import pandas as pd

#%% Get data
def getData(path_A, path_B):
    victims = pd.read_excel(path_A)
    dfv = pd.DataFrame(data=victims)
    cases = pd.read_excel(path_B)
    dfc = pd.DataFrame(data=cases)
    return dfv, dfc

#%% merge dataframes
def mergeData(data_A, data_B):
    data = pd.DataFrame()
    # merge dataframes, avoiding duplicated columns
    cols_to_use = data_B.columns.difference(data_A.columns)
    data = pd.merge(data_A, data_B[cols_to_use], left_index=True, right_index=True, how='outer')
    cols_at_end = ['Presunto Responsable']
    # Move 'Presunto Responsable' to the end of the dataframe
    data = data[[c for c in data if c not in cols_at_end]
                + [c for c in cols_at_end if c in data]]
    return data

#%% Drop 'Desconocido' values in 'Presunto Responsable'
def dropData(data):
    indexNames = data[data['Presunto Responsable'] == 'Desconocido'].index
    for c in indexNames:
        data.drop(indexNames , inplace=True)
    return data
The resulting dataframe still has the rows with 'Desconocido' values in them. What am I doing wrong?
You can just say:
data = data[data['Presunto Responsable'] != 'Desconocido']
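To see that one-liner in context, here is a minimal sketch on made-up data. Only the column name 'Presunto Responsable' and the value 'Desconocido' come from the question; the sample frame, the 'Caso' column, and the drop_data helper with its parameters are hypothetical and just for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the 'Presunto Responsable' column name
# and the 'Desconocido' value are taken from the question.
data = pd.DataFrame({
    "Caso": [1, 2, 3, 4],
    "Presunto Responsable": ["Agente A", "Desconocido", "Agente B", "Desconocido"],
})

def drop_data(data, column="Presunto Responsable", value="Desconocido"):
    """Return a new frame without the rows where `column` equals `value`."""
    # The comparison builds a boolean mask; indexing with it keeps only
    # the rows where the mask is True. The input frame is left untouched.
    return data[data[column] != value].reset_index(drop=True)

cleaned = drop_data(data)
print(cleaned)
```

Note that this returns a new dataframe rather than modifying the original in place, so the result has to be assigned to a variable (as in `cleaned = drop_data(data)`), otherwise the filtered rows are lost.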
Also, btw, when you do pd.read_excel() it creates a dataframe, so you don't need to then pass that into pd.DataFrame().
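For example, the loading step could probably be reduced to something like the sketch below. The name get_data and its parameter names are just a renamed variant of the question's getData, not anything prescribed by pandas, and the file paths in the usage comment are placeholders.

```python
import pandas as pd

def get_data(path_a, path_b):
    """Load both Excel files; pd.read_excel already returns DataFrames."""
    victims = pd.read_excel(path_a)  # already a DataFrame, no wrapping needed
    cases = pd.read_excel(path_b)
    return victims, cases

# Usage (placeholder paths, assumed to point at real .xlsx files):
# dfv, dfc = get_data("victims.xlsx", "cases.xlsx")
```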