How to replace a NaN value in one column based on the value of another column in the same row using Pandas?
There is a dataset of vehicles with columns such as type (sedan, SUV, truck, etc.), odometer, cylinders, and price. I am addressing the missing values in the column 'cylinders', which contains the number of cylinders in the vehicle's engine. My approach to filling in the missing values is to use the median number of cylinders per type of vehicle. Using a pivot table, it looks like this: [Screenshot of the pivot table]
Now I want to create a for loop that goes through every row and, when it finds a NaN value in the column 'cylinders', replaces it with the median value from the pivot table for that vehicle's type.
Thanks
Here is a for loop that goes through every row in your cars dataframe. When it finds a NaN value, it looks in your pivot_table and replaces the NaN with the Cylinders value for that particular car type.
import pandas as pd

# Fill each missing 'Cylinders' value with the median for that row's 'Type'.
for index, row in cars_table.iterrows():
    if pd.isnull(row['Cylinders']):
        # pivot_table is indexed by 'Type', so look the median up by label
        cars_table.loc[index, 'Cylinders'] = pivot_table.loc[row['Type'], 'Cylinders']
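If the row-by-row loop turns out to be slow on a large dataset, a vectorized alternative may be worth trying. The sketch below is not the approach from the answer above: it skips the pivot table and computes the per-type median directly with groupby and transform. The sample data is made up for illustration, and the column names 'Type' and 'Cylinders' are assumed to match your dataframe.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the vehicle dataset.
cars_table = pd.DataFrame({
    "Type": ["sedan", "sedan", "sedan", "SUV", "SUV", "truck", "truck"],
    "Cylinders": [4.0, 4.0, np.nan, 6.0, np.nan, 8.0, 8.0],
})

# Median cylinders per vehicle type, aligned row by row with cars_table.
# NaN values are skipped when the median is computed.
median_by_type = cars_table.groupby("Type")["Cylinders"].transform("median")

# Only the NaN entries are replaced; existing values are left untouched.
cars_table["Cylinders"] = cars_table["Cylinders"].fillna(median_by_type)

print(cars_table)
```

Note that if every row of some type has a missing 'Cylinders' value, the median for that type is itself NaN, so those rows stay unfilled and need a separate fallback.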