Replacing cell values in every row and column in Pandas using a for loop

Question

I'm using Python with pandas, but can also import any other library. My dataset has missing values (NaN) in thousands of rows in each column.

Example:

**Name,Type,Region...**
Oranges,Fruit,Western Europe  
NaN,NaN,NaN  
NaN,NaN,NaN  
Blueberry,berry,Eastern Europe  
NaN,NaN,NaN  
Raspberry,berry,Eastern Europe
NaN,NaN,NaN
NaN,NaN,NaN

We can assume that cells containing NaN can be rewritten to match the previous value, until a new non-NaN value is reached. Example:

**Name,Type,Region...**
Oranges,Fruit,Western Europe  
Oranges,Fruit,Western Europe 
Oranges,Fruit,Western Europe 
Blueberry,berry,Eastern Europe  
Blueberry,berry,Eastern Europe  
Raspberry,berry,Eastern Europe  
Raspberry,berry,Eastern Europe  
Raspberry,berry,Eastern Europe

How can I iterate over each row and each column to rewrite the NaN values to match the first non-NaN value before them?

Rules: if cell = NaN and previous_cell is not NaN, replace the value with previous_cell; if cell = NaN and previous_cell = NaN, continue (this covers the edge case where the whole column is empty); if cell = NaN, continue.
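The rules above can be read as an explicit loop over columns and rows. The following is only a minimal sketch, assuming the data is already loaded into a DataFrame; the helper name `fill_from_previous` and the small sample frame (including its `Empty` column) are hypothetical and not part of the original post. A cell-by-cell loop like this is likely to be slow on a large dataset.

```python
import numpy as np
import pandas as pd


def fill_from_previous(df: pd.DataFrame) -> pd.DataFrame:
    """Copy the last non-NaN value down each column, one cell at a time."""
    out = df.copy()  # leave the caller's DataFrame untouched
    for col_pos in range(out.shape[1]):
        previous = np.nan  # nothing seen yet in this column
        for row_pos in range(out.shape[0]):
            cell = out.iat[row_pos, col_pos]
            if pd.isna(cell):
                # Replace only when a non-NaN value has been seen above;
                # otherwise keep the NaN (covers an entirely empty column).
                if not pd.isna(previous):
                    out.iat[row_pos, col_pos] = previous
            else:
                previous = cell  # remember the latest real value
    return out


# Hypothetical sample mirroring the first rows of the example.
sample = pd.DataFrame(
    {
        "Name": ["Oranges", np.nan, np.nan, "Blueberry", np.nan],
        "Type": ["Fruit", np.nan, np.nan, "berry", np.nan],
        "Empty": [np.nan] * 5,  # a column with no values at all
    }
)
filled = fill_from_previous(sample)
print(filled)
```

The `previous` variable is reset for every column, so values never leak from one column into the next.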

I have a huge dataset, so this is not possible to do manually in the CSV file itself.

Nested query which does not work

Answer 1

You can use ffill, which is available in pandas, with apply on all columns:

df = df.apply(lambda col: col.ffill())  # forward-fill every column and assign the result back
df = df.ffill()  # equivalent shorthand for the whole DataFrame
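As a quick check of the forward-fill approach, the sketch below rebuilds the rows from the question and applies `DataFrame.ffill`. The sample frame is an assumption made for illustration, and the `Empty` column is added only to show what happens when a column has no values at all.

```python
import numpy as np
import pandas as pd

# Hypothetical sample based on the rows shown in the question.
df = pd.DataFrame(
    {
        "Name": ["Oranges", np.nan, np.nan, "Blueberry", np.nan,
                 "Raspberry", np.nan, np.nan],
        "Type": ["Fruit", np.nan, np.nan, "berry", np.nan,
                 "berry", np.nan, np.nan],
        "Region": ["Western Europe", np.nan, np.nan, "Eastern Europe", np.nan,
                   "Eastern Europe", np.nan, np.nan],
        "Empty": [np.nan] * 8,  # a column with no values at all
    }
)

# ffill returns a new DataFrame in which each NaN takes the nearest
# non-NaN value above it in the same column.
filled = df.ffill()
print(filled)
```

A column that is entirely NaN stays NaN, and any NaN that appears before the first real value in a column is also left alone, since there is nothing above it to copy.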

Solution 1
0 2022-12-22 15:18:59
