Replace cell values for each row and each column in Pandas using for loop

I'm using Python with pandas, but can also import any other library. My dataset has missing values (NaN) in thousands of rows in each column.

Example:

**Name,Type,Region...**  
Oranges,Fruit,Western Europe  
NaN,NaN,NaN  
NaN,NaN,NaN  
Blueberry,berry,Eastern Europe  
NaN,NaN,NaN  
Raspberry,berry,Eastern Europe  
NaN,NaN,NaN  
NaN,NaN,NaN

We can assume that each NaN cell can be rewritten with the previous value in its column, until a new non-NaN value is reached. Example:

**Name,Type,Region...**  
Oranges,Fruit,Western Europe  
Oranges,Fruit,Western Europe  
Oranges,Fruit,Western Europe  
Blueberry,berry,Eastern Europe  
Blueberry,berry,Eastern Europe  
Raspberry,berry,Eastern Europe  
Raspberry,berry,Eastern Europe  
Raspberry,berry,Eastern Europe

How can I iterate over each row and each column to rewrite the NaN values so that they match the first non-NaN value before them?

Rules: if cell = NaN and previous_cell is not NaN, replace the value with previous_cell; if cell = NaN and previous_cell = NaN, continue (this covers the edge case where the whole column is empty); if cell is not NaN, continue.
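The rules above can be sketched as an explicit loop over columns and rows. This is only an illustration, not the asker's missing code: the helper name `fill_from_previous` and the all-NaN `Notes` column are made up for the example. On a large dataset the built-in `DataFrame.ffill()` should do the same job far faster.

```python
import numpy as np
import pandas as pd


def fill_from_previous(df: pd.DataFrame) -> pd.DataFrame:
    """Copy the last seen non-NaN value down each column, one cell at a time."""
    out = df.copy()
    for col_pos in range(out.shape[1]):
        previous = np.nan  # nothing seen yet in this column
        for row_pos in range(len(out)):
            cell = out.iloc[row_pos, col_pos]
            if pd.isna(cell):
                # NaN cell: replace it only if a non-NaN value came before.
                # If nothing came before, leave it (covers an all-empty column).
                if not pd.isna(previous):
                    out.iloc[row_pos, col_pos] = previous
            else:
                # Non-NaN cell: keep it and remember it as the new "previous".
                previous = cell
    return out


# Name column from the question, plus a hypothetical all-NaN column "Notes".
df = pd.DataFrame({
    "Name": ["Oranges", np.nan, np.nan, "Blueberry", np.nan,
             "Raspberry", np.nan, np.nan],
    "Notes": [np.nan] * 8,
})
filled = fill_from_previous(df)
print(filled)
```

The copy keeps the original frame untouched, so the result can be compared with the input before overwriting anything.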

I have a huge dataset, so it is not possible to do this manually in the CSV file itself.

Nested query which does not work

You can use apply with ffill on all columns; ffill is available in pandas:

df = df.apply(lambda col: col.ffill())  # forward-fill each column with its own last non-NaN value
df = df.ffill()  # equivalent shorthand for the whole DataFrame
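A minimal sketch of forward fill on the question's sample rows, assuming the file is read with `pd.read_csv` (which treats the literal text `NaN` as missing by default). The inline CSV text and the completely empty `Notes` column are assumptions added to show the all-empty edge case.

```python
import io

import pandas as pd

# Sample rows from the question as CSV text. "Notes" is a hypothetical
# column that is empty everywhere, to show the all-NaN edge case.
csv_text = """Name,Type,Region,Notes
Oranges,Fruit,Western Europe,NaN
NaN,NaN,NaN,NaN
NaN,NaN,NaN,NaN
Blueberry,berry,Eastern Europe,NaN
NaN,NaN,NaN,NaN
Raspberry,berry,Eastern Europe,NaN
NaN,NaN,NaN,NaN
NaN,NaN,NaN,NaN
"""

df = pd.read_csv(io.StringIO(csv_text))

# Each NaN takes the last non-NaN value above it in the same column.
# A column with no values at all has nothing to copy, so it stays NaN.
filled = df.ffill()
print(filled)
```

Any NaN that appears before the first real value in a column is also left as NaN, because forward fill only looks upward.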
