Fill empty columns with values from another column of another row based on an identifier

I am trying to fill a DataFrame that contains repeated elements, based on an identifier. My DataFrame is as follows:

   Code Value
0  SJHV   
1  SJIO    96B
2  SJHV    33C
3  CPO3    22A
4  CPO3    22A
5  SJHV    33C       #< -- Numbers stored as strings
6   TOY   
7   TOY             #< -- These aren't NaN, they are empty strings

I would like to remove the rows with an empty 'Value' only if a row with a non-empty 'Value' exists for the same 'Code'. To be clear, I would want my output to look like:

   Code Value
0  SJHV    33C
1  SJIO    96B
2  CPO3    22A      
3   TOY         

My attempt was as follows:

df['Value'].replace('', np.nan, inplace=True)

df2 = df.dropna(subset=['Value']).drop_duplicates('Code')

As expected, this code also drops the 'TOY' code, because all of its 'Value' entries are empty. Any suggestions?

If you sort by 'Value' in descending order, the empty strings go to the bottom, because an empty string compares lower than any non-empty string. Then you can just drop duplicates on 'Code' and keep the first row for each code.

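import pandas as pd

df = pd.DataFrame({
    'Code': ['SJHV', 'SJIO', 'SJHV', 'CPO3', 'CPO3', 'SJHV', 'TOY', 'TOY'],
    'Value': ['', '96B', '33C', '22A', '22A', '33C', '', ''],
})
df = (
    # Descending sort puts the empty strings after every non-empty value
    df.sort_values(by=['Value'], ascending=False)
      # Keep the first row per code (a non-empty one, if the code has any)
      .drop_duplicates(subset=['Code'], keep='first')
      # Restore the original row order
      .sort_index()
)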

Output

   Code Value
1  SJIO   96B
2  SJHV   33C
3  CPO3   22A
6   TOY      
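
If you also want the codes in their order of first appearance with a fresh 0..3 index, as in the desired output in the question, one possible alternative is a groupby. This is only a sketch and not part of the answer above: it assumes the same sample data, and `result` is just a name picked here for illustration. `GroupBy.first()` returns the first non-null value in each group, so the empty strings are turned into NaN first and put back afterwards.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question (empty strings, not NaN)
df = pd.DataFrame({
    'Code': ['SJHV', 'SJIO', 'SJHV', 'CPO3', 'CPO3', 'SJHV', 'TOY', 'TOY'],
    'Value': ['', '96B', '33C', '22A', '22A', '33C', '', ''],
})

result = (
    # Mark empty strings as missing so that first() can skip them
    df.assign(Value=df['Value'].replace('', np.nan))
      # sort=False keeps the codes in order of first appearance
      .groupby('Code', sort=False, as_index=False)['Value']
      # First non-null value per code; NaN if the code has none
      .first()
      # Codes with no value at all (here 'TOY') get the empty string back
      .fillna('')
)

print(result)
```

Unlike the sort-based version, this does not depend on how the strings in 'Value' compare with each other; it only depends on which values are empty.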
