
Replace NaN values with specific text based on adjacent column in pandas dataframe

My data has some NaN values in its first column. To replace them, I want to look at the value in the adjacent column of the same row and fill in a specific value based on it. Here is the CSV data:

,Release,SubRelease,ReleaseDate,TypeofRelease,Package
0,LTE,TL101,2017-09-27,Major Update,2.0.4
1,LTE,TL101,2017-09-26,Normal Update,3.1
2,NaN,TL209,2017-09-25,Major Update,3.2
3,5G,5GS,2017-09-25,Delivery,1.1
4,NaN,5GM,2017-09-24,Release,1.0
5,LTE,FL18A,2017-09-23,Normal Update,3.0
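
To reproduce the problem, the CSV above can be loaded from an in-memory string. This is a minimal sketch: the variable names `csv_text` and `missing_rows` are chosen here for illustration, and it assumes the unnamed first column is meant to be the index.

```python
import io

import pandas as pd

csv_text = """,Release,SubRelease,ReleaseDate,TypeofRelease,Package
0,LTE,TL101,2017-09-27,Major Update,2.0.4
1,LTE,TL101,2017-09-26,Normal Update,3.1
2,NaN,TL209,2017-09-25,Major Update,3.2
3,5G,5GS,2017-09-25,Delivery,1.1
4,NaN,5GM,2017-09-24,Release,1.0
5,LTE,FL18A,2017-09-23,Normal Update,3.0
"""

# index_col=0 uses the unnamed first column as the index.
# read_csv treats the literal text "NaN" as a missing value by default.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Index labels of the rows whose Release value is missing.
missing_rows = df.index[df["Release"].isna()].tolist()
print(missing_rows)  # [2, 4]
```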

For example, there is a NaN value in the 3rd row of Release. I want to look at the SubRelease column of the same row and, since the value there is "TL209", replace the NaN with "LTE". Similarly, if the value in the SubRelease column is "5G19", I want to replace the NaN in Release with "5G".

The first thing that comes to mind is using a regex to check whether the value in the SubRelease column contains or begins with the text "5G", but I don't know how to implement this. Is there a better approach?


You can just do:

df["Release"] = df["Release"].fillna(df["SubRelease"])
df
>>>   Release SubRelease ReleaseDate  TypeofRelease Package
0     LTE      TL101  2017-09-27   Major Update   2.0.4
1     LTE      TL101  2017-09-26  Normal Update     3.1
2   TL209      TL209  2017-09-25   Major Update     3.2
3      5G        5GS  2017-09-25       Delivery     1.1
4     5GM        5GM  2017-09-24        Release     1.0
5     LTE      FL18A  2017-09-23  Normal Update     3.0

Edit: I misread the question, so I forgot this last step:

df = df.replace({"Release":{"TL209":"LTE", "5GM":"5G", "5GS":"5G", ...}})
>>>   Release SubRelease ReleaseDate  TypeofRelease Package
0     LTE      TL101  2017-09-27   Major Update   2.0.4
1     LTE      TL101  2017-09-26  Normal Update     3.1
2     LTE      TL209  2017-09-25   Major Update     3.2
3      5G        5GS  2017-09-25       Delivery     1.1
4      5G        5GM  2017-09-24        Release     1.0
5     LTE      FL18A  2017-09-23  Normal Update     3.0
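
The two steps above can also be folded into one by mapping SubRelease to its Release before filling. This is only a sketch: `sub_to_release` is a hypothetical lookup table covering just the sub-releases visible in the sample data, and any sub-release missing from it leaves the NaN in place.

```python
import numpy as np
import pandas as pd

# Only the two columns relevant to the fill are reproduced here.
df = pd.DataFrame({
    "Release": ["LTE", "LTE", np.nan, "5G", np.nan, "LTE"],
    "SubRelease": ["TL101", "TL101", "TL209", "5GS", "5GM", "FL18A"],
})

# Hypothetical lookup table: SubRelease -> Release.
sub_to_release = {"TL209": "LTE", "5GM": "5G", "5GS": "5G"}

# map() returns NaN for sub-releases that are not in the table, so
# fillna() only fills rows that are missing AND have a known mapping.
df["Release"] = df["Release"].fillna(df["SubRelease"].map(sub_to_release))
print(df["Release"].tolist())  # ['LTE', 'LTE', 'LTE', '5G', '5G', 'LTE']
```

Compared with filling first and replacing afterwards, this never writes a SubRelease value into Release, so an existing Release that happens to equal a SubRelease name is not rewritten by accident.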

To replace all NaNs in the Release column depending on the values in SubRelease, you can first find, e.g., all '5G' sub-releases and replace those NaNs. Further conditions can be handled the same way. Finally, replace any remaining NaNs with the default value (here 'LTE').

This can be done using loc together with an appropriate mask:

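# Fill missing Release values whose SubRelease contains '5G' (na=False keeps the mask boolean)
df.loc[df['Release'].isna() & df['SubRelease'].str.contains('5G', na=False), 'Release'] = '5G'
# Fill the remaining NaNs in Release only, leaving other columns untouched
df['Release'] = df['Release'].fillna('LTE')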

Result:

  Release SubRelease ReleaseDate  TypeofRelease Package
0     LTE      TL101  2017-09-27   Major Update   2.0.4
1     LTE      TL101  2017-09-26  Normal Update     3.1
2     LTE      TL209  2017-09-25   Major Update     3.2
3      5G        5GS  2017-09-25       Delivery     1.1
4      5G        5GM  2017-09-24        Release     1.0
5     LTE      FL18A  2017-09-23  Normal Update     3.0
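
If there are several conditions, the same mask idea can be driven from a table of regex rules, which also covers the "begins with" case from the question. This is a sketch under assumptions: the `rules` dictionary and its patterns (including the guess that "TL" and "FL" prefixes mean LTE) are hypothetical and not taken from the original data description.

```python
import numpy as np
import pandas as pd

# Only the two columns relevant to the fill are reproduced here.
df = pd.DataFrame({
    "Release": ["LTE", "LTE", np.nan, "5G", np.nan, "LTE"],
    "SubRelease": ["TL101", "TL101", "TL209", "5GS", "5GM", "FL18A"],
})

# Hypothetical rules: regex matched at the START of SubRelease -> Release.
rules = {r"5G": "5G", r"TL|FL": "LTE"}

for pattern, release in rules.items():
    # str.match anchors the pattern at the beginning of each string;
    # na=False treats a missing SubRelease as "no match".
    mask = df["Release"].isna() & df["SubRelease"].str.match(pattern, na=False)
    df.loc[mask, "Release"] = release

print(df["Release"].tolist())  # ['LTE', 'LTE', 'LTE', '5G', '5G', 'LTE']
```

Rows that match no rule stay NaN here, which makes unexpected sub-releases easy to spot; a final fillna with a default value can be added if that is preferred.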


