Create a new row in a dataframe only for non-NaN values in a column

Let's say I have a dataframe like this one:

      col1       col2         col3
0     data1   Completed       Fail
1     data2   Completed       NaN
2     data3   Completed    Completed
3     data4   Completed       NaN
4     data5      NaN          NaN

How can I add an extra row each time the value in col3 is not NaN, and end up with a dataframe like this:

      col1     status           
0     data1   Completed 
1     data1      Fail
2     data2   Completed     
3     data3   Completed    
4     data3   Completed
5     data4   Completed      
6     data5      NaN        

I tried this, but I'm not getting the desired output:

df  = df.melt(id_vars=['col1'],  
        value_name="status")

IIUC, you can start by using pd.melt() as you already did, but also drop all the null values by chaining dropna(). This will get you close, but not exactly where you want to be:

new = df.melt(id_vars='col1',value_name='status').sort_values(by='col1').dropna().drop('variable',axis=1)

>>> print(new)

    col1     status
0  data1  Completed
5  data1       Fail
1  data2  Completed
2  data3  Completed
7  data3  Completed
3  data4  Completed
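
To reproduce this step from scratch, here is a minimal sketch. The construction of df is an assumption reconstructed from the table in the question, and kind="stable" is my addition so that rows sharing the same col1 keep their col2-before-col3 order (the default sort algorithm does not guarantee that):

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumed construction).
df = pd.DataFrame({
    "col1": ["data1", "data2", "data3", "data4", "data5"],
    "col2": ["Completed", "Completed", "Completed", "Completed", np.nan],
    "col3": ["Fail", np.nan, "Completed", np.nan, np.nan],
})

# Long format: one row per (col1, source column) pair.
# Rows with a missing status are dropped, then rows are grouped by col1.
new = (
    df.melt(id_vars="col1", value_name="status")
      .dropna(subset=["status"])
      .sort_values(by="col1", kind="stable")
      .drop(columns="variable")
)
print(new)
```

Note that data5 is missing here, because its only status (from col2) was NaN and dropna() removed it.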

At this point, you need to bring back the rows from your original df that were NaN in col2. You can do that using isnull() to select them and pd.concat() to append them:

col2_nan = df.loc[df.col2.isnull()].drop('col3',axis=1).rename(columns = {'col2':'status'})

>>> print(pd.concat([new,col2_nan]).reset_index(drop=True))


    col1     status
0  data1  Completed
1  data1       Fail
2  data2  Completed
3  data3  Completed
4  data3  Completed
5  data4  Completed
6  data5        NaN
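
As a possible alternative (not the approach shown above), the same result can be reached in one pass by filtering the melted frame with a boolean mask instead of dropping and re-adding rows. This is only a sketch: df is again an assumed reconstruction of the question's data, and the names melted and keep are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumed construction).
df = pd.DataFrame({
    "col1": ["data1", "data2", "data3", "data4", "data5"],
    "col2": ["Completed", "Completed", "Completed", "Completed", np.nan],
    "col3": ["Fail", np.nan, "Completed", np.nan, np.nan],
})

melted = df.melt(id_vars="col1", value_name="status")

# Keep every col2 row (even when its status is NaN),
# plus only those col3 rows that actually carry a value.
keep = melted["variable"].eq("col2") | melted["status"].notna()

result = (
    melted[keep]
    .sort_values(by="col1", kind="stable")  # stable: col2 row stays ahead of col3 row
    .drop(columns="variable")
    .reset_index(drop=True)
)
print(result)
```

One difference in behavior: this version keeps a row's col3 value even when its col2 is NaN, whereas the concat approach would replace such a row's statuses with the NaN from col2 plus the col3 value from the melt. For the sample data the two give the same seven rows.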
