Creating new column from existing columns

I have a DataFrame:

ID   P_1   P_2
1    NaN   NaN
2    124   342
3    NaN   234
4    123   NaN
5    2345  500

I want to create a new column named P_3 such that:

ID   P_1   P_2  P_3
1    NaN   NaN   NaN
2    124   342   342
3    NaN   234   234
4    123   NaN   123
5    2345  500   500

My conditions are:

if P_1 == NaN, then P_3 = P_2
if P_1 != NaN and P_2 != NaN, then P_3 = P_2
if P_2 == NaN, then P_3 = P_1

I applied the following code:

conditions = [
    (df['P_1'] == float('NaN')),
    (df['P_1'] != float('NaN')) & (df['P_2'] != float('NaN')),
    (df['P_1'] != float('NaN')) & (df['P_2'] == float('NaN'))
    ]

values = [df['P_2'], df['P_2'], df['P_1']]

df['P_3'] = np.select(conditions, values)

But it gives me the following error:

Length of values does not match length of index

In short, your conditions reduce to a single rule:

P_3 = P_2 if P_2 != NaN else P_1

combine_first: updates null elements with the value at the same position in the other Series (reference: pandas documentation).

>>> df["P_2"].combine_first(df["P_1"])
ID
1      NaN
2    342.0
3    234.0
4    123.0
5    500.0
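
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes ID is the index (as the output above suggests) and rebuilds the sample data by hand. The `alt` variable is only there for illustration: it shows that `fillna` with a Series should give the same values in this case.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame; using ID as the index is an assumption.
df = pd.DataFrame(
    {
        "P_1": [np.nan, 124, np.nan, 123, 2345],
        "P_2": [np.nan, 342, 234, np.nan, 500],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="ID"),
)

# Take P_2 where it is present, otherwise fall back to P_1.
df["P_3"] = df["P_2"].combine_first(df["P_1"])

# Illustrative alternative: fill the gaps in P_2 with the aligned P_1 values.
alt = df["P_2"].fillna(df["P_1"])

print(df)
```

Rows where both columns are NaN stay NaN, which matches the desired output for ID 1.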

Another approach:

In [93]: df
Out[93]:
       p1     p2
0     NaN    NaN
1   124.0  342.0
2     NaN  234.0
3   123.0    NaN
4  2345.0  500.0

In [94]: df['p3'] = df.p2

In [95]: df
Out[95]:
       p1     p2     p3
0     NaN    NaN    NaN
1   124.0  342.0  342.0
2     NaN  234.0  234.0
3   123.0    NaN    NaN
4  2345.0  500.0  500.0

In [96]: df.loc[df.p3.isna(), 'p3'] = df[df.p3.isna()]['p1']

In [97]: df
Out[97]:
       p1     p2     p3
0     NaN    NaN    NaN
1   124.0  342.0  342.0
2     NaN  234.0  234.0
3   123.0    NaN  123.0
4  2345.0  500.0  500.0
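
The np.select approach from the question can probably be kept as well, with one change. NaN never compares equal to anything, itself included, so `df['P_1'] == float('NaN')` is False on every row and `!=` is True on every row. The sketch below tests for missing values with `isna()` / `notna()` instead. The sample frame is rebuilt by hand, and collapsing the three conditions into two is my simplification, not the asker's code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "P_1": [np.nan, 124, np.nan, 123, 2345],
        "P_2": [np.nan, 342, 234, np.nan, 500],
    }
)

# Conditions are checked in order; the first one that is True wins.
conditions = [
    df["P_2"].notna(),  # P_2 present: use P_2
    df["P_1"].notna(),  # P_2 missing but P_1 present: use P_1
]
values = [df["P_2"], df["P_1"]]

# Rows matching neither condition (both NaN) get the default.
df["P_3"] = np.select(conditions, values, default=np.nan)

print(df)
```

For a two-way fallback like this one, combine_first is shorter. np.select is more useful when there are several unrelated conditions.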

