
Pandas - Store output conditionally

I have a df.apply call that returns a Series of key/value pairs like the one below (let's call it column A):

         Col. A
40237    1.871111
40239    1.280556
40240    1.784167
40241    0.049167
40243    0.011389
40244    0.660278
40245    1.512500

I would like to save these values into separate columns, chosen by the value of another column (column B).

Column B can contain 6 distinct strings (4 in the example).

         Col. B
40237    Open
40239    Open
40240    In Progress
40241    Closed
40243    Waiting
40244    In Progress
40245    Waiting

I want to save each column A value into one of 4 columns based on the value of column B.

The end result should be:

         Col. A      Col. B        Open Time   In Progress Time  Closed Time Waiting Time
40237    1.871111    Open          1.871111    np.nan            np.nan      np.nan
40239    1.280556    Open          1.280556    np.nan            np.nan      np.nan
40240    1.784167    In Progress   np.nan      1.784167          np.nan      np.nan
40241    0.049167    Closed        np.nan      np.nan            0.049167    np.nan
40243    0.011389    Waiting       np.nan      np.nan            np.nan      0.011389
40244    0.660278    In Progress   np.nan      0.660278          np.nan      np.nan
40245    1.512500    Waiting       np.nan      np.nan            np.nan      1.512500

My best attempt so far was:

for key in output.index:
    df.loc[key,(df['Col. B'] + " Time")] = output.loc[key]

However, I get ValueError: cannot index with vector containing NA / NaN values. I'm not sure exactly why, though my columns in general do contain plenty of NaNs.

Use join with a pivoted DataFrame and add_suffix:

# pivot on the existing index: one column per distinct 'Col. B' value, then join back
df = df.join(df.pivot(columns='Col. B', values='Col. A').add_suffix(' Time'))
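
For a self-contained check, the sketch below rebuilds the sample data from the question and applies the pivot-and-join approach. The names `wide` and `result` are only illustrative, and it assumes a pandas version in which `DataFrame.pivot` accepts keyword arguments and falls back to the existing index when `index` is omitted.

```python
import pandas as pd

# Sample data copied from the question's tables.
idx = [40237, 40239, 40240, 40241, 40243, 40244, 40245]
df = pd.DataFrame(
    {
        "Col. A": [1.871111, 1.280556, 1.784167, 0.049167,
                   0.011389, 0.660278, 1.512500],
        "Col. B": ["Open", "Open", "In Progress", "Closed",
                   "Waiting", "In Progress", "Waiting"],
    },
    index=idx,
)

# One column per distinct 'Col. B' value; cells without a match become NaN.
wide = df.pivot(columns="Col. B", values="Col. A").add_suffix(" Time")

# Join the wide frame back onto the original rows by index.
result = df.join(wide)
print(result)
```

The new columns come out in sorted order (Closed, In Progress, Open, Waiting), not in order of first appearance.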

Another solution is to pivot with set_index and unstack:

df = df.join(df.set_index('Col. B', append=True)['Col. A'].unstack().add_suffix(' Time'))

print (df)
         Col. A       Col. B  Closed Time  In Progress Time  Open Time  \
40237  1.871111         Open          NaN               NaN   1.871111   
40239  1.280556         Open          NaN               NaN   1.280556   
40240  1.784167  In Progress          NaN          1.784167        NaN   
40241  0.049167       Closed     0.049167               NaN        NaN   
40243  0.011389      Waiting          NaN               NaN        NaN   
40244  0.660278  In Progress          NaN          0.660278        NaN   
40245  1.512500      Waiting          NaN               NaN        NaN   

       Waiting Time  
40237           NaN  
40239           NaN  
40240           NaN  
40241           NaN  
40243      0.011389  
40244           NaN  
40245      1.512500  
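
As for the error in the original loop: `df['Col. B'] + " Time"` is evaluated for the whole column on every iteration, so the column indexer passed to `.loc` is a full Series of labels, not a single label. If column B holds any NaN, that Series contains NaN too, which is most likely what triggers the ValueError. If you prefer explicit assignment over a pivot, here is a minimal sketch that builds one column per status with a boolean mask. The tiny frame and its NaN row are made up for illustration, and `output` stands in for the result of your `df.apply`.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row has no status in 'Col. B'.
df = pd.DataFrame(
    {"Col. B": ["Open", "In Progress", np.nan, "Waiting"]},
    index=[40237, 40240, 40241, 40243],
)
output = pd.Series([1.871111, 1.784167, 0.049167, 0.011389], index=df.index)

# For each non-missing status, keep output only where the row has that status.
for status in df["Col. B"].dropna().unique():
    df[status + " Time"] = output.where(df["Col. B"] == status)

print(df)
```

Rows whose status is missing simply get NaN in every new column, so no label is ever built from a NaN.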
