How to insert columns in a pandas dataframe based on multiple conditions?

I have a dataframe that looks like this:

import numpy as np
import pandas as pd

# Row 0 holds the labels; the remaining rows hold the values.
df_test = pd.DataFrame({
    0: ["Property1", 1.2, 1.5, 2.6],
    1: ["Std", 0.1, 0.01, 0.02],
    2: ["Rep", 3, 3, 3],
    3: ["Property2", 3.1, 18.2, 10.66],
    4: ["Property3", 22, 33, 44],
    5: ["Rep", 3, 3, 3],
    6: ["Property4", 6, 12, 14.23],
    7: ["Property5", 3.1, 18.2, 10.66],
    8: ["Std", 1, 0.2, 0.66],
})
         0     1    2          3          4    5          6          7     8
0  Property1   Std  Rep  Property2  Property3  Rep  Property4  Property5   Std
1        1.2   0.1    3        3.1         22    3          6        3.1     1
2        1.5  0.01    3       18.2         33    3         12       18.2   0.2
3        2.6  0.02    3      10.66         44    3      14.23      10.66  0.66
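I want to insert a "Std" column after column i whenever the label in row 0 of column i is neither "Std" nor "Rep" and the label of column i+1 is not "Std".

If I loop over the columns, it looks like this:

df = df_test.copy()
i = 0
j = len(df.columns) - 1
while i < j:
    # Row 0 holds the labels, so the checks read from row 0.
    if df.loc[0, i] not in ("Std", "Rep") and df.loc[0, i + 1] != "Std":
        # New column: "Std" in the label row, NaN everywhere below.
        df.insert(i + 1, "Std", ["Std"] + [np.nan] * (len(df) - 1))
        # Renumber so that positions and column labels stay aligned.
        df.columns = pd.RangeIndex(df.columns.size)
    i += 1
    j = len(df.columns) - 1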

I am trying to use select, but I don't know how to write a condition that depends on two consecutive columns, the way I do in the while loop.

Expected result:

          0     1    2          3    4          5    6    7          8    9   \
0  Property1   Std  Rep  Property2  Std  Property3  Std  Rep  Property4  Std   
1        1.2   0.1    3        3.1  nan         22  nan    3          6  nan   
2        1.5  0.01    3       18.2  nan         33  nan    3         12  nan   
3        2.6  0.02    3      10.66  nan         44  nan    3      14.23  nan   

          10    11  
0  Property5   Std  
1        3.1     1  
2       18.2   0.2  
3      10.66  0.66  

Is there a way to vectorize this loop?

Let's try:

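# Row 0 holds the labels.
s = df.iloc[0, :]
# True where a "Property" column is not immediately followed by "Std".
s = s.str.startswith('Property') & s.shift(-1).ne('Std')
# One new column per flagged position, reusing that position's column label.
new_cols = pd.DataFrame([['Std', np.nan, np.nan, np.nan]] * s.sum(),
                        index=s.index[s], columns=df.index).T
# A stable sort keeps each new column right after the original with the same label.
out = pd.concat([df, new_cols], axis=1).sort_index(axis=1, kind='mergesort')
out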
           0     1    2          3    3  ...    5          6    6          7     8
0  Property1   Std  Rep  Property2  Std  ...  Rep  Property4  Std  Property5   Std
1        1.2   0.1    3        3.1  NaN  ...    3          6  NaN        3.1     1
2        1.5  0.01    3       18.2  NaN  ...    3         12  NaN       18.2   0.2
3        2.6  0.02    3      10.66  NaN  ...    3      14.23  NaN      10.66  0.66
[4 rows x 12 columns]
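
The result above still carries duplicate column labels (3, 3, ... 6, 6), while the expected result in the question is numbered 0 to 11. Below is a minimal sketch of the same idea that spells out the two-column condition and then renumbers the columns. The names labels, needs_std and filler are illustrative and do not come from the original post. It also assumes a trailing property column should receive a "Std" column too, which the while loop in the question would not do because it never checks the last column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    0: ["Property1", 1.2, 1.5, 2.6],
    1: ["Std", 0.1, 0.01, 0.02],
    2: ["Rep", 3, 3, 3],
    3: ["Property2", 3.1, 18.2, 10.66],
    4: ["Property3", 22, 33, 44],
    5: ["Rep", 3, 3, 3],
    6: ["Property4", 6, 12, 14.23],
    7: ["Property5", 3.1, 18.2, 10.66],
    8: ["Std", 1, 0.2, 0.66],
})

labels = df.iloc[0]                          # row 0 holds the labels
is_property = ~labels.isin(["Std", "Rep"])   # condition on column i
next_not_std = labels.shift(-1).ne("Std")    # condition on column i + 1
needs_std = is_property & next_not_std       # both conditions at once

# One filler column per flagged position: "Std" on top, NaN below.
# Each filler reuses the label of the column it should follow.
filler = pd.DataFrame(
    {col: ["Std"] + [np.nan] * (len(df) - 1)
     for col in needs_std[needs_std].index},
    index=df.index,
)

# mergesort is stable, so an original column always stays ahead of
# the filler column that shares its label.
out = pd.concat([df, filler], axis=1).sort_index(axis=1, kind="mergesort")

# Renumber 0..n-1 to match the expected result in the question.
out.columns = pd.RangeIndex(out.shape[1])
print(out)
```

shift(-1) lines up each label with the label of the column to its right, which is what lets one boolean expression stand in for the df.loc[..., i] and df.loc[..., i+1] comparisons in the loop.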
