![](/img/trans.png)
[英]How to populate pandas DataFrame based on multiple columns and conditions?
[英]How to insert columns in a pandas dataframe based on multiple conditions?
我有一个看起来像的数据框:
df_test = pd.DataFrame({
0: ['Property1', 1.2, 1.5, 2.6],
1: ["Std", 0.1,0.01,0.02],
2: ["Rep", 3,3,3],
3: ["Property2", 3.1,18.2,10.66],
4: ["Property3", 22,33,44],
5: ["Rep", 3,3,3],
6: ["Property4", 6,12,14.23],
7: ["Property5", 3.1,18.2,10.66],
8: ["Std", 1,0.2,0.66],
})
0 1 2 3 4 5 6 7 8
0 Property1 Std Rep Property2 Property3 Rep Property4 Property5 Std
1 1.2 0.1 3 3.1 22 3 6 3.1 1
2 1.5 0.01 3 18.2 33 3 12 18.2 0.2
3 2.6 0.02 3 10.66 44 3 14.23 10.66 0.66
I want to insert columns if the value in df[1,i] is not ["Std", "Rep"] and df[1,i+1] is not "Std".
如果我遍历列(从末尾开始),它看起来像:
while i < j:
if df.loc[1,i] != "Std" and df.loc[1,i] != "Rep":
if df.loc[1,i+1] != "Std":
df.insert(i+1,"Std",np.nan)
df.columns = pd.RangeIndex(df.columns.size)
df.loc[1,i+1]="Std"
i+=1
j = len(df.columns)-1
我正在尝试使用 select 但我不知道如何设置依赖于两个连续列的条件,就像我在 while 循环中所做的那样。
预期结果:
0 1 2 3 4 5 6 7 8 9 \
0 Property1 Std Rep Property2 Std Property3 Std Rep Property4 Std
1 1.2 0.1 3 3.1 nan 22 nan 3 6 nan
2 1.5 0.01 3 18.2 nan 33 nan 3 12 nan
3 2.6 0.02 3 10.66 nan 44 nan 3 14.23 nan
10 11
0 Property5 Std
1 3.1 1
2 18.2 0.2
3 10.66 0.66
有没有办法矢量化这个循环?
让我们试试
s = df.iloc[0,:]
s = s.str.startswith('Property')&s.shift(-1).ne('Std')
out = pd.concat([df,pd.DataFrame([['Std',np.nan,np.nan,np.nan]]*s.sum(),
index = s.index[s],
columns=df.index).T],axis=1).sort_index(level=0,axis=1)
out
0 1 2 3 3 ... 5 6 6 7 8
0 Property1 Std Rep Property2 Std ... Rep Property4 Std Property5 Std
1 1.2 0.1 3 3.1 NaN ... 3 6 NaN 3.1 1
2 1.5 0.01 3 18.2 NaN ... 3 12 NaN 18.2 0.2
3 2.6 0.02 3 10.66 NaN ... 3 14.23 NaN 10.66 0.66
[4 rows x 12 columns]
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.