Pandas: Conditionally insert rows into DataFrame while iterating through rows
While iterating through the rows of a specific column in a Pandas DataFrame, I would like to add a new row below the currently iterated row if the cell in that row meets a certain condition.
For example:
import pandas as pd

df = pd.DataFrame(data = {'A': [0.15, 0.15, 0.7], 'B': [1500, 1500, 7000]})
DataFrame:
      A     B
0  0.15  1500
1  0.15  1500
2  0.70  7000
Attempt:
y = 100 #An example scalar
i = 1
for x in df['A']:
    if x is not None: #Values in 'A' are filled atm, but not necessarily.
        df.loc[i] = [None, x*y] #Should insert None into 'A', and product into 'B'.
        df.index = df.index + 1 #Shift index? According to this S/O answer: https://stackoverflow.com/a/24284680/4909923
    i = i + 1
df.sort_index(inplace=True) #Sort index?
I haven't been able to succeed so far: the index numbering gets shifted so that it no longer starts at 0, and the rows do not seem to be inserted in an orderly way:
      A     B
3  0.15  1500
4   NaN    70
5  0.70  7000
I tried several variants of this, including applymap with a lambda function, but was not able to get it working.
Desired result:
      A     B
0  0.15  1500
1  None    15
2  0.15  1500
3  None    15
4  0.70  7000
5  None    70
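A side note on the `x is not None` test in the attempt: in a float column, pandas stores a missing value as NaN, so an identity check against None does not detect it. The sketch below illustrates this with hypothetical data (`df_missing` is not part of the original example) and compares the check with `pd.notna`.

```python
import numpy as np
import pandas as pd

# Hypothetical data: same shape as the example, but with one gap in 'A'.
df_missing = pd.DataFrame({'A': [0.15, np.nan, 0.7], 'B': [1500, 1500, 7000]})

# The gap is stored as NaN (a float), so `is not None` is True for every element.
identity_check = [x is not None for x in df_missing['A']]

# pd.notna treats both None and NaN as missing.
notna_check = [bool(pd.notna(x)) for x in df_missing['A']]

print(identity_check)  # [True, True, True]
print(notna_check)     # [True, False, True]
```

If the condition is meant to skip empty cells, `pd.notna(x)` (or `Series.notna()` on the whole column) is probably the safer test.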
I believe you can use:
import pandas as pd

df = pd.DataFrame(data = {'A': [0.15, 0.15, 0.7],
                          'B': [1500, 1500, 7000],
                          'C': [100, 200, 400]})
v = 100
L = []
for i, x in df.to_dict('index').items():
    # append the original row as a dictionary
    L.append(x)
    # append a new dictionary; the DataFrame constructor fills the missing keys ('B', 'C') with NaN
    L.append({'A': x['A'] * v})
df = pd.DataFrame(L)
print (df)
       A       B      C
0   0.15  1500.0  100.0
1  15.00     NaN    NaN
2   0.15  1500.0  200.0
3  15.00     NaN    NaN
4   0.70  7000.0  400.0
5  70.00     NaN    NaN
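The output above puts the product in 'A', whereas the question asks for an empty 'A' and the product in 'B'. The same list-of-dictionaries idea can be adapted as follows. This is only a sketch: it assumes the two-column frame from the question, uses `round` to turn the product into an integer, and applies `pd.notna` as the condition, none of which is prescribed by the answer itself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0.15, 0.15, 0.7], 'B': [1500, 1500, 7000]})
y = 100  # example scalar from the question

rows = []
for record in df.to_dict('records'):
    # keep the original row
    rows.append(record)
    # add a follow-up row only when 'A' holds a value
    if pd.notna(record['A']):
        rows.append({'A': np.nan, 'B': round(record['A'] * y)})

result = pd.DataFrame(rows)
print(result)
```

Building the list first and constructing the DataFrame once avoids modifying the frame while iterating over it, which is what caused the shifted index in the original attempt.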
It doesn't seem you need a manual loop here:
import numpy as np
import pandas as pd

df = pd.DataFrame(data = {'A': [0.15, 0.15, 0.7], 'B': [1500, 1500, 7000]})
y = 100
# copy the rows where 'A' is not null
df_extra = df.loc[df['A'].notnull()].copy()
# set 'A' to NaN and 'B' to the product A*y
df_extra = df_extra.assign(A=np.nan, B=(df_extra['A']*y).astype(int))
# shift the index by 0.5 so each new row sorts directly below its source row
df_extra.index += 0.5
# concatenate (DataFrame.append was removed in pandas 2.0), sort by index, reset the index
res = pd.concat([df, df_extra]).sort_index().reset_index(drop=True)
print(res)
      A     B
0  0.15  1500
1   NaN    15
2  0.15  1500
3   NaN    15
4  0.70  7000
5   NaN    70
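The loop-free approach can be wrapped in a small function and tried on data where 'A' has a gap. The sketch below is a variation, not the answer's exact method: `insert_scaled_rows` and `df_gap` are hypothetical names, the product is rounded before the integer cast, and a stable sort (`kind='mergesort'`) replaces the 0.5 index offset. It assumes the input has a unique index, such as the default one.

```python
import numpy as np
import pandas as pd

def insert_scaled_rows(df, y):
    """Hypothetical helper: below each row with a value in 'A',
    add a row holding NaN in 'A' and A*y in 'B'."""
    extra = df.loc[df['A'].notna()].copy()
    # compute 'B' before blanking 'A'
    extra['B'] = (extra['A'] * y).round().astype(int)
    extra['A'] = np.nan
    combined = pd.concat([df, extra])
    # a stable sort keeps each original row ahead of the extra row sharing its label
    return combined.sort_index(kind='mergesort').reset_index(drop=True)

# Hypothetical data: the middle row has no value in 'A', so no row is added below it.
df_gap = pd.DataFrame({'A': [0.15, np.nan, 0.7], 'B': [1500, 1500, 7000]})
res_gap = insert_scaled_rows(df_gap, 100)
print(res_gap)
```

With this input the result has five rows: a new row follows the first and third original rows, and the row with the missing 'A' is left on its own.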