Replacing numbers on the edges
I have a DataFrame that contains only 0s and 127s. As the example shows, the 127s are clustered together.
import pandas as pd

df = pd.DataFrame({'f1' : [0,0,0,0,0,0],
                   'f2' : [0,0,0,0,0,0],
                   'f3' : [0,0,127,127,0,0],
                   'f4' : [0,127,127,127,0,0],
                   'f5' : [0,127,127,127,127,0],
                   'f6' : [0,127,127,127,127,0],
                   'f7' : [0,0,127,127,127,0],
                   'f8' : [0,0,127,127,0,0],
                   'f9' : [0,0,127,0,0,0],
                   'f10' : [0,0,0,0,0,0]
                   })
   f1  f2   f3   f4   f5   f6   f7   f8   f9  f10
0   0   0    0    0    0    0    0    0    0    0
1   0   0    0  127  127  127    0    0    0    0
2   0   0  127  127  127  127  127  127  127    0
3   0   0  127  127  127  127  127  127    0    0
4   0   0    0    0  127  127  127    0    0    0
5   0   0    0    0    0    0    0    0    0    0
Given a list of numbers, num_of_cells_to_del, I want to clear that many cells in each column, taking them at random from the top or the bottom edge of the 127s.
num_of_cells_to_del = [0,0,0,1,1,2,2,1,0,0]
Result:
   f1  f2   f3   f4   f5   f6   f7   f8   f9  f10
0   0   0    0    0    0    0    0    0    0    0
1   0   0    0    0  127    0    0    0    0    0
2   0   0  127  127  127    0    0    0  127    0
3   0   0  127  127  127  127  127  127    0    0
4   0   0    0    0    0  127    0    0    0    0
5   0   0    0    0    0    0    0    0    0    0
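I don't fully understand your example. Do you want to place the 0s from the top down in each column, or from the left along each row? If the former, your expected result is incorrect; if the latter, num_of_cells_to_del does not contain enough values.
In any case, the code below covers both: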
import pandas as pd
df = pd.DataFrame({'f1' : [0,0,0,0,0,0],
                   'f2' : [0,0,0,0,0,0],
                   'f3' : [0,0,127,127,0,0],
                   'f4' : [0,127,127,127,0,0],
                   'f5' : [0,127,127,127,127,0],
                   'f6' : [0,127,127,127,127,0],
                   'f7' : [0,0,127,127,127,0],
                   'f8' : [0,0,127,127,0,0],
                   'f9' : [0,0,127,0,0,0],
                   'f10' : [0,0,0,0,0,0]
                   })
print(df)
   f1  f2   f3   f4   f5   f6   f7   f8   f9  f10
0   0   0    0    0    0    0    0    0    0    0
1   0   0    0  127  127  127    0    0    0    0
2   0   0  127  127  127  127  127  127  127    0
3   0   0  127  127  127  127  127  127    0    0
4   0   0    0    0  127  127  127    0    0    0
5   0   0    0    0    0    0    0    0    0    0
num_of_cells_to_del = [0,1,1,2,2,0]
# Positional slicing: clear the first n rows of the i-th column (end excluded).
for i, n in enumerate(num_of_cells_to_del):
    df.iloc[0:n, i] = 0
print(df)
   f1  f2   f3   f4   f5   f6   f7   f8   f9  f10
0   0   0    0    0    0    0    0    0    0    0
1   0   0    0    0    0  127    0    0    0    0
2   0   0  127  127  127  127  127  127  127    0
3   0   0  127  127  127  127  127  127    0    0
4   0   0    0    0  127  127  127    0    0    0
5   0   0    0    0    0    0    0    0    0    0
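# Label-based slicing: .loc includes the end label, so rows 0..n are cleared,
# one more than with .iloc above.
for i, c in enumerate(df.keys()):
    if i < len(num_of_cells_to_del):
        df.loc[0:num_of_cells_to_del[i], c] = 0
print(df)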
   f1  f2   f3   f4   f5   f6   f7   f8   f9  f10
0   0   0    0    0    0    0    0    0    0    0
1   0   0    0    0    0  127    0    0    0    0
2   0   0  127    0    0  127  127  127  127    0
3   0   0  127  127  127  127  127  127    0    0
4   0   0    0    0  127  127  127    0    0    0
5   0   0    0    0    0    0    0    0    0    0
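The two loops above give different results because of slice semantics: .iloc slices by position and excludes the end, while .loc slices by label and includes it. A minimal sketch on a toy Series (the name `s` and its data are made up for illustration):

```python
import pandas as pd

# Toy Series with the default integer index 0..5 (hypothetical example data).
s = pd.Series([127] * 6)

n = 2
by_position = s.iloc[0:n]  # positions 0 and 1: the end is excluded
by_label = s.loc[0:n]      # labels 0, 1 and 2: the end is included

print(len(by_position), len(by_label))  # prints: 2 3
```

So if exactly n cells should be cleared from the top, the positional form is probably the one to use.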
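import numpy as np

# Pick top or bottom at random for each column.
for i, c in enumerate(df.keys()):
    if i < len(num_of_cells_to_del):
        if np.random.rand() > 0.5:
            df.loc[0:num_of_cells_to_del[i], c] = 0
        elif num_of_cells_to_del[i] > 0:
            # .iloc, not .loc: a negative .loc bound is a label, not an offset from the end
            df.iloc[-num_of_cells_to_del[i]:, i] = 0
print(df)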
   f1  f2   f3   f4   f5   f6   f7   f8   f9  f10
0   0   0    0    0    0    0    0    0    0    0
1   0   0    0    0    0    0    0    0    0    0
2   0   0  127  127  127    0    0  127  127    0
3   0   0  127  127  127  127  127  127    0    0
4   0   0    0    0  127  127  127    0    0    0
5   0   0    0    0    0    0    0    0    0    0
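One caveat when clearing from the bottom: with the default integer index, a negative number passed to .loc is treated as a label rather than as an offset from the end, so positional slicing with .iloc is likely the safer choice. A small sketch on made-up data:

```python
import pandas as pd

# Hypothetical column with the default index 0..5.
s = pd.Series([0, 127, 127, 127, 127, 0])

n = 2
last_n = s.iloc[-n:]      # positional: the last two rows (index 4 and 5)
label_slice = s.loc[-n:]  # label-based: every label >= -2, i.e. all six rows

print(len(last_n), len(label_slice))
```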
My solution:
import numpy as np

for col, cells in zip(df.columns, num_of_cells_to_del):
    col_vals = df[col].to_numpy(copy=True)  # writable copy of the column
    non_zero = np.where(col_vals == 127)[0]  # find which indices have 127
    if len(non_zero) < cells:  # can't delete more than what's present!
        raise ValueError('Not enough 127 in the column!')
    if len(non_zero) == 0:
        continue
    # choose random indices to delete
    replace_indices = np.random.choice(non_zero, size=cells, replace=False)
    col_vals[replace_indices] = 0
    df[col] = col_vals
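For comparison, the edge-only behaviour described in the question (cells removed from the top or bottom end of each column's run of 127s) could be sketched roughly as follows. The function name `clear_from_edges` and its `fill` and `seed` parameters are hypothetical, and the sketch assumes the 127s in each column form a single contiguous run.

```python
import numpy as np
import pandas as pd


def clear_from_edges(df, counts, fill=127, seed=None):
    """Return a copy of df with counts[i] cells cleared from the edges of column i.

    Hypothetical helper: each removal takes one cell from the top or the
    bottom end of the column's run of `fill` values, chosen at random.
    """
    rng = np.random.default_rng(seed)
    out = df.copy()
    for col, n in zip(out.columns, counts):
        vals = out[col].to_numpy(copy=True)
        idx = np.flatnonzero(vals == fill)  # row positions holding `fill`
        if n > len(idx):
            raise ValueError(f"column {col!r} has only {len(idx)} cells to clear")
        top, bottom = 0, len(idx)  # idx[top:bottom] is what survives
        for _ in range(n):
            if rng.random() < 0.5:
                top += 1     # drop one cell from the top edge
            else:
                bottom -= 1  # drop one cell from the bottom edge
        vals[idx[:top]] = 0
        vals[idx[bottom:]] = 0
        out[col] = vals
    return out


df = pd.DataFrame({
    'f1': [0, 0, 0, 0, 0, 0],
    'f2': [0, 0, 0, 0, 0, 0],
    'f3': [0, 0, 127, 127, 0, 0],
    'f4': [0, 127, 127, 127, 0, 0],
    'f5': [0, 127, 127, 127, 127, 0],
    'f6': [0, 127, 127, 127, 127, 0],
    'f7': [0, 0, 127, 127, 127, 0],
    'f8': [0, 0, 127, 127, 0, 0],
    'f9': [0, 0, 127, 0, 0, 0],
    'f10': [0, 0, 0, 0, 0, 0],
})
num_of_cells_to_del = [0, 0, 0, 1, 1, 2, 2, 1, 0, 0]
result = clear_from_edges(df, num_of_cells_to_del, seed=0)
print(result)
```

Because the choice between top and bottom is made once per removed cell, a column can lose cells from both ends, as f7 does in the question's expected result. Passing a seed only makes the run repeatable.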