Divide the array in one column by another column from a different dataframe in pandas
I have two DataFrames, df_1 and df_2:
df_1:
id   number  array_col
001  0       [0.084, 0.089, 0.047 ...]
002  0       [0.052, 0.036, 0.062 ...]
003  0       [0.087, 0.087, 0.051 ...]
.    .
.    .
100  0       [0.098, 0.089, 0.067 ...]
100 rows x 3 columns
df_2:
id   number  array_col
001  1       [0.012, 0.023, 0.034 ...]
001  2       [0.045, 0.056, 0.067 ...]
002  1       [0.078, 0.089, 0.091 ...]
002  2       [0.021, 0.032, 0.043 ...]
.    .
.    .
100  2       [0.054, 0.065, 0.076 ...]
200 rows x 3 columns
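My goal is to update array_col in df_2 for each unique id by dividing it by the array_col of the df_1 row with the same id. I have tried the following, but it does not seem to work: the column in df_2 is not updated.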
for unique_id in df_2['id'].unique():
    for unique_number in [1, 2]:
        df_2.loc[(df_2['id'] == unique_id) &
                 (df_2['number'] == unique_number)]['array_col'].values[0] =\
        df_2.loc[(df_2['id'] == unique_id) &
                 (df_2['number'] == unique_number)]['array_col'].values[0] /\
        df_1.loc[(df_1['id'] == unique_id) &
                 (df_1['number'] == 0)]['array_col'].values[0]
Any help would be greatly appreciated. Please let me know if you need any other information.
You can use apply() to call numpy.divide() on each pair of corresponding array_col lists:
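import numpy as np

# Index df_1 by id so each df_2 row can look up its divisor directly
df_1 = df_1.set_index('id')
# Divide each df_2 array element-wise by the df_1 array with the same id
df_2.array_col = df_2.apply(lambda row:
    np.divide(row.array_col, df_1.loc[row.id, 'array_col']),
    axis=1)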
# id number array_col
# 0 001 1 [0.14285714285714285, 0.25842696629213485, 0.7...
# 1 001 2 [0.5357142857142857, 0.6292134831460675, 1.425...
# 2 002 1 [1.5, 2.4722222222222223, 1.467741935483871]
# 3 002 2 [0.4038461538461539, 0.888888888888889, 0.6935...
# 4 100 2 [0.5510204081632653, 0.7303370786516854, 1.134...
Sample data for reference:
df_1 = pd.DataFrame({'id':['001','002','003','100'],'number':[0,0,0,0],'array_col':[[0.084,0.089,0.047]*200,[0.052,0.036,0.062]*200,[0.087,0.087,0.051]*200,[0.098,0.089,0.067]*200]})
df_2 = pd.DataFrame({'id':['001','001','002','002','100'],'number':[1,2,1,2,2],'array_col':[[0.012,0.023,0.034]*200,[0.045,0.056,0.067]*200,[0.078,0.089,0.091]*200,[0.021,0.032,0.043]*200,[0.054,0.065,0.076]*200]})
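As a variation on the apply() approach above, here is a minimal self-contained sketch that looks up each row's divisor with Series.map and then divides pair by pair. The tiny two-element arrays are made up for illustration and are not the asker's data, and the sketch assumes every id in df_2 also appears exactly once in df_1.

```python
import numpy as np
import pandas as pd

# Illustrative data only (not the original sample): short arrays so the
# result is easy to check by hand.
df_1 = pd.DataFrame({
    "id": ["001", "002"],
    "number": [0, 0],
    "array_col": [[2.0, 4.0], [5.0, 10.0]],
})
df_2 = pd.DataFrame({
    "id": ["001", "001", "002"],
    "number": [1, 2, 1],
    "array_col": [[1.0, 2.0], [4.0, 4.0], [10.0, 5.0]],
})

# For every row of df_2, fetch the df_1 array that has the same id.
# Assumption: ids in df_1 are unique and cover every id in df_2.
divisors = df_2["id"].map(df_1.set_index("id")["array_col"])

# Divide element-wise, pair by pair, and store plain lists back in df_2.
df_2["array_col"] = [
    np.divide(numerator, divisor).tolist()
    for numerator, divisor in zip(df_2["array_col"], divisors)
]

print(df_2)
```

Because the result is assigned back to df_2["array_col"] as a whole column, there is no chained indexing involved, which is the likely reason the loop in the question left df_2 unchanged.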
Check whether this works:
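# Rename df_1's array column so it does not clash with df_2's after the merge
gp_df1 = df_1[['id', 'array_col']].rename(columns={'array_col': 'array_col_1'})
df_2 = df_2.merge(gp_df1, on='id', how='left')
# Series.div cannot divide Python lists, so divide each pair with NumPy (needs: import numpy as np)
df_2['new_array_col'] = [np.divide(a, b) for a, b in zip(df_2.array_col, df_2.array_col_1)]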
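If every array has the same length, the division can also be done in one step on 2-D NumPy arrays instead of row by row. The following is only a sketch under that assumption, using made-up two-element arrays rather than the asker's data; it also assumes each id in df_2 exists exactly once in df_1.

```python
import numpy as np
import pandas as pd

# Illustrative data only: all arrays share the same length (assumption).
df_1 = pd.DataFrame({
    "id": ["001", "002"],
    "number": [0, 0],
    "array_col": [[2.0, 4.0], [5.0, 10.0]],
})
df_2 = pd.DataFrame({
    "id": ["001", "001", "002"],
    "number": [1, 2, 1],
    "array_col": [[1.0, 2.0], [4.0, 4.0], [10.0, 5.0]],
})

# Pick the df_1 array for each df_2 row; .loc raises KeyError on unknown ids.
lookup = df_1.set_index("id")["array_col"]
aligned = lookup.loc[df_2["id"]]

# Stack the lists into 2-D float arrays of shape (rows, array_length).
numerators = np.array(df_2["array_col"].tolist(), dtype=float)
denominators = np.array(aligned.tolist(), dtype=float)

# One vectorized division, then back to one list per row.
df_2["array_col"] = (numerators / denominators).tolist()

print(df_2)
```

This avoids a Python-level loop over rows, at the cost of requiring equal-length arrays; with ragged arrays the row-by-row division is the safer choice.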