Python Pandas Create New Column With Values Coming From Other Columns Of The Same DF
It's been a long night searching for a solution, and I'd appreciate your help.
Given the following df:
proposal1_amount | proposal2_amount | proposal3_amount | accepted_proposal |
---|---|---|---|
1000 | 2000 | 3000 | 3 |
5000 | 5200 | 4000 | 2 |
3000 | 2400 | 1120 | 1 |
I need to build a new column holding the amount from the proposal column that accepted_proposal points to. It would look like this:
proposal1_amount | proposal2_amount | proposal3_amount | accepted_proposal | accepted_amount |
---|---|---|---|---|
1000 | 2000 | 3000 | 3 | 3000 |
5000 | 5200 | 4000 | 2 | 5200 |
1450 | 2400 | 1120 | 1 | 1450 |
I've found some examples that work fine when the new column has a fixed value, but in this case the value comes from another column of the same df.
Thanks, vv
Quickest solution I could think of:
df['accepted_amount'] = df.apply(lambda row: row.iloc[row['accepted_proposal']-1],axis=1)
Edit: Because I feel uneasy about the solution depending on the ordering of the columns, here's a slightly wordier yet more dynamic solution that names the proposal columns explicitly:
df['accepted_amount']=df.apply(lambda row: row[['proposal1_amount','proposal2_amount','proposal3_amount']].iloc[row['accepted_proposal']-1],axis=1)
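A related way to avoid depending on column order is to build the column label from the accepted proposal number. The sketch below uses the question's input data and assumes the columns follow the "proposalN_amount" naming pattern. The helper name accepted_amount is made up for illustration.

```python
import pandas as pd

# Sample data taken from the question's input table.
df = pd.DataFrame({
    "proposal1_amount": [1000, 5000, 3000],
    "proposal2_amount": [2000, 5200, 2400],
    "proposal3_amount": [3000, 4000, 1120],
    "accepted_proposal": [3, 2, 1],
})

def accepted_amount(row):
    # Hypothetical helper: turn the proposal number into a column label,
    # e.g. 3 -> "proposal3_amount", then read that cell from the row.
    # int() guards against the number arriving as a float such as 3.0.
    label = f"proposal{int(row['accepted_proposal'])}_amount"
    return row[label]

df["accepted_amount"] = df.apply(accepted_amount, axis=1)
print(df["accepted_amount"].tolist())
```

Because the lookup is by label, reordering or inserting other columns should not change the result, as long as the naming pattern holds.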
You can use numpy.choose to do this pretty easily.
print(df)
   proposal1_amount  proposal2_amount  proposal3_amount  accepted_proposal
0              1000              2000              3000                  3
1              5000              5200              4000                  2
2              3000              2400              1120                  1
import numpy as np

# create 2d array of our choices (which corresponds to our amounts);
# transpose it so each entry of choices is one "proposalN_amount" column,
# which is the layout np.choose expects (without .T this only works by
# coincidence on a 3x3 frame)
choices = df.filter(regex=r"proposal\d_amount").to_numpy().T
# subtract 1 from "accepted_proposal" so they line up with indices in choices array
# (we want these 0-indexed, not 1-indexed)
a = df["accepted_proposal"] - 1
# np.choose does all the heavy lifting, assign output to new column
df["accepted_amount"] = np.choose(a, choices)
print(df)
   proposal1_amount  proposal2_amount  proposal3_amount  accepted_proposal  accepted_amount
0              1000              2000              3000                  3             3000
1              5000              5200              4000                  2             5200
2              3000              2400              1120                  1             3000
np.choose treats each "proposalN_amount" column as one choice and, for every row, takes the amount from the choice whose index equals accepted_proposal - 1. See the docs for np.choose.
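To make the selection rule concrete, here is a minimal sketch of np.choose on a frame that is deliberately not square. The fourth row is made-up data added for illustration. The choices array is transposed so that each choice is one full proposal column, which is what lets the index array and every choice share the same length.

```python
import numpy as np
import pandas as pd

# Four rows and three proposal columns; the last row is invented for this sketch.
df = pd.DataFrame({
    "proposal1_amount": [1000, 5000, 3000, 700],
    "proposal2_amount": [2000, 5200, 2400, 800],
    "proposal3_amount": [3000, 4000, 1120, 900],
    "accepted_proposal": [3, 2, 1, 2],
})

# Shape (3, 4): one entry per proposal column, each holding that column's values.
choices = df.filter(regex=r"proposal\d_amount").to_numpy().T
# 0-based index of the accepted proposal for every row.
a = df["accepted_proposal"].to_numpy() - 1

# For row j, np.choose returns choices[a[j]][j].
df["accepted_amount"] = np.choose(a, choices)
print(df["accepted_amount"].tolist())
```

Note that np.choose limits how many choices it accepts, so this approach suits a small number of proposal columns.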
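import pandas as pd

proposal1_amount = [1000, 5000, 1450]
proposal2_amount = [2000, 5200, 2400]
proposal3_amount = [3000, 4000, 1120]
accepted_proposal = [3, 2, 1]
df = pd.DataFrame({'proposal1_amount': proposal1_amount, 'proposal2_amount': proposal2_amount, 'proposal3_amount': proposal3_amount, 'accepted_proposal': accepted_proposal})
# make sure the proposal number is an integer so it can be used as a position
df['accepted_proposal'] = df['accepted_proposal'].astype(int)
# pick the value at position (accepted_proposal - 1) in each row;
# this relies on the proposal columns coming first, in order
df = df.assign(accepted_amount=df.apply(lambda row: row.iloc[row['accepted_proposal'] - 1], axis=1))
print(df)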
output:
   proposal1_amount  proposal2_amount  proposal3_amount  accepted_proposal
0              1000              2000              3000                  3
1              5000              5200              4000                  2
2              1450              2400              1120                  1

   accepted_amount
0             3000
1             5200
2             1450
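For larger frames, a row-wise apply can be slow. As an alternative technique (not one of the answers above), NumPy integer array indexing can pick one value per row in a single step. This is a sketch under the assumption that the proposal columns are listed explicitly. The names cols, amounts, rows and picked are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "proposal1_amount": [1000, 5000, 1450],
    "proposal2_amount": [2000, 5200, 2400],
    "proposal3_amount": [3000, 4000, 1120],
    "accepted_proposal": [3, 2, 1],
})

# Explicit column list, so the result does not depend on column order in df.
cols = ["proposal1_amount", "proposal2_amount", "proposal3_amount"]
amounts = df[cols].to_numpy()                      # shape (n_rows, 3)
rows = np.arange(len(df))                          # 0, 1, ..., n_rows - 1
picked = df["accepted_proposal"].to_numpy() - 1    # 0-based column position

# Pair each row number with its chosen column position.
df["accepted_amount"] = amounts[rows, picked]
print(df["accepted_amount"].tolist())
```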