Compare data within a dataframe and add new column with modified data values in that column
Attached is my data frame. I want to compare the `SOI priority` column with the `%Stake` column and build a comment accordingly. I tried the code below.
treasury_shares['Priority comment']=""
temp=round(treasury_shares['%Stake'] * 100, 0)
treasury_shares['%Stake'] = round(treasury_shares['%Stake'] * 100, 0).astype(str) + "%"
# treasury_shares["%Stake"] = treasury_shares["%Stake"].str.replace(".0", "")
treasury_shares = treasury_shares.reindex(
    columns=["performance_id", "SOI priority", "Date", "issued_shares_as_reported",
             "share_level",
             "share_be", "%Stake", "Priority comment"])
if (temp > 10) & (treasury_shares['SOI priority'] == 1):
    treasury_shares['Priority comment'] = 'SOI' + treasury_shares['SOI priority'] + '&Stake>10'
I am getting the following error:
line 1329, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Attached is an image of the data frame:
+--------------+--------+------------------+
| SOI_priority | %Stake | Priority_comment |
+--------------+--------+------------------+
| 1            | 44%    | SOI1&Stake>10%   |
+--------------+--------+------------------+
import pandas as pd
import numpy as np
data = {
    'performance_id': ['ASD'],
    'SOI priority': ['1'],
    'Date': ['31-Mar-22'],
    'issued_shares_as_reported': ['6,06,13,663'],
    'share_level': ['2,55,85,542'],
    'share_be': ['3,42,28,121'],
    '%Stake': ['0.44'],
    'Priority': ['P1'],
    'Priority comment': ['SOI1 & Stake>10%'],
}
treasury_shares = pd.DataFrame(data)
treasury_shares['Priority comment'] = ""
temp = treasury_shares['%Stake'].astype(float) * 100
# print(temp)
# treasury_shares['%Stake'] = round(treasury_shares['%Stake'].astype(int) * 100, 0).astype(str) + "%"
# treasury_shares["%Stake"] = treasury_shares["%Stake"].str.replace(".0", "")
treasury_shares = treasury_shares.reindex(
    columns=["performance_id", "SOI priority", "Date", "issued_shares_as_reported",
             "share_level",
             "share_be", "%Stake", "Priority comment"])
# Build a conditional mask: 1 where both conditions hold, 0 otherwise
# (a boolean True/False mask would work as well)
treasury_shares['new_conditional'] = np.where(
    (temp > 10) &
    (treasury_shares['SOI priority'].astype('int32') == 1),
    1, 0
).astype('int32')
# Use the mask to fill 'Priority comment' where the condition holds;
# every other row keeps its current value
treasury_shares['Priority comment'] = np.where(
    treasury_shares['new_conditional'] == 1,
    'SOI' + treasury_shares['SOI priority'].astype('string') + '&Stake>10',
    treasury_shares['Priority comment'])
print(treasury_shares['Priority comment'])
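# A plain Python `if` expects a single True/False value, but the comparison
# below produces a boolean Series (one value per row), so pandas raises
# "The truth value of a Series is ambiguous". Use a vectorized function
# such as np.where instead of:
# if((temp>10)&(treasury_shares['SOI priority']==1)):
#     treasury_shares['Priority comment'] = 'SOI'+treasury_shares['SOI priority']+'&Stake>10'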
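As an alternative to the `np.where` approach above, the same idea can be sketched with a boolean mask and `.loc`. This is only a minimal sketch, not the one correct way: the two-row frame, the second row's values, and the names `df`, `stake_pct`, `soi`, `mask` and `error_text` are made up for illustration, and it assumes both columns arrive as strings, as in the sample data. It also shows where the ValueError from the question comes from.

```python
import pandas as pd

# Hypothetical two-row frame modelled on the question's columns
df = pd.DataFrame({
    "SOI priority": ["1", "2"],
    "%Stake": ["0.44", "0.05"],
})

# Convert the string columns to numbers before comparing
stake_pct = df["%Stake"].astype(float) * 100
soi = df["SOI priority"].astype(int)

# Element-wise comparison: one boolean per row
mask = (stake_pct > 10) & (soi == 1)

# Passing that Series to `if` reproduces the error from the question,
# because `if` needs exactly one True/False value
try:
    if mask:
        pass
except ValueError as exc:
    error_text = str(exc)

# Assign the comment only on the rows where the mask is True
df["Priority comment"] = ""
df.loc[mask, "Priority comment"] = (
    "SOI" + df.loc[mask, "SOI priority"] + "&Stake>10%"
)

# Optional: show the stake as a whole-number percentage string
df["%Stake"] = stake_pct.round().astype(int).astype(str) + "%"

print(df)
```

With `.loc` there is no need for the helper column `new_conditional`, and rows that do not match simply keep the empty string. If several different comments are needed (for example one per priority level), repeating the `.loc` assignment with a different mask for each case is one straightforward way to extend this.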