Copy value from one column to another based on condition (using pandas)
I have a data frame as seen below...
Month JUN JUL AUG SOI_Final JUN_bool JUL_bool AUG_bool
Aug 1.2 0.8 0.1 NaN False False True
Aug 0.2 0 -2 NaN False False True
Jun 3.2 -2.5 0.6 NaN True False False
Jul 2.2 -0.7 -0.8 NaN False True False
What I'm trying to do is, for each row in the table, look up which month is in the 'Month' column and assign the appropriate value from columns JUN, JUL or AUG to 'SOI_Final'. For instance, if column 'Month' is 'Jun' for a given row, then 'SOI_Final' for that row will get the value from column 'JUN'. Here is the code I've got so far...
df_merged['JUN_bool'] = (df_merged['Month'] == 'Jun')
df_merged['JUL_bool'] = (df_merged['Month'] == 'Jul')
df_merged['AUG_bool'] = (df_merged['Month'] == 'Aug')
if df_merged['JUN_bool'] is True:
    df_merged['SOI_Final']=df_merged['JUN']
elif df_merged['JUL_bool'] is True:
    df_merged['SOI_Final']=df_merged['JUL']
elif df_merged['AUG_bool'] is True:
    df_merged['SOI_Final']=df_merged['AUG']
else:
    df_merged['SOI_Final']=np.NaN
My dataframe is only showing NaNs for 'SOI_Final' and is not picking up the correct value. I created a Boolean column for each of the 3 months, and the correct monthly value should only be copied over if the bool value is 'True'. Does anyone have any suggestions as to what I might be missing here?
Thanks, Jeff
The problem here is that each of the bool columns, e.g. df_merged['JUN_bool'], is a Series, so the comparison with the is operator never evaluates to True. Every branch is skipped, the else clause runs, and the whole 'SOI_Final' column is assigned NaN.
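To make this concrete, here is a minimal sketch of one possible fix, assuming the sample table from the question. It is not the stack approach described next, just a plain boolean-mask assignment with .loc. The loop and the names mask and jun_mask are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the sample table from the question (values copied from the post).
df_merged = pd.DataFrame({
    "Month": ["Aug", "Aug", "Jun", "Jul"],
    "JUN": [1.2, 0.2, 3.2, 2.2],
    "JUL": [0.8, 0.0, -2.5, -0.7],
    "AUG": [0.1, -2.0, 0.6, -0.8],
})

# `is` tests object identity, and a Series is never the singleton True,
# so this prints False no matter what the column contains.
jun_mask = df_merged["Month"] == "Jun"
print(jun_mask is True)

# Start from NaN, then fill only the rows selected by each month's mask.
df_merged["SOI_Final"] = np.nan
for month in ["Jun", "Jul", "Aug"]:
    mask = df_merged["Month"] == month
    # Assumption: the value column is the upper-cased month name.
    df_merged.loc[mask, "SOI_Final"] = df_merged.loc[mask, month.upper()]

print(df_merged)
```

Rows whose 'Month' matches none of the three labels simply keep NaN, which mirrors the else branch of the original code.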
If the month values line up with the column names, you can upper-case them and use the stack method. This only works if the index is unique. Here is a three-month example:
import numpy as np
import pandas as pd

np.random.seed(10)
months = np.random.choice(['Aug', 'Jun', 'Jul'], 100)
JUN = np.random.random(100)
JUL = np.random.random(100)
AUG = np.random.random(100)
index = list(range(1900, 2000))  # unique index, one label per row
data = pd.DataFrame(dict(months=months, JUN=JUN, JUL=JUL, AUG=AUG), index=index)
Then upper-case the months column and select the matching values with a boolean mask:
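# Upper-case the month labels so they match the column names.
data['months'] = data.months.str.upper()
# Stack the value columns into long form: one row per (index, column) pair.
df2 = data[['JUN', 'JUL', 'AUG']].stack().reset_index(level=1)
df2.rename(columns={0: 'month_value'}, inplace=True)
# Attach each original row's month label (aligned on the unique index).
df2['months'] = data['months']
# Keep only the rows whose column name equals the month label.
SOI = df2[df2['months'] == df2['level_1']].month_value
data['SOI'] = SOI
data.head(4)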
# months JUN JUL AUG SOI
# 1900 JUN 0.637952 0.933852 0.384843 0.637952
# 1901 JUN 0.372520 0.558900 0.820415 0.372520
# 1902 AUG 0.002407 0.672449 0.895022 0.895022
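As an alternative to stack, the same lookup can be sketched with NumPy integer indexing. This is a different technique from the one above, shown on a small hand-made frame whose values are made up for illustration. It assumes every label in months has a matching column; the names value_cols, col_pos and rows are hypothetical.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (values are invented, not from the post).
data = pd.DataFrame(
    {
        "months": ["JUN", "AUG", "JUL"],
        "JUN": [0.1, 0.2, 0.3],
        "JUL": [1.1, 1.2, 1.3],
        "AUG": [2.1, 2.2, 2.3],
    },
    index=[1900, 1901, 1902],
)

value_cols = ["JUN", "JUL", "AUG"]

# Position of each row's month within value_cols (-1 means "not found").
col_pos = pd.Index(value_cols).get_indexer(data["months"])
if (col_pos < 0).any():
    raise ValueError("some month labels have no matching column")

# Pick one value per row: element [i, col_pos[i]] of the value matrix.
rows = np.arange(len(data))
data["SOI"] = data[value_cols].to_numpy()[rows, col_pos]

print(data)
```

Unlike the stack version, this does not rely on the index being unique, because the selection is purely positional.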