Create column in DataFrame1 based on values from DataFrame2
I have two DataFrames, and would like to create a new column in DataFrame 1 based on values from DataFrame 2.
I don't want to join the two DataFrames per se and make one big DataFrame, but rather use the second DataFrame simply as a look-up.
#Main Dataframe:
df1 = pd.DataFrame({'Size':["Big", "Medium", "Small"], 'Sold_Quantity':[10, 6, 40]})
#Lookup Dataframe
df2 = pd.DataFrame({'Size':["Big", "Medium", "Small"], 'Sold_Quantiy_Score_Mean':[10, 20, 30]})
#Create column in Dataframe 1 based on lookup dataframe values (pseudocode):
df1['New_Column'] = when df1['Size'] = df2['Size'] and df1['Sold_Quantity'] < df2['Sold_Quantiy_Score_Mean'] then 'Below Average Sales' else 'Above Average Sales!' end
One approach is to use np.where:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantity': [10, 6, 40]})
df2 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantiy_Score_Mean': [10, 20, 30]})
condition = (df1['Size'] == df2['Size']) & (df1['Sold_Quantity'] < df2['Sold_Quantiy_Score_Mean'])
df1['New_Column'] = np.where(condition, 'Below Average Sales', 'Above Average Sales!')
print(df1)
Output
     Size  Sold_Quantity            New_Column
0     Big             10  Above Average Sales!
1  Medium              6   Below Average Sales
2   Small             40  Above Average Sales!
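One caveat with this approach: the comparison is row by row, so it relies on df1 and df2 listing the sizes in the same order with the same index. The sketch below uses a hypothetical reordered frame, `df1_shuffled` (not part of the question), to show what happens when that assumption does not hold.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of df1 whose rows are in a different order than df2.
df1_shuffled = pd.DataFrame({'Size': ["Small", "Big", "Medium"], 'Sold_Quantity': [40, 10, 6]})
df2 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantiy_Score_Mean': [10, 20, 30]})

# Both frames share the default 0..2 index, so row 0 is compared with row 0, and so on.
size_matches = df1_shuffled['Size'] == df2['Size']
condition = size_matches & (df1_shuffled['Sold_Quantity'] < df2['Sold_Quantiy_Score_Mean'])
labels = np.where(condition, 'Below Average Sales', 'Above Average Sales!')

print(size_matches.tolist())  # no row pairs up with its own Size
print(labels.tolist())        # so every row falls through to the "else" label
```

Here "Medium" sold 6 against a mean of 20, yet it is still labelled 'Above Average Sales!' because its row never lines up with the "Medium" row of df2. If the row order is not guaranteed, looking the mean up by Size is the safer route.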
Given that df2 is sort of like a lookup based on Size, it would make sense if its Size column was its index:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantity': [10, 6, 40]})
df2 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantiy_Score_Mean': [10, 20, 30]})
lookup = df2.set_index("Size")
You can then map the Sizes in df1 to their mean and compare each with the sold quantity:
is_below_mean = df1["Sold_Quantity"] < df1["Size"].map(lookup["Sold_Quantiy_Score_Mean"])
And finally, map the boolean values to the respective strings using np.where:
df1["New_Column"] = np.where(is_below_mean, 'Below Average Sales', 'Above Average Sales!')
df1:
     Size  Sold_Quantity            New_Column
0     Big             10  Above Average Sales!
1  Medium              6   Below Average Sales
2   Small             40  Above Average Sales!
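Put together, the lookup steps can be sketched as one self-contained script. The helper name `label_sales` and the extra, reordered rows in df1 are assumptions added for illustration; they are not from the question. They are there to show that the lookup works by Size value, not by row position.

```python
import numpy as np
import pandas as pd

def label_sales(sales, lookup_means):
    """Label each row by comparing its quantity with the mean for its Size.

    `label_sales` is a hypothetical helper, not a pandas function.
    """
    # Series.map looks up each Size in the index of lookup_means.
    means = sales["Size"].map(lookup_means)
    is_below_mean = sales["Sold_Quantity"] < means
    return np.where(is_below_mean, 'Below Average Sales', 'Above Average Sales!')

# Rows deliberately out of order, with "Big" appearing twice.
df1 = pd.DataFrame({'Size': ["Small", "Big", "Medium", "Big"], 'Sold_Quantity': [40, 10, 6, 3]})
df2 = pd.DataFrame({'Size': ["Big", "Medium", "Small"], 'Sold_Quantiy_Score_Mean': [10, 20, 30]})

# A Series indexed by Size acts as the lookup table.
lookup = df2.set_index("Size")["Sold_Quantiy_Score_Mean"]
df1["New_Column"] = label_sales(df1, lookup)
print(df1)
```

Note that a Size missing from the lookup maps to NaN, and a comparison with NaN is False, so such rows would end up as 'Above Average Sales!'. If that matters for your data, check for missing values after the map step.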