Combining 2 pandas dataframes into 1 multi-dimensional one
Dataframe 1 looks like this:
import pandas as pd

df1 = pd.DataFrame(
    {
        "Farm ID": ["1", "2", "2", "3", "3"],
        "Crop": ["Type A", "Type A", "Type B", "Type B", "Type B"],
        "Area": [8, 4, 2, 3, 5],
        "Diesel": [101, 215, 3, 0.6, 42],
    }
)
df1 = df1.set_index(['Farm ID', 'Crop'])
df1
Dataframe 2 looks like this:
df2 = pd.DataFrame(
    {
        "Name": ["Area", "Diesel"],
        "GHG": [690, 8.5],
        "LU": [2.2, 0.3],
    }
)
df2 = df2.set_index('Name')
df2
I now need to combine both so that I get the following, where each value in df1 is multiplied by the factors in the matching row of df2:
                           GHG       LU
Farm ID Crop   Name
1       Type A Area      8*690    8*2.2
               Diesel  101*8.5  101*0.3
2       Type A Area      4*690    4*2.2
               Diesel  215*8.5  215*0.3
        Type B Area    ....
Any suggestions are welcome, as I am completely clueless. I am also open to ideas if there are better ways to structure this. I will have to do further analysis (e.g. aggregation by crop type or name, and similar) on the resulting dataframe, and I might be overcomplicating things. Thanks a lot!
We can use stack:
s = df1.stack()
out = df2.reindex(s.index.get_level_values(2)).mul(s.values,axis=0)
out.index = s.index
out
                          GHG     LU
Farm ID Crop
1       Type A Area    5520.0  17.60
               Diesel   858.5  30.30
2       Type A Area    2760.0   8.80
               Diesel  1827.5  64.50
        Type B Area    1380.0   4.40
               Diesel    25.5   0.90
3       Type B Area    2070.0   6.60
               Diesel     5.1   0.18
               Area    3450.0  11.00
               Diesel   357.0  12.60
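For readers who want to see what each step of the stack approach does, here is a self-contained sketch that rebuilds both frames and splits the one-liner into named steps. The variable name `factors` and the step comments are my own additions, not part of the answer. Note that the multiplication uses `s.values` (position-based) because the reindexed frame has repeated `Area`/`Diesel` labels, so the rows are matched by order rather than by label.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "Farm ID": ["1", "2", "2", "3", "3"],
        "Crop": ["Type A", "Type A", "Type B", "Type B", "Type B"],
        "Area": [8, 4, 2, 3, 5],
        "Diesel": [101, 215, 3, 0.6, 42],
    }
).set_index(["Farm ID", "Crop"])

df2 = pd.DataFrame(
    {"Name": ["Area", "Diesel"], "GHG": [690, 8.5], "LU": [2.2, 0.3]}
).set_index("Name")

# Step 1: move the column labels (Area, Diesel) into a third index level.
# The result is a Series with one row per (Farm ID, Crop, input) combination.
s = df1.stack()

# Step 2: for every stacked row, look up the matching factor row in df2.
# `factors` (a name chosen here for illustration) has the same length as `s`.
factors = df2.reindex(s.index.get_level_values(2))

# Step 3: multiply each factor row by the stacked value at the same position,
# then put the three-level index back on the result.
out = factors.mul(s.values, axis=0)
out.index = s.index
```

The third index level is unnamed here, as in the output above; `out.index = s.index.set_names("Name", level=2)` would label it if that is preferred.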
You can stack the dataframe and let pandas broadcast on the common index:
df1.rename_axis('Name', axis=1).stack().mul(df2.T).T
Output:
                          GHG     LU
Farm ID Crop   Name
1       Type A Area    5520.0  17.60
               Diesel   858.5  30.30
2       Type A Area    2760.0   8.80
               Diesel  1827.5  64.50
        Type B Area    1380.0   4.40
               Diesel    25.5   0.90
3       Type B Area    2070.0   6.60
               Diesel     5.1   0.18
               Area    3450.0  11.00
               Diesel   357.0  12.60
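Since the question mentions later aggregation by crop type or name, here is a minimal sketch of how that could look on a frame shaped like the output above. It rebuilds the result with a named third level and then groups by index level. The names `by_crop`, `by_name` and `by_farm_crop` are illustrative, and summing is only an assumption about which aggregation is wanted.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "Farm ID": ["1", "2", "2", "3", "3"],
        "Crop": ["Type A", "Type A", "Type B", "Type B", "Type B"],
        "Area": [8, 4, 2, 3, 5],
        "Diesel": [101, 215, 3, 0.6, 42],
    }
).set_index(["Farm ID", "Crop"])

df2 = pd.DataFrame(
    {"Name": ["Area", "Diesel"], "GHG": [690, 8.5], "LU": [2.2, 0.3]}
).set_index("Name")

# Rebuild the three-level result; naming the column axis first makes the
# stacked level come out as "Name".
s = df1.rename_axis("Name", axis=1).stack()
out = df2.reindex(s.index.get_level_values("Name")).mul(s.values, axis=0)
out.index = s.index

# Totals per crop type, summed over all farms and inputs.
by_crop = out.groupby(level="Crop").sum()

# Totals per input (Area vs. Diesel), summed over all farms and crops.
by_name = out.groupby(level="Name").sum()

# Totals per farm and crop; this also merges the duplicated ("3", "Type B") rows.
by_farm_crop = out.groupby(level=["Farm ID", "Crop"]).sum()
```

Keeping everything in the index means any combination of levels can be passed to `groupby(level=...)` without reshaping the frame again.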