Inner join with group by in pandas (Python)
I have two dataframes named geostat and res, shown below:
geostat:
   count percent  grpno.  state code
0          14.78       1          CA
1           0.00       2          CA
2           8.80       3          CA
3           9.60       4          FL
4          55.90       4          MA
5           0.00       2          FL
6           0.00       6          NC
7           0.00       5          NC
8           6.90       1          FL
9          59.00       4          MA
res:
   grpno.  MaxOfcount percent
0       1               14.78
1       2                0.00
2       3                8.80
3       4               59.00
4       5                0.00
5       6                0.00
I want to select first(res.MaxOfcount percent), res.grpno., and first(geostat.state code) from geostat inner joined with res on res.MaxOfcount percent = geostat.count percent AND res.grpno. = geostat.grpno., grouped by res.grpno.
I want to do this in Python with pandas, but I am not sure how to do an inner join combined with a group by. Can anyone help me with this?
The output dataframe is given below:
   FirstOfMaxOfState count percent  state pool number  FirstOfstate code
0                            14.78                  1                 CA
1                             0.00                  2                 CA
2                             8.80                  3                 CA
3                            59.00                  4                 MA
4                             0.00                  5                 NC
5                             0.00                  6                 NC
NOTE: FIRST(column name) is an Access function. What is its equivalent in Python?
EDITED: Changed the output dataframe.
Use pandas.DataFrame.merge()
geostat.merge(res, left_on=['count percent', 'grpno.'], right_on=['MaxOfcount percent', 'grpno.'], how='inner')
   count percent  grpno.  state code  MaxOfcount percent
0          14.78       1          CA               14.78
1           0.00       2          CA                0.00
2           0.00       2          FL                0.00
3           8.80       3          CA                8.80
4           0.00       6          NC                0.00
5           0.00       5          NC                0.00
6          59.00       4          MA               59.00
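The merge alone returns seven rows, because group 2 matches two states (CA and FL), whereas the desired output has one row per grpno. One possible way to close that gap is to follow the merge with a groupby and GroupBy.first(), which is roughly the pandas counterpart of Access's FIRST(). The following is a minimal sketch, assuming the sample data from the question is rebuilt by hand; the variable names merged and out are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question.
geostat = pd.DataFrame({
    "count percent": [14.78, 0.00, 8.80, 9.60, 55.90, 0.00, 0.00, 0.00, 6.90, 59.00],
    "grpno.": [1, 2, 3, 4, 4, 2, 6, 5, 1, 4],
    "state code": ["CA", "CA", "CA", "FL", "MA", "FL", "NC", "NC", "FL", "MA"],
})
res = pd.DataFrame({
    "grpno.": [1, 2, 3, 4, 5, 6],
    "MaxOfcount percent": [14.78, 0.00, 8.80, 59.00, 0.00, 0.00],
})

# Inner join on both conditions: percent value and group number.
merged = geostat.merge(
    res,
    left_on=["count percent", "grpno."],
    right_on=["MaxOfcount percent", "grpno."],
    how="inner",
)

# One row per group: take the first non-null value of each selected column.
out = (
    merged.groupby("grpno.", as_index=False)[["MaxOfcount percent", "state code"]]
    .first()
)
print(out)
```

Which state is kept for group 2 depends on the row order of the merged frame, so sort it before grouping if a specific state is required. Note also that GroupBy.first() skips null values within each column, which does not matter for this sample data.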