Pandas groupby, return rows of 1 column based on maximum values of other columns

Question

In my data, I need to group by columns X, Y, Z and fill in a result_code column. The value comes from the code column: take the code of the row with the maximum area in each group and, if several rows tie on area, use the new_area column to decide.

For the first group, code C has the maximum area, so every row in that group should get C. For the second group, the maximum area is tied, so the new_area column is checked and the result should be code B.

I need these results in a separate column, alongside the existing columns.

The table in the pic will help clarify.

Answer 1

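This is a simple case of sorting and then taking the first row of each group: sort by area and new_area in descending order so the row you want comes first within its group, then broadcast its code to the whole group with transform("first").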

import io
import pandas as pd

df = pd.read_csv(io.StringIO("""X,Y,Z,code,area,new_area,result_code
222 North St,Seattle,WA,A,200,600,C
222 North St,Seattle,WA,B,300,700,C
222 North St,Seattle,WA,C,400,750,C
222 North St,Seattle,WA,D,300,600,C
115 John St,Chicago,IL,A,200,250,B
115 John St,Chicago,IL,B,200,300,B
115 John St,Chicago,IL,C,50,100,B"""))

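# Sort so that, within each (X, Y, Z) group, the row with the largest area
# (ties broken by the largest new_area) comes first.
df = (df.sort_values(["X","Y","Z","area","new_area"], ascending=[True,True,True,False,False])
      # Broadcast that first row's code to every row of its group.
      .assign(result_code=lambda dfa: dfa.groupby(["X","Y","Z"])["code"].transform("first"))
      # Restore the original row order.
      .sort_index()
     )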

output

	X	Y	Z	code	area	new_area	result_code
0	222 North St	Seattle	WA	A	200	600	C
1	222 North St	Seattle	WA	B	300	700	C
2	222 North St	Seattle	WA	C	400	750	C
3	222 North St	Seattle	WA	D	300	600	C
4	115 John St	Chicago	IL	A	200	250	B
5	115 John St	Chicago	IL	B	200	300	B
6	115 John St	Chicago	IL	C	50	100	B
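
As a cross-check, the same rule can be sketched another way: pick one winning row per group with drop_duplicates, then merge its code back onto every row. This is only an alternative sketch, not part of the answer above; the names keys, winners and result are made up for illustration, and the sample data is the question's table without the pre-filled result_code column.

```python
import io
import pandas as pd

# Same sample data as above, minus the expected result_code column.
df = pd.read_csv(io.StringIO("""X,Y,Z,code,area,new_area
222 North St,Seattle,WA,A,200,600
222 North St,Seattle,WA,B,300,700
222 North St,Seattle,WA,C,400,750
222 North St,Seattle,WA,D,300,600
115 John St,Chicago,IL,A,200,250
115 John St,Chicago,IL,B,200,300
115 John St,Chicago,IL,C,50,100"""))

keys = ["X", "Y", "Z"]  # grouping columns (illustrative variable name)

# One "winner" row per group: largest area, ties broken by largest new_area.
winners = (
    df.sort_values(["area", "new_area"], ascending=False)
      .drop_duplicates(subset=keys, keep="first")
      .rename(columns={"code": "result_code"})[keys + ["result_code"]]
)

# A left merge attaches the winning code to every row of its group
# and keeps the rows in their original order.
result = df.merge(winners, on=keys, how="left")
print(result)
```

Both approaches should agree on this data. If two rows in a group tie on both area and new_area, which of them wins depends on the sort order, so add another sort key if that case matters.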
