I am struggling to transform rows into columns in pandas. Please see the input data below:
id match bookmaker home away
1 T1-T2 Bet365 1.5 2.4
1 T1-T2 Bwin 1.6 2.2
1 T1-T2 Betfair 1.7 2.3
2 T1-T3 Bet365 1.2 2.9
2 T1-T3 Bwin 1.2 2.8
2 T1-T3 Betfair 1.1 3.0
I need to transform it into a new table:
id match Bet365_home Bet365_away Bwin_home Bwin_away Betfair_home Betfair_away
1 T1-T2 1.5 2.4 1.6 2.2 1.7 2.3
2 T1-T3 1.2 2.9 1.2 2.8 1.1 3.0
If you can also suggest how to do this in PostgreSQL, that would be great!
I don't know the SQL method, but in pandas you want to pivot:
In [233]: df.pivot(index='id', columns='bookmaker')
Out[233]:
            match                    home                    away
bookmaker  Bet365 Betfair    Bwin  Bet365 Betfair    Bwin  Bet365 Betfair    Bwin
id
1           T1-T2   T1-T2   T1-T2     1.5     1.7     1.6     2.4     2.3     2.2
2           T1-T3   T1-T3   T1-T3     1.2     1.1     1.2     2.9     3.0     2.8
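To try the pivot call yourself, the sample data from the question can be loaded into a DataFrame first. This is only a minimal sketch: it assumes the table is pasted into a string (the names `raw` and `wide` are mine, not from the question) rather than read from a file.

```python
import io

import pandas as pd

# Sample data copied from the question (whitespace-separated).
raw = """id match bookmaker home away
1 T1-T2 Bet365 1.5 2.4
1 T1-T2 Bwin 1.6 2.2
1 T1-T2 Betfair 1.7 2.3
2 T1-T3 Bet365 1.2 2.9
2 T1-T3 Bwin 1.2 2.8
2 T1-T3 Betfair 1.1 3.0
"""

df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Same call as above: one row per id, one column per (value, bookmaker) pair.
# Because no `values` argument is given, match is pivoted as well,
# which is why it is repeated once per bookmaker.
wide = df.pivot(index="id", columns="bookmaker")
print(wide)
```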
To group by both the id and the match, you could use set_index. If you also add bookmaker to the index and then unstack it:
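import pandas as pd

# 'data' is a file holding the whitespace-separated table from the question
df = pd.read_table('data', sep=r'\s+')
# Move bookmaker from the row index into the column index
df = df.set_index(['id', 'match', 'bookmaker']).unstack('bookmaker')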
you will get
             home                    away
bookmaker  Bet365    Bwin Betfair  Bet365    Bwin Betfair
id match
1  T1-T2      1.5     1.6     1.7     2.4     2.2     2.3
2  T1-T3      1.2     1.2     1.1     2.9     2.8     3.0
The hierarchical (MultiIndex) column
    home                    away
  Bet365    Bwin Betfair  Bet365    Bwin Betfair
has more structure than the flat single-level column index:
Bet365_home Bet365_away Bwin_home Bwin_away Betfair_home Betfair_away
It makes selection or grouping by home or away easier than if the column index were flat. In general, I think it is a better format for the DataFrame.
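As a rough illustration of why the hierarchical columns are convenient, here is a sketch of a few selections on the unstacked frame. It rebuilds the frame from the sample data so it runs on its own; the variable names (`raw`, `wide`, `home`, `bwin`, `best_home`) are mine and purely illustrative.

```python
import io

import pandas as pd

raw = """id match bookmaker home away
1 T1-T2 Bet365 1.5 2.4
1 T1-T2 Bwin 1.6 2.2
1 T1-T2 Betfair 1.7 2.3
2 T1-T3 Bet365 1.2 2.9
2 T1-T3 Bwin 1.2 2.8
2 T1-T3 Betfair 1.1 3.0
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
wide = df.set_index(["id", "match", "bookmaker"]).unstack("bookmaker")

# All home odds: select on the outer column level.
home = wide["home"]

# Both odds for a single bookmaker: cross-section on the inner column level.
bwin = wide.xs("Bwin", axis=1, level="bookmaker")

# Highest home price per match across all bookmakers.
best_home = wide["home"].max(axis=1)

print(home)
print(bwin)
print(best_home)
```

With a flat index such as Bet365_home, each of these would need string matching on the column names instead.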
However, if you'd like to have a flat column index:
df = df.swaplevel(0, 1, axis=1)
df = df.reindex(columns='Bet365 Bwin Betfair'.split(), level=0)
df.columns = ['{}_{}'.format(bet, hw) for bet, hw in df.columns]
pd.options.display.width = 100
print(df)
yields
          Bet365_home  Bet365_away  Bwin_home  Bwin_away  Betfair_home  Betfair_away
id match
1  T1-T2          1.5          2.4        1.6        2.2           1.7           2.3
2  T1-T3          1.2          2.9        1.2        2.8           1.1           3.0
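In the result above, id and match are still in the row index, whereas the layout requested in the question has them as ordinary columns. One possible way to get there is reset_index. The sketch below is a variation on the approach above rather than the same code: it flattens the column names first and then picks the columns in the requested order by name. The names `wanted` and `flat` are my own.

```python
import io

import pandas as pd

raw = """id match bookmaker home away
1 T1-T2 Bet365 1.5 2.4
1 T1-T2 Bwin 1.6 2.2
1 T1-T2 Betfair 1.7 2.3
2 T1-T3 Bet365 1.2 2.9
2 T1-T3 Bwin 1.2 2.8
2 T1-T3 Betfair 1.1 3.0
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
wide = df.set_index(["id", "match", "bookmaker"]).unstack("bookmaker")

# Flatten each (side, bookmaker) pair into a "bookmaker_side" name.
wide.columns = [f"{bookmaker}_{side}" for side, bookmaker in wide.columns]

# Column order requested in the question.
wanted = [
    f"{bookmaker}_{side}"
    for bookmaker in ("Bet365", "Bwin", "Betfair")
    for side in ("home", "away")
]

# Reorder, then turn the id/match index levels back into regular columns.
flat = wide[wanted].reset_index()
print(flat)
```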