简体   繁体   中英

How to de-normalize data with pandas dataframe

I have a pandas dataframe created out of CSV file. The dataframe looks like this

srvr_name log_type       hour  
server1   impressionWin  18:00:00 
server1   transactionWin 18:00:00 
server2   impressionWin  18:00:00 
server2   transactionWin 18:00:00 

What I would like to get from this is:

srvr_name impressionWin transactionWin hour
server1   true          true           18:00:00
server2   true          true           18:00:00 

What is the best way to achieve this in pandas?

Using join with get_dummies

df.join(pd.get_dummies(df.log_type)).groupby(['srvr_name', 'hour']).sum().astype(bool)

                    impressionWin  transactionWin
srvr_name hour
server1   18:00:00           True            True
server2   18:00:00           True            True

You can use this:

df = pd.crosstab([df.srvr_name, df.hour], df.log_type).astype(bool).rename_axis(None, 1).reset_index()

Output:

  srvr_name      hour  impressionWin  transactionWin
0   server1  18:00:00           True            True
1   server2  18:00:00           True            True

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM