
Pivot large table in pandas

I have a table like this:

df2 = pd.DataFrame(np.array([[1, 0, 3,0.2], [1, 100, 6,0.3], [7, 0, 9,0.7],[7, 100, 20,0.8]]),
                   columns=['id', 'time', 'val1','val2'])

        id  time    val1    val2
0      1.0  0.0     3.0     0.2
1      1.0  100.0   6.0     0.3
2      7.0  0.0     9.0     0.7
3      7.0  100.0   20.0    0.8

I want to pivot it (long table to wide table) and get something like this:

        id  val1time0   val2time0   val1time100 val2time100
0       1.0 3.0         0.2         6.0         0.3
2       7.0 9.0         0.7         20.0        0.8

My attempt was doing something like this:

df2=df2.merge(df2[df2["time"]==0][["id","val1","val2"]],left_on='id', right_on='id',suffixes=("", "time0"))
df2=df2.merge(df2[df2["time"]==100][["id","val1","val2"]],left_on='id', right_on='id',suffixes=("", "time100"))
df3=df2[["id","val1time0","val2time0","val1time100","val2time100"]]
df3.drop_duplicates()

This works, but when I have a large table (more than 80000 values in the first table), my kernel dies. Can you help me?

Thank you

Hope this will help:


import pandas as pd
import numpy as np

df2 = pd.DataFrame(np.array([[1, 0, 3,0.2], [1, 100, 6,0.3], [7, 0, 9,0.7],[7, 100, 20,0.8]]),
                   columns=['id', 'time', 'val1','val2'])

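# Turn the time values into labels such as "time0.0" and "time100.0"
df2["time"] = "time"+df2["time"].astype(str)

# Pivot with id as the index: one column per (value, time) pair
df3 = (df2.set_index(["id"])
                          .pivot(columns='time')[['val1', 'val2']]
                          .reset_index())

# to_flat_index() yields tuples like ('val1', 'time0.0'); join each into one name
df3.columns = ["".join(col) for col in df3.columns.to_flat_index()]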

Output:

    id  val1time0.0  val1time100.0  val2time0.0  val2time100.0
0  1.0          3.0            6.0          0.2            0.3
1  7.0          9.0           20.0          0.7            0.8
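
If you want the exact column names and order from the question (val1time0, val2time0, val1time100, val2time100), here is a minimal sketch of a variant. It assumes every time value is a whole number, so it can be cast to int, and that each (id, time) pair appears only once. The name wide is just illustrative.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame(
    np.array([[1, 0, 3, 0.2], [1, 100, 6, 0.3], [7, 0, 9, 0.7], [7, 100, 20, 0.8]]),
    columns=["id", "time", "val1", "val2"],
)

# Assumption: time holds whole numbers, so the labels become 0 / 100, not 0.0 / 100.0
df2["time"] = df2["time"].astype(int)

# One row per id; columns are a MultiIndex of (value name, time)
wide = df2.pivot(index="id", columns="time", values=["val1", "val2"])

# Put time on the outer level and sort, so columns are grouped by time (numerically)
wide = wide.swaplevel(axis=1).sort_index(axis=1)

# Flatten each (time, value name) pair into a single name such as "val1time0"
wide.columns = [f"{val}time{t}" for t, val in wide.columns]
wide = wide.reset_index()

print(wide)
```

Note that pivot raises a ValueError when the same (id, time) pair occurs more than once. In that case pivot_table with an aggfunc of your choice is the usual alternative, but you then have to decide how duplicates should be combined.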
