
Pandas: how to transpose this wide format data into long format

I am looking for an efficient way to convert data from a "wide" structure into a long-format DataFrame.

The data: we have several observations, and each observation has feature values for several days. The ideal DataFrame has the following columns: observation_id, day, feature_1, feature_2, ....

The current data structure: a list of dictionaries, one per observation. Each dictionary has the key "observation_id" plus one key per day and feature, formed by joining the day and the feature name with a dot (for example "0.feature_1").

For example:

# Input:
x = [{'observation_id': '1', '0.feature_1': 2, '0.feature_2': 2, '1.feature_1': 3, '1.feature_2': 1},
     {'observation_id': '2', '0.feature_1': 7, '0.feature_2': 3, '1.feature_1': 4, '1.feature_2': 2},
     {'observation_id': '3', '0.feature_1': 5, '0.feature_2': 2, '1.feature_1': 5, '1.feature_2': 3}]

# Desired output:
  observation_id  day  feature_1  feature_2
0              1    0          2          2
1              1    1          3          1
2              2    0          7          3
3              2    1          4          2
4              3    0          5          2
5              3    1          5          3

I tried the following, but it doesn't give the desired result:

import pandas as pd

df = pd.DataFrame(x)
pd.wide_to_long(df, stubnames=["0", "1"], i="observation_id", j="feature", sep=".", suffix=r'\w+').reset_index()

# output:
  observation_id    feature  0  1
0              1  feature_1  2  3
1              2  feature_1  7  4
2              3  feature_1  5  5
3              1  feature_2  2  1
4              2  feature_2  3  2
5              3  feature_2  2  3

Any idea how to do this?

Thanks!

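Try adding stack and unstack after wide_to_long:

df = pd.DataFrame(x)
# stack() moves the day columns ("0", "1") into the index; unstack(1) then pivots the "feature" level into columns
yourdf = pd.wide_to_long(df, stubnames=["0", "1"], i="observation_id", j="feature", sep=".", suffix=r'\w+').\
              stack().unstack(1).reset_index()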
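
As far as I can tell, the stack/unstack chain leaves the day column with a default name (level_1) and with string values ("0", "1"). If the exact column names and an integer day matter, here is an alternative sketch using melt, str.split and pivot. It is not the approach from the answer above, and the intermediate names (long_df, key, value, result) are my own. It assumes every key other than observation_id has the form "<day>.<feature>" and a pandas version whose pivot accepts a list for index (1.1 or later).

```python
import pandas as pd

x = [{'observation_id': '1', '0.feature_1': 2, '0.feature_2': 2, '1.feature_1': 3, '1.feature_2': 1},
     {'observation_id': '2', '0.feature_1': 7, '0.feature_2': 3, '1.feature_1': 4, '1.feature_2': 2},
     {'observation_id': '3', '0.feature_1': 5, '0.feature_2': 2, '1.feature_1': 5, '1.feature_2': 3}]

df = pd.DataFrame(x)

# One row per (observation_id, original column name)
long_df = df.melt(id_vars="observation_id", var_name="key", value_name="value")

# Split "0.feature_1" into day "0" and feature "feature_1" at the first dot
long_df[["day", "feature"]] = long_df["key"].str.split(".", n=1, expand=True)
long_df["day"] = long_df["day"].astype(int)

# Spread the features back out into columns, one row per (observation_id, day)
result = (
    long_df.pivot(index=["observation_id", "day"], columns="feature", values="value")
    .reset_index()
    .rename_axis(columns=None)  # drop the leftover "feature" label on the columns
)

print(result)
```

This should give the columns observation_id, day, feature_1, feature_2 with one row per observation and day. It does not need the list of days up front, whereas wide_to_long needs them passed as stubnames.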
