I have the following data frame with multiple headers:
Datetime Value
id a b c d e
0 2017-01-01 00:00:00 0.774016 1.588788 270.06055 268.9109 93060.31
1 2017-01-01 00:10:00 0.774016 1.588788 270.06055 268.9109 93060.31
2 2017-01-01 00:20:00 0.774016 1.588788 270.06055 268.9109 93060.31
3 2017-01-01 00:30:00 0.774016 1.588788 270.06055 268.9109 93060.31
4 2017-01-01 00:40:00 0.774016 1.588788 270.06055 268.9109 93060.31
When I flatten the multi-level header into a single level, the column names end up swapped at some point, and I don't know how to fix it.
cols = ["a","b","c","d","e"]
df.columns = [col[1] if col[0] == '' else col[0] for col in df.columns]
cols.insert(0,"Datetime")
df.columns = cols
This gives me swapped column names:
Datetime a b d e c
0 2017-01-01 00:00:00 0.774016 1.588788 270.06055 268.9109 93060.31
1 2017-01-01 00:10:00 0.774016 1.588788 270.06055 268.9109 93060.31
2 2017-01-01 00:20:00 0.774016 1.588788 270.06055 268.9109 93060.31
3 2017-01-01 00:30:00 0.774016 1.588788 270.06055 268.9109 93060.31
4 2017-01-01 00:40:00 0.774016 1.588788 270.06055 268.9109 93060.31
How can I fix it?
Update: the frame as a dictionary (as printed by something like df.to_dict()):
{('Datetime', ''): {0: Timestamp('2017-01-01 00:00:00'),
1: Timestamp('2017-01-01 00:10:00'),
2: Timestamp('2017-01-01 00:20:00'),
3: Timestamp('2017-01-01 00:30:00'),
4: Timestamp('2017-01-01 00:40:00')},
('Value', 'a'): {0: 0.774016,
1: 0.774016,
2: 0.774016,
3: 0.774016,
4: 0.774016},
('Value', 'b'): {0: 1.588788,
1: 1.588788,
2: 1.588788,
3: 1.588788,
4: 1.588788},
('Value', 'c'): {0: 270.06055,
1: 270.06055,
2: 270.06055,
3: 270.06055,
4: 270.06055},
('Value', 'd'): {0: 268.9109,
1: 268.9109,
2: 268.9109,
3: 268.9109,
4: 268.9109},
('Value', 'e'): {0: 93060.31,
1: 93060.31,
2: 93060.31,
3: 93060.31,
4: 93060.31}}
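For anyone who wants to reproduce this, the dictionary above can be turned back into a frame roughly as follows. This is only a minimal sketch: the index name "id" is an assumption read off the printed output, and the timestamps are generated with pd.date_range instead of being typed out.

```python
import pandas as pd

# Five timestamps, 10 minutes apart, matching the Datetime values above.
times = list(pd.date_range("2017-01-01", periods=5, freq="10min"))

# Tuple keys become a two-level MultiIndex on the columns.
df = pd.DataFrame({
    ("Datetime", ""): times,
    ("Value", "a"): [0.774016] * 5,
    ("Value", "b"): [1.588788] * 5,
    ("Value", "c"): [270.06055] * 5,
    ("Value", "d"): [268.9109] * 5,
    ("Value", "e"): [93060.31] * 5,
})
df.index.name = "id"  # assumption: "id" in the printout is the index name

# Each column label is a (top level, second level) tuple.
print(df.columns.tolist())
```

Printing the labels as tuples makes it easy to see which second-level name belongs to which column before flattening anything.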
Brute-force approach:
>>> pd.concat([df[['Datetime']].droplevel(1, axis=1), df["Value"]], axis=1)
Datetime a b c d e
id
0 2017-01-01 00:00:00 0.774016 1.588788 270.06055 268.9109 93060.31
1 2017-01-01 00:10:00 0.774016 1.588788 270.06055 268.9109 93060.31
2 2017-01-01 00:20:00 0.774016 1.588788 270.06055 268.9109 93060.31
3 2017-01-01 00:30:00 0.774016 1.588788 270.06055 268.9109 93060.31
4 2017-01-01 00:40:00 0.774016 1.588788 270.06055 268.9109 93060.31
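To see why the concat one-liner keeps the names aligned, it may help to look at its two pieces separately. The sketch below uses a cut-down copy of the frame (only columns a, b and c); the variable names left, right and flat are just illustrative.

```python
import pandas as pd

times = list(pd.date_range("2017-01-01", periods=5, freq="10min"))
df = pd.DataFrame({
    ("Datetime", ""): times,
    ("Value", "a"): [0.774016] * 5,
    ("Value", "b"): [1.588788] * 5,
    ("Value", "c"): [270.06055] * 5,
})

# Double brackets keep a DataFrame; dropping level 1 removes the empty label.
left = df[["Datetime"]].droplevel(1, axis=1)

# Selecting the top-level key returns the sub-frame with single-level columns.
right = df["Value"]

# Glue the two pieces side by side; every column keeps its own label.
flat = pd.concat([left, right], axis=1)
print(flat.columns.tolist())
```

Because the names come from the frame itself rather than from a separately typed list, they cannot drift out of step with the data.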
Try with set_index + droplevel + reset_index:
df.set_index('Datetime', append=True).droplevel(0, 1).reset_index('Datetime')
Datetime a b c d e
id
0 2017-01-01 00:00:00 0.774016 1.588788 270.06055 268.9109 93060.31
1 2017-01-01 00:10:00 0.774016 1.588788 270.06055 268.9109 93060.31
2 2017-01-01 00:20:00 0.774016 1.588788 270.06055 268.9109 93060.31
3 2017-01-01 00:30:00 0.774016 1.588788 270.06055 268.9109 93060.31
4 2017-01-01 00:40:00 0.774016 1.588788 270.06055 268.9109 93060.31
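The chain above can be read one step at a time. This is a sketch on a cut-down copy of the frame (columns a and b only), with the index name "id" assumed from the printed output; step1, step2 and flat are illustrative names.

```python
import pandas as pd

times = list(pd.date_range("2017-01-01", periods=5, freq="10min"))
df = pd.DataFrame({
    ("Datetime", ""): times,
    ("Value", "a"): [0.774016] * 5,
    ("Value", "b"): [1.588788] * 5,
})
df.index.name = "id"  # assumption: "id" is the index name

# 1. Move Datetime into the row index next to "id"; only Value columns remain.
step1 = df.set_index("Datetime", append=True)

# 2. Drop the top column level ("Value"), leaving the labels a and b.
step2 = step1.droplevel(0, axis=1)

# 3. Bring Datetime back out of the index as an ordinary column.
flat = step2.reset_index("Datetime")
print(flat.columns.tolist())
```

Parking Datetime in the index keeps it out of the way while the column level is dropped, so its empty second-level label never has to be handled.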
Also, to fix your implementation, don't insert into a hard-coded list, which can misalign the names with the DataFrame's columns. Just derive the names from the existing columns:
df.columns = [col[1] if col[1] else col[0] for col in df.columns]
Datetime a b c d e
id
0 2017-01-01 00:00:00 0.774016 1.588788 270.06055 268.9109 93060.31
1 2017-01-01 00:10:00 0.774016 1.588788 270.06055 268.9109 93060.31
2 2017-01-01 00:20:00 0.774016 1.588788 270.06055 268.9109 93060.31
3 2017-01-01 00:30:00 0.774016 1.588788 270.06055 268.9109 93060.31
4 2017-01-01 00:40:00 0.774016 1.588788 270.06055 268.9109 93060.31
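The difference between the two ways of renaming shows up as soon as the real column order is not the one you typed. The frame below is hypothetical (its second-level labels are deliberately in the order a, d, c) and is only meant to illustrate that assigning a list to df.columns works by position, not by name.

```python
import pandas as pd

# Hypothetical frame whose second-level labels are NOT in alphabetical order.
df = pd.DataFrame({
    ("Datetime", ""): list(pd.date_range("2017-01-01", periods=2, freq="10min")),
    ("Value", "a"): [0.774016] * 2,
    ("Value", "d"): [268.9109] * 2,
    ("Value", "c"): [270.06055] * 2,
})

# Hard-coded list: labels are applied by position, so "c" lands on the d data.
hardcoded = df.copy()
hardcoded.columns = ["Datetime", "a", "c", "d"]

# Derived from the existing labels: each column keeps its own name.
derived = df.copy()
derived.columns = [col[1] if col[1] else col[0] for col in df.columns]

print(hardcoded["c"].iloc[0], derived["c"].iloc[0])
```

If your real frame has its Value columns in a different order than ["a","b","c","d","e"], this is one plausible way the swapped names could arise.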