Say I have a MultiIndex DataFrame like so:
                   price  volume
year product city
2010 A       LA       10       7
     B       SF        7       9
     C       NY        7       6
             LA       18      21
             SF        4       8
2011 A       LA       13       5
     B       SF        2       4
     C       NY        9       3
             SF        2       0
I want to do a somewhat complex merge: drop the first level of the index (year), then merge the rows that end up with the same remaining index (product, city) according to some custom logic. In this case I would like the price column to take its values from the 2010 rows and the volume column to take its values from the 2011 rows, but I would like a general solution that can be applied to more columns should they exist.
The final DataFrame would look like this, with price values from the 2010 index, volume values from the 2011 index, and missing values filled with NaN.
              price  volume
product city
A       LA       10       5
B       SF        7       4
C       NY        7       3
        LA       18     NaN
        SF        4       0
You can select each year from the first index level with DataFrame.xs and then combine the selected columns with concat:
df = pd.concat([df.xs(2010)['price'], df.xs(2011)['volume']], axis=1)
It is also possible to use loc:
df = pd.concat([df.loc[2010, 'price'], df.loc[2011, 'volume']], axis=1)
print (df)
              price  volume
product city
A       LA       10     5.0
B       SF        7     4.0
C       LA       18     NaN
        NY        7     3.0
        SF        4     0.0
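The question also asks for something that scales to more columns. One possible way to generalize the concat approach is to drive it from a column-to-year mapping. The sketch below rebuilds the sample data so it runs on its own; the names source_year and out are made up for illustration, and sort_index is added only to make the row order predictable.

```python
import pandas as pd

# Rebuild the sample frame from the question.
index = pd.MultiIndex.from_tuples(
    [
        (2010, "A", "LA"), (2010, "B", "SF"), (2010, "C", "NY"),
        (2010, "C", "LA"), (2010, "C", "SF"),
        (2011, "A", "LA"), (2011, "B", "SF"), (2011, "C", "NY"),
        (2011, "C", "SF"),
    ],
    names=["year", "product", "city"],
)
df = pd.DataFrame(
    {
        "price": [10, 7, 7, 18, 4, 13, 2, 9, 2],
        "volume": [7, 9, 6, 21, 8, 5, 4, 3, 0],
    },
    index=index,
)

# Hypothetical mapping: which year each column should be taken from.
# Extend it with more columns as needed.
source_year = {"price": 2010, "volume": 2011}

# Take each column from its own year (xs drops the year level), then
# align the pieces on (product, city); rows missing in a year become NaN.
out = pd.concat(
    [df.xs(year, level="year")[col] for col, year in source_year.items()],
    axis=1,
).sort_index()

print(out)
```

Adding another column then only means adding one entry to source_year, provided that column exists in df and the chosen year is present in the index.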