All,
I have a DataFrame that looks like this when I select df[['date','PRICE']]:
df>>
date Price
PX_FIRST PX_LAST
2018-03-05 1.710 -0.511
2018-03-06 1.725 -0.513
2018-03-07 1.745 -0.511
2018-03-08 1.750 -0.512
How can I get a DataFrame similar to the one below? In other words, how can I access PX_FIRST and PX_LAST? When I do df[['date','PRICE']], I cannot access the individual columns.
date PX_FIRST PX_LAST
2018-03-05 1.710 -0.511
2018-03-06 1.725 -0.513
2018-03-07 1.745 -0.511
2018-03-08 1.750 -0.512
If you need to select the columns under the first-level value Price:
df = df['Price']
Or use DataFrame.xs:
df = df.xs('Price', axis=1)
print (df)
PX_FIRST PX_LAST
Date
2018-03-05 1.710 -0.511
2018-03-06 1.725 -0.513
2018-03-07 1.745 -0.511
2018-03-08 1.750 -0.512
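For anyone who wants to reproduce this, here is a minimal sketch. The frame is a hypothetical reconstruction of the one in the question (the asker did not post the construction code), and the names by_key and by_xs are only illustrative. It suggests that both selections give the same single-level result:

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame: a Date index and
# two-level columns with 'Price' on top of PX_FIRST / PX_LAST.
dates = pd.DatetimeIndex(
    ["2018-03-05", "2018-03-06", "2018-03-07", "2018-03-08"], name="Date"
)
columns = pd.MultiIndex.from_product([["Price"], ["PX_FIRST", "PX_LAST"]])
df = pd.DataFrame(
    [[1.710, -0.511], [1.725, -0.513], [1.745, -0.511], [1.750, -0.512]],
    index=dates,
    columns=columns,
)

by_key = df["Price"]            # select everything under the first-level label
by_xs = df.xs("Price", axis=1)  # same selection through DataFrame.xs

print(by_key.columns.tolist())  # ['PX_FIRST', 'PX_LAST']
print(by_key.equals(by_xs))
```

Both results have ordinary single-level columns, so by_key['PX_FIRST'] then behaves like any regular column lookup.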
If you need to remove the top level of the MultiIndex:
df.columns = df.columns.droplevel(0)
But be careful if there are several columns with different first-level values (Price, Price1) and the same values in the second level:
#create sample data
df = pd.concat([df['Price'], df['Price'] * 0.4], keys=('Price','Price1'), axis=1)
print (df)
Price Price1
PX_FIRST PX_LAST PX_FIRST PX_LAST
Date
2018-03-05 1.710 -0.511 0.684 -0.2044
2018-03-06 1.725 -0.513 0.690 -0.2052
2018-03-07 1.745 -0.511 0.698 -0.2044
2018-03-08 1.750 -0.512 0.700 -0.2048
Remove first level:
df.columns = df.columns.droplevel(0)
print (df)
PX_FIRST PX_LAST PX_FIRST PX_LAST
Date
2018-03-05 1.710 -0.511 0.684 -0.2044
2018-03-06 1.725 -0.513 0.690 -0.2052
2018-03-07 1.745 -0.511 0.698 -0.2044
2018-03-08 1.750 -0.512 0.700 -0.2048
If you select column PX_FIRST, it returns a DataFrame, because the column names are duplicated:
print (df['PX_FIRST'])
PX_FIRST PX_FIRST
Date
2018-03-05 1.710 0.684
2018-03-06 1.725 0.690
2018-03-07 1.745 0.698
2018-03-08 1.750 0.700
If you need to select by both levels, keep the MultiIndex (do not drop the first level) and use tuples:
print (df[('Price', 'PX_FIRST')])
Date
2018-03-05 1.710
2018-03-06 1.725
2018-03-07 1.745
2018-03-08 1.750
Name: (Price, PX_FIRST), dtype: float64
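If the goal is a single level of column names without the duplicates, one option is to join the two levels into one string instead of dropping a level. The sketch below is only an assumption about what is wanted: the shortened two-row frame, the '_' separator and the name flat are all illustrative choices, not part of the original data.

```python
import pandas as pd

# Shortened, hypothetical version of the sample frame with two first-level groups.
idx = pd.DatetimeIndex(["2018-03-05", "2018-03-06"], name="Date")
cols = pd.MultiIndex.from_product([["Price", "Price1"], ["PX_FIRST", "PX_LAST"]])
df = pd.DataFrame(
    [[1.710, -0.511, 0.684, -0.2044], [1.725, -0.513, 0.690, -0.2052]],
    index=idx,
    columns=cols,
)

flat = df.copy()
# Each MultiIndex column is a tuple such as ('Price', 'PX_FIRST');
# joining the parts keeps every column name unique.
flat.columns = ["_".join(pair) for pair in flat.columns]
# Move the Date index into an ordinary column so it can be selected by name.
flat = flat.reset_index()

print(flat.columns.tolist())
# ['Date', 'Price_PX_FIRST', 'Price_PX_LAST', 'Price1_PX_FIRST', 'Price1_PX_LAST']
```

After this, flat[['Date', 'Price_PX_FIRST', 'Price1_PX_FIRST']] is a plain column selection, because all names sit on one level.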
IIUC, the columns are a MultiIndex, so you can select with pd.IndexSlice:
df.loc[:,pd.IndexSlice['Price']]
Out[1108]:
PX_FIRST PX_LAST
Date
2018-03-05 1.710 -0.511
2018-03-06 1.725 -0.513
2018-03-07 1.745 -0.511
2018-03-08 1.750 -0.512
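The same idea can be pointed at the second level. The following is a rough sketch on a shortened, hypothetical two-group frame (the names kept, dropped and result are illustrative): pd.IndexSlice keeps both column levels, while DataFrame.xs with level=1 drops the selected level and leaves the first-level labels as the column names.

```python
import pandas as pd

# Shortened, hypothetical frame with two first-level groups.
idx = pd.DatetimeIndex(["2018-03-05", "2018-03-06"], name="Date")
cols = pd.MultiIndex.from_product([["Price", "Price1"], ["PX_FIRST", "PX_LAST"]])
df = pd.DataFrame(
    [[1.710, -0.511, 0.684, -0.2044], [1.725, -0.513, 0.690, -0.2052]],
    index=idx,
    columns=cols,
)

# Every PX_FIRST column, with both column levels kept.
kept = df.loc[:, pd.IndexSlice[:, "PX_FIRST"]]

# Every PX_FIRST column, with the second level dropped, so the remaining
# column names are the (distinct) first-level labels.
dropped = df.xs("PX_FIRST", axis=1, level=1)

# Turn the Date index into a regular column next to them.
result = dropped.reset_index()

print(kept.columns.tolist())    # [('Price', 'PX_FIRST'), ('Price1', 'PX_FIRST')]
print(result.columns.tolist())  # ['Date', 'Price', 'Price1']
```

Because the first-level labels are already distinct, result has unique names on a single level without any manual renaming.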
@jezrael You are exactly right: when I drop one level I end up with duplicate column names, and it is hard to distinguish the columns unless I rename them?
The other challenge in your example below
PX_FIRST PX_FIRST
Date
2018-03-05 1.710 0.684
2018-03-06 1.725 0.690
2018-03-07 1.745 0.698
2018-03-08 1.750 0.700
is that "Date", "PX_FIRST" and "PX_FIRST" are on different levels, so when I call df[['Date','PX_FIRST','PX_FIRST']] I get an error "...not in index".
Ideally, I'd be looking to get
Date PX_FIRST PX_LAST
2018-03-05 1.710 0.684
2018-03-06 1.725 0.690
2018-03-07 1.745 0.698
2018-03-08 1.750 0.700
where all column names are on the same level and have different names.
Thanks