I am learning Pandas and trying to understand slicing. Everything makes sense except when I try to slice using column names. My DataFrame looks like this:
              area       pop
California  423967  38332521
Florida     170312  19552860
Illinois    149995  12882135
New York    141297  19651127
Texas       695662  26448193
When I run data['area':'pop'], I expected both columns to show, since I am using an explicit index and both the start and the end of the slice should be inclusive, but the result is an empty DataFrame.
I also get an empty DataFrame for data['area':]. Why is this different from slicing with explicit indexes elsewhere?
According to the documentation:
"With DataFrame, slicing inside of [] slices the rows. This is provided largely as a convenience since it is such a common operation."
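To make the quoted rule concrete, here is a minimal sketch that rebuilds the frame from the question by hand (the construction is my assumption; the original may have been loaded from a file) and slices it with [] using row labels and then column names.

```python
import pandas as pd

# Rebuild the frame from the question; state names form the row index.
data = pd.DataFrame(
    {
        "area": [423967, 170312, 149995, 141297, 695662],
        "pop": [38332521, 19552860, 12882135, 19651127, 26448193],
    },
    index=["California", "Florida", "Illinois", "New York", "Texas"],
)

# A slice inside [] is matched against the row index, both ends inclusive.
rows = data["California":"Illinois"]

# Column names are not row labels, so this slice selects no rows at all.
empty = data["area":"pop"]

print(rows)
print(empty)
```

The second result still lists both columns in its header; it simply contains no rows.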
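You get an empty DataFrame because the slice is applied to the row index, which holds the state names, and neither 'area' nor 'pop' is found there. Because that index is sorted, pandas positions the missing labels by sort order instead of raising an error, and both fall after 'Texas', so the slice selects nothing. Here is what you get with a numeric index instead: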
>> data.reset_index()['area':'pop']
TypeError: cannot do slice indexing on <class 'pandas.core.indexes.range.RangeIndex'> with these indexers [area] of <class 'str'>
What you want instead is
>> data.loc[:, 'area':'pop']
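As a rough sketch of the .loc approach, assuming the same hand-built frame as in the question: the first axis of .loc selects rows, the second selects columns, and label slices on either axis include both endpoints. The variable names here are only for illustration.

```python
import pandas as pd

# Hand-built copy of the frame from the question.
data = pd.DataFrame(
    {
        "area": [423967, 170312, 149995, 141297, 695662],
        "pop": [38332521, 19552860, 12882135, 19651127, 26448193],
    },
    index=["California", "Florida", "Illinois", "New York", "Texas"],
)

# All rows, columns from 'area' through 'pop' inclusive.
both = data.loc[:, "area":"pop"]

# An open-ended column slice, the working counterpart of data['area':].
from_area = data.loc[:, "area":]

# Position-based equivalent; here the end of the slice is exclusive.
by_position = data.iloc[:, 0:2]

print(both)
```

All three expressions return the same two-column frame for this data.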
If you want to get both columns, use:
import pandas as pd
# data = pd.read_csv('data.csv', header=0, index_col=0)  # hypothetical file; first column holds the state names
both = data[['area', 'pop']]
Passing a list of column names to [] selects those columns, in the order listed.
Similarly, to get only the area column as a one-column DataFrame, use (data['area'] would return a Series instead):
area = data[['area']]
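A small sketch of how list selection behaves, again on a hand-built copy of the frame (an assumption on my part): the list controls the column order, and a one-element list gives a DataFrame while a bare column name gives a Series.

```python
import pandas as pd

# Hand-built copy of the frame from the question.
data = pd.DataFrame(
    {
        "area": [423967, 170312, 149995, 141297, 695662],
        "pop": [38332521, 19552860, 12882135, 19651127, 26448193],
    },
    index=["California", "Florida", "Illinois", "New York", "Texas"],
)

# The columns come back in the order given in the list.
swapped = data[["pop", "area"]]

# A one-element list keeps the result two-dimensional (a DataFrame).
area_frame = data[["area"]]

# A bare column name returns a one-dimensional Series.
area_series = data["area"]

print(type(area_frame).__name__, type(area_series).__name__)
```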
Now, if you want the values of the columns, use:
both = data[['area', 'pop']].values
area = data[['area']].values
Here both and area are NumPy arrays.
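For what it is worth, the shape of the resulting array depends on how the column was selected. The sketch below uses .to_numpy(), which gives the same arrays as .values here; the hand-built frame and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hand-built copy of the frame from the question.
data = pd.DataFrame(
    {
        "area": [423967, 170312, 149995, 141297, 695662],
        "pop": [38332521, 19552860, 12882135, 19651127, 26448193],
    },
    index=["California", "Florida", "Illinois", "New York", "Texas"],
)

# Two columns selected with a list give a 2-D array, one row per state.
both_values = data[["area", "pop"]].to_numpy()

# A one-element list still gives a 2-D array, with a single column.
area_2d = data[["area"]].to_numpy()

# A bare column name gives a flat 1-D array.
area_1d = data["area"].to_numpy()

print(both_values.shape, area_2d.shape, area_1d.shape)
```

If a flat array is what you need, data['area'] is the simpler route than reshaping the result of data[['area']].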