I'm trying to select multiple columns from a pandas DataFrame but am having trouble doing so. Suppose I have the following DataFrame:
import pandas as pd
import numpy as np
cols = ['test','one','two','three','four','five','six','seven','eight','nine','ten']
df = pd.DataFrame(np.random.rand(10,11).round(2),columns=cols)
I want to select the columns test, two, four, five, six, seven, eight.
I know that if I want to select individual columns,
df[['test','two']]
and if I want to select consecutive columns,
df.loc[:,'four':'eight']
work just fine, but how do I combine the two concisely?
I realize that for this specific example, writing
df[['test', 'two', 'four', 'five', 'six', 'seven', 'eight']]
works too but I want to know if there is a way to make use of the fact that most of the columns are consecutive here to save some time writing them all.
Use np.r_ as @Pooja suggested, but with get_loc and get_indexer for label-based slicing:
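a = ['test','two']     # individual columns
b = ['four','eight']   # first and last label of the consecutive range
# get_indexer gives the positions of the labels in a; get_loc gives the range bounds (+1 makes the end inclusive)
idx = np.r_[df.columns.get_indexer(a), df.columns.get_loc(b[0]):df.columns.get_loc(b[1])+1]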
print(df.iloc[:,idx])
test two four five six seven eight
0 0.11 0.91 0.13 0.99 0.17 0.56 0.21
1 0.70 0.94 0.72 0.48 0.53 0.99 0.27
2 0.37 0.03 0.81 0.18 0.47 0.94 0.77
3 0.13 0.69 0.16 0.80 0.02 0.42 0.48
4 0.79 0.91 0.97 0.83 0.20 0.32 0.58
5 0.12 0.86 0.44 0.01 0.71 0.65 0.03
6 0.77 0.31 0.21 0.73 0.70 0.95 0.11
7 0.09 0.91 0.45 0.35 0.91 0.21 0.92
8 0.28 0.32 0.73 0.93 0.97 0.03 0.93
9 0.55 0.77 0.02 0.18 0.65 0.50 0.85
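To see why this works, it may help to look at the intermediate values. The sketch below rebuilds the same DataFrame and breaks the one-liner into steps. It assumes the column names are unique, since get_loc only returns a plain integer in that case. The names single_pos, start, stop, labels and alt are introduced here for illustration only. The last two lines show a purely label-based alternative that avoids np.r_ by concatenating lists of labels.

```python
import numpy as np
import pandas as pd

cols = ['test', 'one', 'two', 'three', 'four', 'five',
        'six', 'seven', 'eight', 'nine', 'ten']
df = pd.DataFrame(np.random.rand(10, 11).round(2), columns=cols)

# get_indexer maps a list of labels to their integer positions.
single_pos = df.columns.get_indexer(['test', 'two'])   # positions 0 and 2

# get_loc maps one label to its integer position (assumes unique column names).
start = df.columns.get_loc('four')    # position 4
stop = df.columns.get_loc('eight')    # position 8

# np.r_ concatenates the array and the slice; +1 makes the end inclusive.
idx = np.r_[single_pos, start:stop + 1]
result = df.iloc[:, idx]

# Alternative that stays label-based: build the full label list, then index.
labels = ['test', 'two'] + df.loc[:, 'four':'eight'].columns.tolist()
alt = df[labels]
```

Both result and alt should contain the same seven columns in the same order. The np.r_ version is shorter when there are several ranges to combine; the list version may be easier to read when there is only one.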