Selecting non-consecutive and consecutive columns from a pandas dataframe

Question

I'm trying to select multiple columns from a pandas DataFrame but am having trouble doing so. Suppose I have the following DataFrame:

import pandas as pd
import numpy as np

cols = ['test', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']
df = pd.DataFrame(np.random.rand(10, 11).round(2), columns=cols)

I want to select the columns test, two, four, five, six, seven, and eight.

I know that selecting individual columns with

df[['test', 'two']]

and selecting consecutive columns with

df.loc[:, 'four':'eight']

both work just fine, but how do I combine the two concisely?

I realize that for this specific example, writing

df[['test', 'two', 'four', 'five', 'six', 'seven', 'eight']]

works too, but I want to know whether there is a way to exploit the fact that most of the columns are consecutive, so that I don't have to write them all out.

Answer 1

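Use np.r_, as @Pooja suggested, but build the positions with get_indexer and get_loc so that the selection stays label-based. np.r_ concatenates the positions of the individual columns with the positional slice for the consecutive range:

a = ['test', 'two']      # individual columns
b = ['four', 'eight']    # first and last label of the consecutive range
# get_indexer maps the labels in a to integer positions; get_loc gives the slice bounds.
# The + 1 is needed because a positional slice excludes its stop, unlike a .loc label slice.
idx = np.r_[df.columns.get_indexer(a), df.columns.get_loc(b[0]):df.columns.get_loc(b[1]) + 1]
print(df.iloc[:, idx])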

   test   two  four  five   six  seven  eight
0  0.11  0.91  0.13  0.99  0.17   0.56   0.21
1  0.70  0.94  0.72  0.48  0.53   0.99   0.27
2  0.37  0.03  0.81  0.18  0.47   0.94   0.77
3  0.13  0.69  0.16  0.80  0.02   0.42   0.48
4  0.79  0.91  0.97  0.83  0.20   0.32   0.58
5  0.12  0.86  0.44  0.01  0.71   0.65   0.03
6  0.77  0.31  0.21  0.73  0.70   0.95   0.11
7  0.09  0.91  0.45  0.35  0.91   0.21   0.92
8  0.28  0.32  0.73  0.93  0.97   0.03   0.93
9  0.55  0.77  0.02  0.18  0.65   0.50   0.85
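
For reuse, the same idea can be wrapped in a small function. The following is only a sketch: the name select_columns and its singles/ranges parameters are hypothetical and not part of pandas, and it assumes unique column labels so that get_loc returns a single integer. It also guards against one pitfall of get_indexer, which returns -1 for a label that does not exist; passing that to iloc would silently select the last column. The random data is seeded here only to make the example repeatable.

```python
import numpy as np
import pandas as pd


def select_columns(df, singles, ranges):
    """Hypothetical helper: individual labels plus inclusive label ranges."""
    columns = df.columns
    positions = columns.get_indexer(singles)
    # get_indexer marks missing labels with -1, which iloc would read as "last column"
    if (positions < 0).any():
        missing = [s for s, p in zip(singles, positions) if p < 0]
        raise KeyError(f"columns not found: {missing}")
    parts = [positions]
    for start, stop in ranges:
        # get_loc raises KeyError on a missing label; + 1 makes the stop inclusive
        parts.append(np.arange(columns.get_loc(start), columns.get_loc(stop) + 1))
    return df.iloc[:, np.concatenate(parts)]


cols = ['test', 'one', 'two', 'three', 'four', 'five',
        'six', 'seven', 'eight', 'nine', 'ten']
rng = np.random.default_rng(0)  # seeded so the example is repeatable
df = pd.DataFrame(rng.random((10, 11)).round(2), columns=cols)

result = select_columns(df, ['test', 'two'], [('four', 'eight')])
print(list(result.columns))
```

The printed column list should match the one written out by hand in the question. Because ranges is a list of (start, stop) pairs, several consecutive blocks can be combined in one call.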

solution1
3 ACCEPTED 2020-07-04 19:47:44
