Looping Pandas Column Names To Create New Data Frame

Question

I want to loop over the columns of a dataframe and, when a column name meets a criterion, create a new dataframe from it and/or add it to an existing dataframe. For example, my current dataframe has the following column names:

open high low IVV volume open high low EWH volume open high low INDY volume open high low EWG volume open high low ENZL volume

I want a loop that will find IVV, EWH, INDY, EWG, and ENZL and add them to their own dataframe.

I have tried the following:

Indexlist = ['IVV', 'EWH', 'INDY', 'EWG', 'ENZL']

Attempt to drop the values columns:

for column in data:
    print(column)
    if column != Indexlist:
        data.drop([column], axis=0))

Attempt to del the columns:

for column in data:
    print(column)
    if column != Indexlist:
        del data[column]

Attempt to select the columns:

data_sample = data[column].isin(Indexlist)

All of these methods throw errors.

Answer 1

I think you need to check for substrings in the column names using str.contains with a regex; join all values of the list with | (regex OR):

data1 = data.loc[:, data.columns.str.contains('|'.join(Indexlist))]

If you need to select by exact column names, subset with the list:

data1 = data[Indexlist]
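
To make the two options concrete, here is a minimal sketch on made-up data. The sample frame and its values are assumptions that only mimic the repeated open/high/low/ticker/volume layout from the question. It also wraps each name in re.escape, which the one-liner above does not do but which should be safer if a name ever contains regex characters. Keep in mind that str.contains matches substrings, so a column such as 'IVV_adj' would be picked up as well.

```python
import re
import pandas as pd

Indexlist = ['IVV', 'EWH', 'INDY', 'EWG', 'ENZL']

# Hypothetical sample frame: open/high/low/<ticker>/volume repeated per ticker
columns = []
for ticker in Indexlist:
    columns += ['open', 'high', 'low', ticker, 'volume']
data = pd.DataFrame([list(range(len(columns))) for _ in range(3)], columns=columns)

# Substring match: build "IVV|EWH|..." and keep columns whose name contains any of them
pattern = '|'.join(re.escape(name) for name in Indexlist)
data1 = data.loc[:, data.columns.str.contains(pattern)]

# Exact-name selection: works here because every name in Indexlist is a column
data2 = data[Indexlist]

print(list(data1.columns))
print(data1.equals(data2))
```

On this sample both selections return the same five ticker columns; they would differ only if some column name merely contained one of the tickers.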

Answer 2

You can use pd.Index.isin with pd.DataFrame.loc for Boolean indexing:

data_sample = data.loc[:, data.columns.isin(Indexlist)]

Or direct indexing, if you know in advance that all list elements exist as columns in your dataframe:

data_sample = data[Indexlist]
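
The practical difference between the two shows up when a name in the list is not a column. The sketch below uses an invented four-column frame and a hypothetical list called wanted, so treat the values as placeholders. In recent pandas versions the isin mask simply skips the missing name and keeps the dataframe's own column order, while direct indexing raises a KeyError.

```python
import pandas as pd

# Hypothetical sample frame; 'ENZL' is deliberately absent
data = pd.DataFrame({
    'open': [1.0, 2.0],
    'IVV': [270.1, 271.3],
    'volume': [100, 120],
    'EWH': [24.5, 24.7],
})
wanted = ['EWH', 'IVV', 'ENZL']

# Boolean mask: missing names are ignored, columns come back in dataframe order
by_mask = data.loc[:, data.columns.isin(wanted)]
print(list(by_mask.columns))

# Direct indexing: fails as soon as one requested name is missing
try:
    data[wanted]
    direct_ok = True
except KeyError as exc:
    direct_ok = False
    print('KeyError:', exc)
```

So isin is the more forgiving choice when the list may contain names that are not in the dataframe, and direct indexing is the stricter one that also returns the columns in list order.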

solution1
0 2018-06-24 20:06:31

solution2
0 2018-06-24 20:11:14