I want to loop over the columns in a dataframe and, when a column name meets a criterion, create a new dataframe and/or add the column to an existing dataframe. For example, my current dataframe has the following column names:
open high low IVV volume open high low EWH volume open high low INDY volume open high low EWG volume open high low ENZL volume
I want a loop that will find IVV, EWH, INDY, EWG, and ENZL and add them to their own dataframe.
I have tried the following:
Indexlist = ['IVV', 'EWH', 'INDY', 'EWG', 'ENZL']
Attempt to drop the values columns:
for column in data:
    print(column)
    if column != Indexlist:
        data.drop([column], axis=0))
Attempt to del the columns:
for column in data:
    print(column)
    if column != Indexlist:
        del data[column]
Attempt to select the columns:
data_sample = data[column].isin(Indexlist)
All of these methods throw errors.
I think you need to check for substrings in the column names with str.contains and a regex. Join all values of the list with |, which means OR in a regex:
data1 = data.loc[:, data.columns.str.contains('|'.join(Indexlist))]
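For illustration, here is a minimal sketch of the str.contains approach on a small made-up frame. The values are invented and only the column names follow the question. Wrapping each name in re.escape is an extra precaution I have added, not part of the one-liner above, so that a name containing regex metacharacters is still matched literally.

```python
import re
import pandas as pd

# Hypothetical toy frame: column names mirror the question, values are made up.
data = pd.DataFrame(
    [[1.0, 2.0, 0.5, 1.5, 100, 3.0, 4.0, 2.5, 3.5, 200]],
    columns=["open", "high", "low", "IVV", "volume",
             "open", "high", "low", "EWH", "volume"],
)
Indexlist = ["IVV", "EWH", "INDY", "EWG", "ENZL"]

# Build one regex "IVV|EWH|..." with each name escaped to match literally.
pattern = "|".join(re.escape(name) for name in Indexlist)

# Boolean mask over the columns: True where the name contains any list entry.
mask = data.columns.str.contains(pattern)

# Keep all rows, but only the columns where the mask is True.
data1 = data.loc[:, mask]
selected = list(data1.columns)
print(selected)
```

Because the mask is positional, this also works when other column names such as open or volume are duplicated, as they are in the question.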
If you need to select by exact column names, use a subset:
data1 = data[Indexlist]
You can use pd.Index.isin with pd.DataFrame.loc for Boolean indexing:
data_sample = data.loc[:, data.columns.isin(Indexlist)]
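As a rough sketch of how isin behaves, the example below uses a made-up frame with a hypothetical extra column EWH_adj. It suggests that isin matches whole column names only and silently skips list entries that are not present, whereas a substring match would also pick up EWH_adj.

```python
import pandas as pd

# Hypothetical toy frame; "EWH_adj" is an invented column used for contrast.
data = pd.DataFrame(
    [[1.0, 10.0, 100, 2.0, 20.0]],
    columns=["open", "IVV", "volume", "EWH", "EWH_adj"],
)
Indexlist = ["IVV", "EWH", "INDY"]  # "INDY" is deliberately absent from data

# Exact-name match: True only where the full column name is in the list.
exact_mask = data.columns.isin(Indexlist)
data_sample = data.loc[:, exact_mask]
exact_cols = list(data_sample.columns)

# Substring match for comparison: also catches "EWH_adj".
substring_mask = data.columns.str.contains("|".join(Indexlist))
substring_cols = list(data.columns[substring_mask])

print(exact_cols)
print(substring_cols)
```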
Or direct indexing, if you know in advance that all list elements exist as columns in your dataframe:
data_sample = data[Indexlist]
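If you are not sure that every name exists, direct indexing raises a KeyError in current pandas versions. One possible workaround, shown here on a made-up frame, is to filter the list down to the names that are actually present before indexing. The variable name present is my own choice for the example.

```python
import pandas as pd

# Hypothetical toy frame; values are made up.
data = pd.DataFrame({"open": [1.0], "IVV": [10.0], "volume": [100], "EWH": [20.0]})
Indexlist = ["IVV", "EWH", "INDY"]  # "INDY" is not a column here

# Direct indexing fails when any requested label is missing.
try:
    data[Indexlist]
    raised = False
except KeyError:
    raised = True

# Keep only the names that exist, preserving the order of Indexlist.
present = [name for name in Indexlist if name in data.columns]
data_sample = data[present]
print(list(data_sample.columns))
```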