I have a dataframe and want to find the column whose name contains a certain string. Specifically, I'm searching for 'segment' in column names like '...._segment'. I want the column name returned as a string or a variable, so that I can access the column later with df['name'] or df[name] as usual.
It isn't clear whether you want the column names that contain the string, or the names of the columns that have at least one value containing the string.
Suppose the dataframe is:
In [1]: import pandas as pd
...: df = pd.DataFrame({'a_1': ['b_1', 'b_2'], 'b_1': ['a_1', 'a_2']})
In [2]: df
Out[2]:
   a_1  b_1
0  b_1  a_1
1  b_2  a_2
For the first case, to find all the column names that match a_.*:
In [3]: import re
In [4]: columns = [col for col in df.columns if isinstance(col, str) and re.match('a_.*', col)]
In [5]: columns
Out[5]: ['a_1']
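Applied to the question's '_segment' case, the same idea might look like the sketch below. The dataframe and its column names (id, customer_segment, revenue) are made up for illustration, and a plain suffix test is used instead of a regex, on the assumption that the name always ends in '_segment'.

```python
import pandas as pd

# Hypothetical frame; these column names are assumptions, not from the question.
df = pd.DataFrame({
    "id": [1, 2],
    "customer_segment": ["retail", "corporate"],
    "revenue": [10.0, 20.0],
})

# Collect every string column name that ends with "_segment".
matches = [col for col in df.columns
           if isinstance(col, str) and col.endswith("_segment")]

# Keep the first match as a plain string (None if nothing matched).
name = matches[0] if matches else None

if name is not None:
    segment_values = df[name]  # ordinary column access by name
```

If several columns can end in '_segment', matches holds all of them, so it is worth checking its length before picking one.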
For the second case, to find all the columns in which at least one value matches a_.*:
In [6]: columns = [col for col, ser in df.items() if ser.str.match('a_.*').any()]
In [7]: columns
Out[7]: ['b_1']
where:
df.items: returns an iterator of (column name, column values (Series)) pairs. It replaces df.iteritems, which was removed in pandas 2.0.
Series.any: returns True if any value in the series is True.
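The value search above assumes every column holds strings; the .str accessor raises an error on a numeric column. A minimal sketch that tolerates mixed dtypes is shown below. The helper name columns_with_matching_value and the extra numeric column n are hypothetical, added only for illustration.

```python
import pandas as pd

# Same frame as above, plus a hypothetical numeric column "n".
df = pd.DataFrame({"a_1": ["b_1", "b_2"], "b_1": ["a_1", "a_2"], "n": [1, 2]})

def columns_with_matching_value(frame, pattern):
    """Return names of columns with at least one value matching pattern."""
    found = []
    for col, ser in frame.items():
        # Cast to str so numeric columns do not fail on .str access.
        hits = ser.astype(str).str.match(pattern)
        if hits.any():
            found.append(col)
    return found

result = columns_with_matching_value(df, r"a_.*")
```

Note that astype(str) turns missing values into the text 'nan', so a pattern that could match 'nan' would need extra care.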