
Pandas Dataframe filter rows by only one column

I want to filter a DataFrame by keeping only the rows whose value in a given column matches a regex pattern. The example in the documentation for filter only shows how to apply the regex across the whole DataFrame, not to a single column.

So how can I change the following example

df.filter(regex='^[\d]*', axis=0)

to something like this, which only looks for the regex in the specified column:

df.filter(column='column_name', regex='^[\d]*', axis=0)

Filter the DataFrame with a boolean mask built from the given column and the regex pattern, like this: df[df.column_name.str.contains('^[\\d]*', regex=True)]
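A minimal sketch of the boolean-mask approach, for illustration only. The sample data and the `other` column are made up, and the pattern uses `+` instead of `*` so that at least one leading digit is required. Passing `na=False` is an assumption about how missing values should be handled here (treated as non-matches).

```python
import pandas as pd

# Hypothetical sample data; only "column_name" is tested against the regex
df = pd.DataFrame({
    "column_name": ["123abc", "abc123", None, "42"],
    "other": ["1", "2", "3", "4"],
})

# Build the mask from one column; na=False turns missing values into False
mask = df["column_name"].str.contains(r"^\d+", regex=True, na=False)

# Index the DataFrame with the mask to keep only the matching rows
filtered = df[mask]
print(filtered)
```

The `other` column contains digits in every row but plays no part in the filter, since the mask is computed from `column_name` alone.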

Use the vectorized string method contains() or match(); see "Testing for Strings that Match or Contain a Pattern" in the pandas documentation:

df[df.column_name.str.contains(r'^\d+')]

or

df[df.column_name.str.match(r'\d+')]    # match() only matches at the start of the string

Note that I removed the superfluous brackets ([]) and replaced * with +, because \d* always matches: it also accepts zero occurrences (a so-called zero-length match).
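The difference between * and + can be checked on a small Series. This is only a sketch with made-up values; the fullmatch() line is an extra, not part of the answer above, for the case where the whole string must consist of digits.

```python
import pandas as pd

# Hypothetical values, including an empty string
s = pd.Series(["123abc", "abc123", "", "42"])

# \d* accepts zero digits, so every string (even the empty one) matches
star = s.str.contains(r"^\d*")

# \d+ requires at least one digit at the start of the string
plus = s.str.contains(r"^\d+")

# match() is anchored at the start, so no ^ is needed
start = s.str.match(r"\d+")

# fullmatch() requires the entire string to match the pattern
full = s.str.fullmatch(r"\d+")

print(pd.DataFrame({"s": s, "star": star, "plus": plus, "start": start, "full": full}))
```

With `*`, the mask is True for every row, so the filter keeps the whole DataFrame, which is why the original pattern appeared to do nothing.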
