
How can I strip off all non-numeric characters in a Pandas Series?

I have a Pandas DataFrame, and I want to reduce one of its columns to numeric characters only.

For example, the column contains rows like this:

4'> delay trip
4/
4'>book flight 'trip
34
4"> book flight delay
4"

How can I strip off all non-numeric characters and have just numeric characters like this:

4
4
4
[3,4]
4
4

You have two different problems here:

  • the first is to extract the digits from each cell of the column
  • the second is to keep a list only when a cell contains more than one digit, and a single value otherwise

Just chain both operations:

df[col].str.findall(r'\d').apply(lambda x: x[0] if len(x) == 1 else '' if len(x) == 0 else x)

With your example it gives:

0         4
1         4
2         4
3    [3, 4]
4         4
5         4
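
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same two steps. The column name "text" and the helper name collapse are made up for illustration; the sample rows are copied from the question. Note that the extracted digits are strings, not integers, even though the printed Series shows them without quotes.

```python
import pandas as pd

# Sample rows copied from the question; the column name "text" is hypothetical.
df = pd.DataFrame({"text": [
    "4'> delay trip",
    "4/",
    "4'>book flight 'trip",
    "34",
    '4"> book flight delay',
    '4"',
]})

def collapse(digits):
    """Hypothetical helper: '' for no digits, the lone digit if only one, else the list."""
    if len(digits) == 0:
        return ""
    if len(digits) == 1:
        return digits[0]
    return digits

# Step 1: str.findall(r"\d") yields a list of single-digit strings per cell.
# Step 2: collapse() unwraps one-element lists and blanks out empty ones.
result = df["text"].str.findall(r"\d").apply(collapse)
print(result.tolist())
# ['4', '4', '4', ['3', '4'], '4', '4']
```

If integers are needed, the values would have to be converted afterwards, for example with int() on each string.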

