
How to remove all special characters and letters from a column in a DataFrame in Python Pandas?

I have a DataFrame like the one below in Python Pandas ("col1" has a string data type):

col1
-----
1234AABY332
857363opx00C*+
9994TyF@@@!
...

I need to remove all special characters such as ["-", ",", ".", ":", "/", "@", "#", "&", "$", "%", "+", "*", "(", ")", "=", "!", " ", "~", "~ "] as well as all letters (both uppercase and lowercase), for example A, a, b, c and so on.

As a result, I need a DataFrame like the one below:

col1
-----
1234332
85736300
9994
...

How can I do that in Python Pandas?

I would probably phrase your requirement as removing all non-digit characters:

df["col1"] = df["col1"].str.replace(r'\D+', '', regex=True)
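For a self-contained check, here is a minimal sketch that rebuilds the three sample rows from the question and applies the same replacement. The DataFrame construction is an assumption based on the sample data shown in the question; only the `str.replace` line comes from the answer itself.

```python
import pandas as pd

# Sample rows copied from the question (assumed to be the full test input)
df = pd.DataFrame({"col1": ["1234AABY332", "857363opx00C*+", "9994TyF@@@!"]})

# \D matches any character that is not a digit, so every run of
# letters, punctuation, or whitespace is replaced with an empty string
df["col1"] = df["col1"].str.replace(r"\D+", "", regex=True)

print(df)
```

The column still holds strings after this step; only the non-digit characters have been stripped.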

You can also use findall to extract only the digits:

df['col1'] = df['col1'].str.findall(r'(\d)').str.join('')
print(df)

# Output
       col1
0   1234332
1  85736300
2      9994

You can append .astype(int) to convert the resulting digit strings to integers.
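The original code for this step is missing, so the following is only a sketch of what the conversion might look like, assuming the same sample rows as in the question. The intermediate variable `digits` is a hypothetical name introduced for illustration.

```python
import pandas as pd

# Sample rows copied from the question (assumed input)
df = pd.DataFrame({"col1": ["1234AABY332", "857363opx00C*+", "9994TyF@@@!"]})

# Collect every digit in each string, then join them back into one string
digits = df["col1"].str.findall(r"\d").str.join("")

# Convert the digit strings to integers; this assumes every row
# contains at least one digit, since int("") raises ValueError
df["col1"] = digits.astype(int)

print(df)
```

If some rows could contain no digits at all, the join produces an empty string and `.astype(int)` fails. In that case `pd.to_numeric(digits, errors="coerce")` may be a safer choice, at the cost of getting a float column with NaN for those rows.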

