I have a DataFrame like the one below in Python Pandas ("col1" has the string data type):
col1
-----
1234AABY332
857363opx00C*+
9994TyF@@@!
...
And I need to remove all special characters like: ["-", ",", ".", ":", "/", "@", "#", "&", "$", "%", "+", "*", "(", ")", "=", "!", "
", "~", "~ "]
and all letters (both uppercase and lowercase), for example A, a, b, c and so on,
so that as a result I get a DataFrame like the one below:
col1
-----
1234332
85736300
9994
...
How can I do that in Python Pandas?
I would probably phrase your requirement as removing all non-digit characters:
df["col1"] = df["col1"].str.replace(r'\D+', '', regex=True)
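For reference, here is a minimal self-contained sketch of that replacement applied to the three sample rows from the question. The DataFrame construction is an assumption based on the sample shown; only the `str.replace` call comes from the line above.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({"col1": ["1234AABY332", "857363opx00C*+", "9994TyF@@@!"]})

# \D matches any character that is not a digit, so every run of
# letters or special characters is replaced with an empty string.
df["col1"] = df["col1"].str.replace(r"\D+", "", regex=True)

print(df["col1"].tolist())
# ['1234332', '85736300', '9994']
```

Matching "everything that is not a digit" avoids having to list and escape each special character individually.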
You can also use findall to extract only the digits:
df['col1'] = df['col1'].str.findall(r'(\d)').str.join('')
print(df)
# Output
col1
0 1234332
1 85736300
2 9994
You can append .astype(int) to convert the digit strings to integers.
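A short sketch of that conversion, again assuming the sample rows from the question; it chains `.astype(int)` onto the `findall` approach shown earlier.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({"col1": ["1234AABY332", "857363opx00C*+", "9994TyF@@@!"]})

# Collect the digits of each row, join them back into one string,
# then cast the resulting strings to integers.
df["col1"] = df["col1"].str.findall(r"\d").str.join("").astype(int)

print(df["col1"].tolist())
# [1234332, 85736300, 9994]
```

Note that `.astype(int)` raises a `ValueError` for a row that contains no digits at all, because the joined string is empty, and that any leading zeros are lost once the values become integers.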