Extract number from alpha-numeric column pandas
Input df
Code Value
USH0001108421891 -9999
USH0001108421892 -9999 X3
USH0001108421893 -77EX3
USH0001108421894 483EQ3
USH0001108421895 325EX3
USH0001108421896 297ES3
As can be seen from the example, the column Value has both strings and integers. But I want only the first set of integers before the letters.
Expected df
Code Value
USH0001108421891 -9999
USH0001108421892 -9999
USH0001108421893 -77
USH0001108421894 483
USH0001108421895 325
USH0001108421896 297
I tried this, but it returned an error.
df1['Value'] = df1['Value'].astype(int)
ValueError: invalid literal for int() with base 10: '-77EX3'
You can use .str.extract with a regex pattern containing a capturing group:
df['Value'] = df['Value'].str.extract(r'^(-?\d+)', expand=False).astype(int)
Code Value
0 USH0001108421891 -9999
1 USH0001108421892 -9999
2 USH0001108421893 -77
3 USH0001108421894 483
4 USH0001108421895 325
5 USH0001108421896 297
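If the column may really hold a mix of Python integers and strings, or contain rows with no leading number at all, a slightly more defensive variant may help. The following is only a sketch under those assumptions: the helper name extract_leading_int is hypothetical, and the choice of the nullable Int64 dtype for unmatched rows is one option among several.

```python
import pandas as pd


def extract_leading_int(values: pd.Series) -> pd.Series:
    """Return the leading (optionally negative) integer of each value.

    Hypothetical helper, not part of pandas. Rows without a leading
    integer become <NA> instead of raising an error.
    """
    # Cast to str first so values already stored as int are handled too.
    as_text = values.astype(str)
    # Capture an optional minus sign followed by digits at the start.
    leading = as_text.str.extract(r"^\s*(-?\d+)", expand=False)
    # Unmatched rows are NaN here; the nullable Int64 dtype keeps them as <NA>.
    return pd.to_numeric(leading, errors="coerce").astype("Int64")


df = pd.DataFrame(
    {
        "Code": [
            "USH0001108421891",
            "USH0001108421892",
            "USH0001108421893",
            "USH0001108421894",
            "USH0001108421895",
            "USH0001108421896",
        ],
        "Value": ["-9999", "-9999 X3", "-77EX3", "483EQ3", "325EX3", "297ES3"],
    }
)

df["Value"] = extract_leading_int(df["Value"])
print(df)
```

With the sample data every row matches, so the column could also be cast to plain int afterwards; the nullable dtype only matters when some rows have no leading number.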