
Extract number from alpha-numeric column pandas

Input df

Code               Value
USH0001108421891  -9999    
USH0001108421892  -9999 X3 
USH0001108421893  -77EX3 
USH0001108421894   483EQ3 
USH0001108421895   325EX3 
USH0001108421896   297ES3 

As the example shows, the Value column contains strings that mix digits and letters. I want only the first run of digits (with its sign, if any) that appears before the letters.

Expected df

Code               Value
USH0001108421891  -9999    
USH0001108421892  -9999 
USH0001108421893  -77
USH0001108421894   483
USH0001108421895   325
USH0001108421896   297 

I tried this, but it raised an error.

df1['Value'] = df1['Value'].astype(int)
ValueError: invalid literal for int() with base 10: '-77EX3'

You can use .str.extract with a regex pattern containing a capturing group. Here ^ anchors the match at the start of the string, -? allows an optional minus sign, and \d+ matches one or more digits:

df['Value'] = df['Value'].str.extract(r'^(-?\d+)', expand=False).astype(int)

              Code   Value
0  USH0001108421891  -9999
1  USH0001108421892  -9999
2  USH0001108421893    -77
3  USH0001108421894    483
4  USH0001108421895    325
5  USH0001108421896    297
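
The one-liner above assumes every row starts with an integer. If a row does not match, .str.extract returns NaN for it and .astype(int) then fails. The following is a minimal sketch of a more defensive variant, not part of the original answer. The helper name leading_int, the extra "N/A" row, and the .str.strip() call are assumptions added for illustration. It converts to pandas' nullable Int64 dtype so that non-matching rows become <NA> instead of raising.

```python
import pandas as pd

# Sample data copied from the question. The last row ("N/A") is a
# hypothetical addition that shows what happens when the pattern fails.
df = pd.DataFrame({
    "Code": ["USH0001108421891", "USH0001108421892", "USH0001108421893",
             "USH0001108421894", "USH0001108421895", "USH0001108421896",
             "USH0001108421897"],
    "Value": ["-9999", "-9999 X3", "-77EX3", "483EQ3", "325EX3", "297ES3",
              "N/A"],
})


def leading_int(series: pd.Series) -> pd.Series:
    """Return the optionally signed integer at the start of each string."""
    # Strip surrounding whitespace first (an assumption: the real column
    # may or may not contain padding), then capture the leading integer.
    extracted = series.str.strip().str.extract(r"^(-?\d+)", expand=False)
    # Rows without a match are NaN, which a plain int column cannot hold,
    # so convert to the nullable Int64 dtype instead of int.
    return pd.to_numeric(extracted).astype("Int64")


df["Value"] = leading_int(df["Value"])
print(df)
```

If every row is guaranteed to match, the shorter .astype(int) form from the answer is enough. The nullable dtype only matters when some values may lack a leading integer.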
