I have a pandas DataFrame and need to extract the stock codes, e.g. '00981' and '00823', from text stored in a column. Each code appears in the (00000) format, i.e. five digits in parentheses, but its position in the text summary varies from row to row. Please advise.
News
1 example(00981)example example example。
2 example example example (00823)text text text
desired output:
Code column
981
823
s = TABLE['News'].str.find('(')
e = s + 5
c = TABLE['News'].str[s:e]
TABLE["Code"] = c
This will find all occurrences of 5 digits surrounded by parentheses:
import re
# Raw string so that the backslashes reach the regex engine unchanged
x = re.findall(r'\(\d{5}\)', my_string)
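Note that the pattern above returns each match with its parentheses included. If only the digits are wanted, one option is to add a capturing group inside the escaped parentheses. The following is a minimal sketch; the contents of `my_string` are a made-up sample modeled on the rows in the question, not data from the original post.

```python
import re

# Hypothetical sample text, modeled on the rows in the question.
my_string = "1 example(00981)example example (00823)text text"

# \( and \) match literal parentheses; the inner unescaped group
# captures only the five digits, so findall returns just the digits.
codes = re.findall(r"\((\d{5})\)", my_string)
print(codes)  # ['00981', '00823']

# int() drops the leading zeros, as in the desired output.
numbers = [int(code) for code in codes]
print(numbers)  # [981, 823]
```

When a pattern contains exactly one capturing group, `re.findall` returns the group contents rather than the whole match, which is why the parentheses disappear from the result.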
This works for me:
print(df)
News
0 1 example(00981)example example example。
1 2 example example example (00823)text text...
df['stock_num'] = df['News'].str.extract(r'(\d{5})').astype(int)
print(df)
                                              News  stock_num
0      1 example(00981)example example example。        981
1  2 example example example (00823)text text...        823
To convert the extracted strings to numbers, you can use either the .astype()
method or pd.to_numeric(df['stock_num']).
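The difference between the two conversions shows up when a row has no code at all. Here is a minimal sketch, assuming a third row without a code (that row is hypothetical and not part of the question) and a pattern that requires the surrounding parentheses, which is slightly stricter than the one used above.

```python
import pandas as pd

df = pd.DataFrame({
    "News": [
        "1 example(00981)example example example。",
        "2 example example example (00823)text text text",
        "3 example without any code",  # hypothetical row, not in the question
    ]
})

# expand=False returns a Series; rows without a match become NaN.
codes = df["News"].str.extract(r"\((\d{5})\)", expand=False)

# .astype(int) would raise here because of the missing value.
# pd.to_numeric keeps the missing value, and the nullable "Int64"
# dtype lets the remaining values stay integers.
df["Code"] = pd.to_numeric(codes, errors="coerce").astype("Int64")
print(df)
```

If every row is guaranteed to contain a code, the plain `.astype(int)` shown earlier is enough.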