Extracting name and number from a string
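I am trying to extract parts of the "Message Text" column, specifically the name (after the word "Admitted") and the card number (inside the parentheses), and put the results into new columns. What is the best way to do this? I tried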
access_file['Name']=access_file['Message Text'].str.extract('(.*?)')
but the resulting column is blank.
Thanks,
   Message Type   Server Date/Time  Message Text                                                                Message Date/Time
0  Card Admitted  7/25/2018 8:10    Admitted 'Santos, Samuel' (Card: 203532) at '2nd Flr Check Rm 02-19' (IN).  7/25/2018 8:10
1  Card Admitted  7/25/2018 9:10    Admitted 'Zhu, Jin Chang' (Card: 203929) at '2nd Flr Check Rm 02-19' (IN).  7/25/2018 9:10
2  Card Admitted  7/25/2018 9:34    Admitted 'Zhu, Jin Chang' (Card: 203929) at '2nd Flr Check Rm 02-19' (IN).  7/25/2018 9:34
3  Card Admitted  7/25/2018 9:42    Admitted 'Klein, Erwin' (Card: 511268) at '2nd Flr Check Rm 02-19' (IN).    7/25/2018 9:41
4  Card Admitted  7/25/2018 10:29   Admitted 'Tesis, Olga' (Card: 203047) at '2nd Flr Check Rm 02-19' (IN).     7/25/2018 10:29
You can try the following pattern:
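# Raw string, so backslashes such as \s and \( reach the regex engine unchanged
pattern = r"Admitted\s+'(?P<name>.*)'.*\(Card\D*(?P<card_number>\d+)\)"
# With named groups, str.extract returns a DataFrame with one column per group
access_file['Message Text'].str.extract(pattern)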
Output:
             name  card_number
0  Santos, Samuel       203532
1  Zhu, Jin Chang       203929
2  Zhu, Jin Chang       203929
3    Klein, Erwin       511268
4     Tesis, Olga       203047
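The question also asks for the results to end up in new columns, so here is a minimal sketch of that step. It rebuilds a small `access_file` from three of the sample messages. The column names `Name` and `Card Number` are assumptions chosen for illustration, not taken from the original data. It also shows why the first attempt came back blank: in `'(.*?)'` the parentheses are a capture group rather than literal brackets, and the lazy `.*?` is satisfied by an empty match at the start of the string.

```python
import pandas as pd

# Sample rows copied from the question; only 'Message Text' is needed here.
access_file = pd.DataFrame({
    "Message Text": [
        "Admitted 'Santos, Samuel' (Card: 203532) at '2nd Flr Check Rm 02-19' (IN).",
        "Admitted 'Zhu, Jin Chang' (Card: 203929) at '2nd Flr Check Rm 02-19' (IN).",
        "Admitted 'Klein, Erwin' (Card: 511268) at '2nd Flr Check Rm 02-19' (IN).",
    ]
})

# The original attempt: the lazy group matches the empty string at position 0,
# so every extracted value is ''.
blank = access_file["Message Text"].str.extract("(.*?)")

pattern = r"Admitted\s+'(?P<name>.*)'.*\(Card\D*(?P<card_number>\d+)\)"

# One column per named group: 'name' and 'card_number'.
extracted = access_file["Message Text"].str.extract(pattern)

# Copy the groups into new columns ('Name' and 'Card Number' are assumed names).
access_file["Name"] = extracted["name"]
# The captured digits are strings; to_numeric converts them to numbers.
access_file["Card Number"] = pd.to_numeric(extracted["card_number"])

print(access_file[["Name", "Card Number"]])
```

Rows whose message does not match the pattern come back as NaN in both columns, so it may be worth checking `extracted.isna().any()` before relying on the result.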