Extracting data of a certain category from a string (Python)
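I have a sample string: "Last year's Fortune rank: No.3 2016 revenue $215.6 billion One-year Revenue Change: -7.7%"
I want to extract specific pieces of information from strings like this and put them under named categories in a Python DataFrame, for example:
Last year's Fortune rank: 3
2016 revenue ($B): 215.6
One-year revenue change: -7.7%
Is there a way to search the string for certain text and, once it is found, return the next word?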
How about something like this?
s = "Last year's Fortune rank: No.3 2016 revenue $215.6 billion One-year Revenue Change: -7.7%"
import re
# The dot in "No." is escaped so that it matches a literal period only.
expression = re.compile(r"Last year's Fortune rank: No\.(?P<rank>\d+) +2016 revenue \$(?P<revenue>[.0-9]+) billion One-year Revenue Change: (?P<revchange>[-.0-9]+)%")
m = expression.match(s)
print(m.groupdict())
This outputs:
{'rank': '3', 'revenue': '215.6', 'revchange': '-7.7'}
Of course, you can then do whatever you like with the dictionary.
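As a rough sketch of that next step, the captured strings can be converted to numbers and collected into a list of dicts, one per input string. The names `parse_row`, `samples`, `revenue_billion` and `revchange_pct` are made up for this illustration, and the second sample string is a placeholder that is not from the question. It uses `search` and checks for `None`, so strings that do not fit the pattern are skipped instead of raising an error.

```python
import re

# Same pattern as above, with the literal dot in "No." escaped.
PATTERN = re.compile(
    r"Last year's Fortune rank: No\.(?P<rank>\d+) +"
    r"2016 revenue \$(?P<revenue>[.0-9]+) billion "
    r"One-year Revenue Change: (?P<revchange>[-.0-9]+)%"
)

def parse_row(text):
    """Return a dict of typed values, or None if the text does not match."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return {
        "rank": int(m.group("rank")),
        "revenue_billion": float(m.group("revenue")),
        "revchange_pct": float(m.group("revchange")),
    }

# Hypothetical input list; only the first entry comes from the question.
samples = [
    "Last year's Fortune rank: No.3 2016 revenue $215.6 billion One-year Revenue Change: -7.7%",
    "some unrelated text",
]
rows = [row for row in map(parse_row, samples) if row is not None]
print(rows)
# [{'rank': 3, 'revenue_billion': 215.6, 'revchange_pct': -7.7}]
```

If pandas is installed, passing `rows` to `pandas.DataFrame(rows)` should give one column per key, which is roughly the layout the question asks for.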
This isn't very clean, but it gets the job done:
s = "Last year's Fortune rank: No.3 2016 revenue $215.6 billion One-year Revenue Change: -7.7%"
# Split on a marker, then take the first whitespace-delimited token after it.
print("Last year's Fortune rank:", s.split('No.')[1].split()[0])
print('2016 revenue ($B):', s.split('$')[1].split()[0])
# strip() removes the space that follows the last colon.
print('One-year revenue change:', s.split(':')[-1].strip())
Output:
Last year's Fortune rank: 3
2016 revenue ($B): 215.6
One-year revenue change: -7.7%
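For the literal question of finding some text and returning the next word, a small helper along these lines might be enough. `value_after` is a hypothetical name, and the sketch assumes the value is the first whitespace-delimited token after the label and that the label matches case-sensitively.

```python
import re

def value_after(label, text):
    """Return the first whitespace-delimited token that follows `label`, or None."""
    # re.escape keeps characters such as ':' or '$' in the label literal.
    m = re.search(re.escape(label) + r"\s*(\S+)", text)
    return m.group(1) if m else None

s = "Last year's Fortune rank: No.3 2016 revenue $215.6 billion One-year Revenue Change: -7.7%"
print(value_after("Fortune rank:", s))    # No.3
print(value_after("2016 revenue", s))     # $215.6
print(value_after("Revenue Change:", s))  # -7.7%
print(value_after("Profit:", s))          # None
```

The returned tokens still carry prefixes such as "No." and "$", so they would need trimming before being converted to numbers.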