How do I split at a Regex and everything after it
I have a regex that successfully captures a date. Now I want to split the string so that everything after each date is kept together with that date. I have this now:
import re
data = "11/07/2020 apple\n juice\n 11/07/2020 pear"
dateRegex = re.compile(r'([0-9]+/[0-9]+/[0-9]+)')
splittedData = re.split(dateRegex, data)
# Splits into: ['', '11/07/2020', ' apple\n juice\n ', '11/07/2020', ' pear']
# Desired: ['11/07/2020 apple\n juice\n ', '11/07/2020 pear']
Thanks in advance.

You can use
(?<=\s)(?=\d+/\d+/\d+)
NOTE: This works with re.split in Python 3.7 and newer, where support for splitting on a pattern that can match an empty string was added.

Details
(?<=\s) - a location immediately preceded by a whitespace character
(?=\d+/\d+/\d+) - a location immediately followed by one or more digits, then two occurrences of / followed by one or more digits

See the Python demo:
import re
data = "11/07/2020 apple\n juice\n 11/07/2020 pear"
dateRegex = re.compile(r'(?<=\s)(?=\d+/\d+/\d+)')
splittedData = re.split(dateRegex, data)
print(splittedData)
# => ['11/07/2020 apple\n juice\n ', '11/07/2020 pear']
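
On interpreters older than Python 3.7, where re.split does not split on empty matches, one possible workaround is to match each chunk with re.findall instead of splitting. The following is only a sketch and not part of the answer above: the name chunkRegex is hypothetical, and it assumes every chunk starts with a date and that no date-like text occurs inside a chunk's body.

```python
import re

data = "11/07/2020 apple\n juice\n 11/07/2020 pear"

# A date, then as little text as possible (newlines included, via re.DOTALL)
# up to the next date or the end of the string.
# chunkRegex is a hypothetical name used only for this illustration.
chunkRegex = re.compile(r'\d+/\d+/\d+.*?(?=\d+/\d+/\d+|\Z)', re.DOTALL)

chunks = chunkRegex.findall(data)
print(chunks)
# => ['11/07/2020 apple\n juice\n ', '11/07/2020 pear']
```

Unlike the split-based version, this drops any text that precedes the first date, so it fits only when the input begins with a date.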