
How do I split at a regex and keep everything after it?

I have a regex that successfully captures a date. Now I want to split the string so that everything after each date stays attached to it. This is what I have now:

import re

data = "11/07/2020 apple\n juice\n 11/07/2020 pear"
# Raw string, so the pattern needs no extra escaping; "/" is not special in a regex
dateRegex = re.compile(r'([0-9]+/[0-9]+/[0-9]+)')
splittedData = re.split(dateRegex, data)

# Splits into: ['', '11/07/2020', ' apple\n juice\n ', '11/07/2020', ' pear']
# Desired:     ['11/07/2020 apple\n juice\n ', '11/07/2020 pear']

Thanks in advance.

You can use

(?<=\s)(?=\d+/\d+/\d+)

NOTE: This works with re.split in Python 3.7 and newer, where support for splitting on a pattern that can match an empty string was added.

Details

  • (?<=\s) - a location immediately preceded by a whitespace character
  • (?=\d+/\d+/\d+) - a location immediately followed by one or more digits, a /, one or more digits, another /, and one or more digits.
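
Both parts are zero-width assertions, so the pattern consumes no characters and only marks a position. As a small sketch of that (the names boundary and positions are just illustrative, not part of the original answer), re.finditer can be used to see where the pattern matches in the sample string:

```python
import re

data = "11/07/2020 apple\n juice\n 11/07/2020 pear"
boundary = re.compile(r'(?<=\s)(?=\d+/\d+/\d+)')

# Every match is empty (start == end): it marks a position, not a span of text.
positions = [(m.start(), m.end()) for m in boundary.finditer(data)]
print(positions)

# The first date is not preceded by whitespace, so only the second date is a boundary.
print(repr(data[positions[0][0]:]))
```

Because nothing is consumed, re.split has nothing to drop at the boundary, which is why the date stays attached to the text that follows it.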

See the Python demo:

import re
data = "11/07/2020 apple\n juice\n 11/07/2020 pear"
dateRegex = re.compile(r'(?<=\s)(?=\d+/\d+/\d+)')  # raw string keeps \s and \d intact
splittedData = re.split(dateRegex, data)
print(splittedData)
# => ['11/07/2020 apple\n juice\n ', '11/07/2020 pear']
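
If an older Python has to be supported, where re.split does not split on empty matches, one possible workaround is to find where each date starts and slice the string at those offsets. This is only a sketch: split_before_dates is a hypothetical helper name, and unlike the lookbehind pattern above it does not require whitespace before a date.

```python
import re

def split_before_dates(text):
    """Split text into pieces that each start at a date (hypothetical helper)."""
    date = re.compile(r'\d+/\d+/\d+')
    starts = [m.start() for m in date.finditer(text)]
    if not starts:
        return [text]  # no date found: return the text unchanged
    bounds = starts + [len(text)]
    pieces = [text[a:b] for a, b in zip(bounds, bounds[1:])]
    if starts[0] > 0:
        # Keep any text that precedes the first date as its own piece
        pieces.insert(0, text[:starts[0]])
    return pieces

data = "11/07/2020 apple\n juice\n 11/07/2020 pear"
result = split_before_dates(data)
print(result)
```

For the sample string this should give the same two pieces as the re.split call above.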
