
REGEX: Chain multiple negative lookbehind and negative lookahead

Regexes are just headaches. I want to chain two regexes together, each of which uses a negative lookbehind and a negative lookahead.

First one:

re.sub(r'(?i)(?<!([0-1\b][0-9]|[2][0-3])):(?!([0-5][0-9])((am)|(pm)|(a.m)|(p.m)|(a.m.)|(p.m.))?\b)',':\n',s)

Second one:

re.sub(r'(?<!([ps][tp])):(?!([\/][\/]))',':\n',s)

They both work separately: each adds \n after a colon when its lookbehind and lookahead conditions do not match. One is for times and the other is for URLs. How would I combine them so that \n is added right after a colon only if the colon belongs to neither a URL nor a time?

This was the first part of my question: How to split string with colons but not if it is a time?

Ended up going the long way and using sub to mend the URLs that were broken apart by the previous negative lookbehind and negative lookahead regex. Ugh.

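import re
# s is the input string. Add a newline after each colon that does not look like part of a time.
# The inline (?i) is dropped: flags=re.IGNORECASE already covers it, and Python 3.11+ rejects (?i) mid-pattern.
s = re.sub(r'(?<!([0-1\b][0-9]|[2][0-3])):(?!([0-5][0-9])((am)|(pm)|(a.m)|(p.m)|(a.m.)|(p.m.))?\b)',':\n',s,flags=re.IGNORECASE)
# Rejoin the URL schemes that the substitution above split apart.
reg = re.compile(re.escape('http:\n//'), re.IGNORECASE)
reg1 = re.compile(re.escape('https:\n//'), re.IGNORECASE)
s = reg.sub('http://', s)    # sub returns a new string, so the result must be assigned
s = reg1.sub('https://', s)  # use reg1 here, not reg, so https URLs are repaired too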
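
For comparison, here is a rough single-pass sketch of the chaining the question asks about. It relies on lookarounds being zero-width: several of them can sit around the same colon, and the colon matches only if all of them hold. The pattern is a simplified rewrite rather than the exact expressions above, and the names COLON_RE and split_after_colons, as well as the example.com URL, are placeholders made up for this illustration.

```python
import re

# Assumed combined pattern: lookarounds consume no characters, so all four
# conditions are checked at the same colon and every one of them must hold.
COLON_RE = re.compile(
    r"(?<![01][0-9])"  # not preceded by an hour 00-19
    r"(?<!2[0-3])"     # not preceded by an hour 20-23
    r":"
    r"(?!//)"          # not followed by the // of a URL scheme
    r"(?![0-5][0-9](?:\s?[ap]\.?m\.?)?\b)",  # not followed by minutes, optional am/pm
    re.IGNORECASE,
)


def split_after_colons(text):
    """Insert a newline after each colon that is neither in a time nor in a URL."""
    return COLON_RE.sub(":\n", text)


print(split_after_colons("Note: meet at 12:30pm, see http://example.com"))
```

Like the original first pattern, this treats any colon that directly follows a two-digit hour as part of a time, so text such as "Chapter 12: intro" is left unsplit. Whether that trade-off is acceptable depends on the input.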
