Regex to find exactly X lines
I'm trying to run some regex in Python to sort different patterns of text into different files. It turns out that more than 99% of the lines in my source file follow a 3-line format like this:
12340987 some other text
some text
some text
But there's a small chance that a record will have four lines, like this:
123456789 Some text
Some text
some text
one extra line of text
I was trying to write a regex to chase down all the 4-line patterns, and started out with this:
^[0-9]+([\s\S]*?)(?=^[0-9])
How can I build something along these lines that grabs only the 4-line pattern? Thanks for reading, and for helping if you can. :)
You could try something like this:
^[0-9]+.+$\s(?:^(?!\d).+$\s?){3}
with the g (global) and m (multiline) flags set.
See it here: https://regex101.com/r/TOoCzF/1
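Since the question is about Python, here is a minimal sketch of how the pattern above might be applied there. It assumes that re.MULTILINE stands in for the m flag and that iterating with finditer() stands in for the g flag; the sample text and the names FOUR_LINE and four_line_records are made up for illustration.

```python
import re

# The pattern from the answer. re.MULTILINE corresponds to the "m" flag;
# finditer() scans for every match, which is what the "g" flag does on regex101.
FOUR_LINE = re.compile(r"^[0-9]+.+$\s(?:^(?!\d).+$\s?){3}", re.MULTILINE)

# Made-up sample data in the shape described in the question:
# a 3-line record, a 4-line record, then another 3-line record.
sample = (
    "12340987 some other text\n"
    "some text\n"
    "some text\n"
    "123456789 Some text\n"
    "Some text\n"
    "some text\n"
    "one extra line of text\n"
    "555 another record\n"
    "some text\n"
    "some text\n"
)

# Collect the full text of every match.
four_line_records = [m.group(0) for m in FOUR_LINE.finditer(sample)]

for record in four_line_records:
    print(record.splitlines())
```

On this sample only the record starting with 123456789 is collected; the two 3-line records are skipped because their third continuation line would have to start with a digit.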
Explanation:
^[0-9]+.+$\s = start of line, followed by a number and then some text, end of line and a line break
then (?:^(?!\d).+$\s?){3} = three times a line that does not start with a digit
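One thing the explanation leaves open is what comes after those three lines: as written, the pattern would also match the first four lines of a record that has five or more. If that matters, a possible tightening (my assumption, not part of the answer above) is to append a lookahead that requires either a new record or the end of the input. The name EXACTLY_FOUR and the sample strings below are hypothetical.

```python
import re

# The answer's pattern, kept for comparison.
ORIGINAL = re.compile(r"^[0-9]+.+$\s(?:^(?!\d).+$\s?){3}", re.MULTILINE)

# Hypothetical stricter variant: the trailing lookahead requires that what
# follows is a line starting with a digit (a new record) or the end of input.
EXACTLY_FOUR = re.compile(
    r"^[0-9]+.+$\s(?:^(?!\d).+$\s?){3}(?=^[0-9]|\Z)", re.MULTILINE
)

# Made-up inputs: a 5-line record and a 4-line record, each followed by a 3-line one.
five_lines = "111 header\na\nb\nc\nd\n222 header\na\nb\n"
four_lines = "333 header\na\nb\nc\n444 header\na\nb\n"

print(bool(ORIGINAL.search(five_lines)))      # original also hits the 5-line record
print(bool(EXACTLY_FOUR.search(five_lines)))  # stricter variant rejects it
print(bool(EXACTLY_FOUR.search(four_lines)))  # 4-line record still matches
```

This still assumes, as the original pattern does, that continuation lines never start with a digit.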