
C# Regex to match multiple sections

I have a .txt file with this format:

content-length: 20

blahblahblah
-stop-
content-length: 10

bum
-step-
content-length: 0

<---empty space--->
-step-
content-length: 10

huba
-step-

I use a regex to split the file into one section per content-length header, with -step- or -stop- marking the end of each section. My regex is:

((content-length:)\\s(\\d )[\\r\\n]+([\\s\\S]+?)(-stop-|-step-))*

However, if the content length is zero, so that only whitespace precedes -step- or -stop-, the match also captures the next content-length section. Any idea how to prevent this?

I came up with the following regex; I'm not sure if it is what you want:

var pattern = @"(content-length:\s\d+(?:[\s\S]*?)?-(?:stop|step)-)";
var input = @"content-length: 20

    blahblahblah
    -stop-
    content-length: 10

    bum
    -step-
    content-length: 0


    -step-
    content-length: 10

    huba
    -step-";
var result = Regex.Split(input, pattern);

Output: (image)
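Note that when the pattern passed to Regex.Split contains a capturing group, the captured text is returned in the result array interleaved with the text between matches. If only the sections themselves are wanted, Regex.Matches may be simpler. The following is a minimal sketch under that assumption; the class name SectionSplitter and the method GetSections are hypothetical, and the pattern is the one above with the outer capturing group and the redundant optional group removed.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SectionSplitter
{
    // Header, then as little text as possible up to the first -stop- or -step-.
    private const string Pattern = @"content-length:\s\d+[\s\S]*?-(?:stop|step)-";

    // Returns each matched section as one string, in document order.
    public static List<string> GetSections(string input)
    {
        var sections = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            sections.Add(m.Value);
        }
        return sections;
    }
}
```

Calling SectionSplitter.GetSections(input) on the sample input should give one string per section, including the one with content-length 0.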

Try this:

(?:(?:content-length):\s(?<length>\d+)\n+(?<content>.*?)\n*(?:-stop-|-step-))

Demo

Input:

content-length: 20

blahblahblah
-stop-
content-length: 10

bum
-step-
content-length: 0


-step-
content-length: 10

huba
-step-      

Output:

MATCH 1
length  [16-18] `20`
content [20-32] `blahblahblah`
MATCH 2
length  [56-58] `10`
content [60-63] `bum`
MATCH 3
length  [87-88] `0`
content [91-91] ``
MATCH 4
length  [114-116]   `10`
content [118-122]   `huba`
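In C#, the named groups can be read through Match.Groups. The sketch below is one way to do that, not a definitive implementation: the class SectionParser and the method Parse are hypothetical names, and the pattern is a variant of the one above that uses [\r\n] instead of \n so that Windows line endings are also accepted. It assumes each section's content sits on a single unindented line, as in the sample, because . does not match \n in .NET unless RegexOptions.Singleline is set.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SectionParser
{
    // Assumption: content is a single line; "." stops at "\n" by default.
    private static readonly Regex SectionRegex = new Regex(
        @"content-length:\s(?<length>\d+)[\r\n]+(?<content>.*?)[\r\n]*-(?:stop|step)-");

    // Returns (declared length, content) for each section found in the input.
    public static List<(int Length, string Content)> Parse(string input)
    {
        var result = new List<(int Length, string Content)>();
        foreach (Match m in SectionRegex.Matches(input))
        {
            int length = int.Parse(m.Groups["length"].Value);
            string content = m.Groups["content"].Value;
            result.Add((length, content));
        }
        return result;
    }
}
```

For the input shown above, SectionParser.Parse(input) should return four entries, with an empty Content string for the content-length 0 section.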

Try this:

(?:(?:content-length:))\\s(\\d+)[\\r\\n]+(.*)?[\\r\\n]+(?:-stop-|-step-)

((content-length:)\\s(\\d+)[\\r\\n]+(.*)\\n*(-stop-|-step-))

Check out the regex here: https://regex101.com/r/wU9uA4/1

Try this code:

((content-length:)\s(\d)[\r\n]*([\s\S]*?)(-stop-|-step-))
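The change that matters for the empty section is the quantifier on the content group. With the lazy +?, the content must contain at least one character; once [\r\n]+ has consumed all the line breaks, that character is the leading - of the terminator, and the match then runs on to the next terminator. With *?, empty content is allowed. The sketch below compares the two, assuming the empty section holds only line breaks; the class QuantifierDemo and its members are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Lazy "+?" requires at least one character of content.
    public const string PlusPattern =
        @"content-length:\s\d+[\r\n]+([\s\S]+?)(?:-stop-|-step-)";

    // Lazy "*?" also accepts empty content.
    public const string StarPattern =
        @"content-length:\s\d+[\r\n]+([\s\S]*?)(?:-stop-|-step-)";

    // Counts how many sections the given pattern finds in the input.
    public static int CountSections(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }
}
```

On the four-section sample, PlusPattern merges the empty section with the one after it, while StarPattern finds each section separately.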
