
Regex to extract a match at every fourth comma in a string

I am trying to create a regex that splits the string below into sets of four comma-separated values. The pattern I tried only selects each comma-separated value individually, and I don't know how to get the desired output.

The regex I tried:

".*?"(?=,|$)

Data:

"T","Success","2","2","T","Success","6458960","1","F,"You do not have sufficient credit.","6458962","1"

Desired output:

"T","Success","2","2"  
"T","Success","6458960","1"  
"F,"You do not have sufficient credit.","6458962","1"

Update: the F is in double quotes too. That was a typo, sorry!

"T","Success","2","2","T","Success","6458960","1","F","You do not have sufficient credit.","6458962","1"

You just need String.Split and this:

// Requires: using System; using System.Linq; str holds the input string.
string[] fields = str.Split(',');
for (int i = 0; i < fields.Length; i += 4)
    Console.WriteLine(string.Join(",", fields.Skip(i).Take(4)));

Output:

"T","Success","2","2"
"T","Success","6458960","1"
"F,"You do not have sufficient credit.","6458962","1"

This presumes that this is not really CSV data. Otherwise I would suggest using a real CSV parser that supports quoting characters. But the quoting seems to be broken anyway (,"1","F, so the F is not enclosed in quotes).

You could use the following regex, but only if the F is also enclosed in quotes:

((?:".+?",){3}(?:".+?"))

This results in:

MATCH 1
1. [0-21] "T","Success","2","2"

MATCH 2
1. [22-49] "T","Success","6458960","1"

MATCH 3
1. [50-104] "F","You do not have sufficient credit.","6458962","1"
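As a rough sketch of how such a pattern could be applied from C#, assuming every value (including the F) is quoted and contains no embedded quotes: the variant below swaps `.+` for `[^"]*`, so a value can never run past its own closing quote. The names `input`, `pattern`, and `groups` are illustrative and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Corrected input from the question's update (the F is quoted).
string input = "\"T\",\"Success\",\"2\",\"2\",\"T\",\"Success\",\"6458960\",\"1\","
             + "\"F\",\"You do not have sufficient credit.\",\"6458962\",\"1\"";

// Three quoted values each followed by a comma, then a fourth quoted value.
// [^"]* keeps every value inside its own pair of quotes.
var pattern = new Regex("(?:\"[^\"]*\",){3}\"[^\"]*\"");

// Collect each match; the comma between two groups is skipped by the
// regex engine because no match can start at a comma.
var groups = new List<string>();
foreach (Match m in pattern.Matches(input))
    groups.Add(m.Value);

foreach (string g in groups)
    Console.WriteLine(g);
```

With the corrected data this prints the three groups of four values, one per line.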


If the data really is in this inconsistent form, you have to parse it manually or extend the regex with an "or" condition (an alternation).

((?:"[^"]*"|[^,"])*(?:,(?:"[^"]*"|[^,"])*){3}),?
  1. (?:"[^"]*"|[^,"])* will match a value between commas, optionally quoted. Quotes inside a quoted value are escaped as "".

  2. (X(?:,X){3}),? where X is pattern #1, will match a sequence of four comma-separated values and an optional trailing comma. Consuming that trailing comma is necessary to correctly match blank values (,,foo,).

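If the values are guaranteed to always have quotes, you can remove |[^,"]. The ,? can go as well, but only if the * after each value is changed to +, so that a match cannot start with an empty value at a comma.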
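The two parts above can be tried out with a small sketch like the following, which assumes the pattern is applied with Regex.Matches and that capture group 1 holds each set of four values. The helper name `SplitRows` and the second sample string are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// The pattern from above as a verbatim string ("" is an escaped quote).
string pattern = @"((?:""[^""]*""|[^,""])*(?:,(?:""[^""]*""|[^,""])*){3}),?";

// Hypothetical helper: returns group 1 of every match, i.e. each run of
// four values without the trailing comma.
List<string> SplitRows(string text)
{
    var rows = new List<string>();
    foreach (Match m in Regex.Matches(text, pattern))
        rows.Add(m.Groups[1].Value);
    return rows;
}

// Corrected data from the question's update.
string quoted = @"""T"",""Success"",""2"",""2"",""T"",""Success"",""6458960"",""1"",""F"",""You do not have sufficient credit."",""6458962"",""1""";

// Made-up sample with unquoted values, a quoted comma, and a blank value.
string mixed = @"a,""b,c"",,d,e,f,g,h";

foreach (string row in SplitRows(quoted))
    Console.WriteLine(row);
foreach (string row in SplitRows(mixed))
    Console.WriteLine(row);
```

The mixed sample shows why the trailing comma is consumed: the blank third value stays in the first row, and the second row starts cleanly at e.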

You can try this regex, with one caveat: you need to trim the trailing comma from each piece. It also works when the string contains more values and the number of commas is not a multiple of 4.

string pattern = @"(?<=^(?:(?:[^,]*,){4})+)";
string text = @"""T"",""Success"",""2"",""2"",""T"",""Success"",""6458960"",""1"",""F,""You do not have sufficient credit."",""6458962"",""1""";
foreach (var tmp in Regex.Split(text, pattern))
{
    Console.WriteLine(tmp.TrimEnd(','));
}

I would avoid regex unless you really need it; regular expressions are generally harder to understand.

For fun, here is a LINQ solution:

var data = @"""T"",""Success"",""2"",""2"",""T"",""Success"",""6458960"",""1"",""F,""You do not have sufficient credit."",""6458962"",""1""";

var res = data.Split(',')
            .Select((x, i) => new { Pos = i / 4, Val = x })
            .GroupBy(x => x.Pos)
            .Select(g => string.Join(",", g.Select(x => x.Val)));
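If the project targets .NET 6 or later (an assumption; the original answer does not say), the same grouping can probably be written more directly with Enumerable.Chunk. This is only a sketch of that alternative; the variable name `rows` is illustrative.

```csharp
using System;
using System.Linq;

// Same data as in the answer above, including the unquoted F typo.
string data = @"""T"",""Success"",""2"",""2"",""T"",""Success"",""6458960"",""1"",""F,""You do not have sufficient credit."",""6458962"",""1""";

// Split on every comma, cut the fields into arrays of up to four,
// and join each array back into one comma-separated row.
string[] rows = data.Split(',')
                    .Chunk(4)
                    .Select(chunk => string.Join(",", chunk))
                    .ToArray();

foreach (string row in rows)
    Console.WriteLine(row);
```

Like the other Split-based answers, this treats every comma as a separator, so it only fits data whose values never contain a comma themselves.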
