How to get a substring of a string that has elements in consecutive square brackets using regex?
I have the following strings. One rule: get all consecutive square-bracket strings. For example:
string 1 : [hello][qwqwe:]sdsdfsdf [note2]
string 2 : [somethingelse]sdfsdf [note 1]
string 3 : aasdad[note 3]
I would like to get the substrings:
output 1 : [hello][qwqwe:]
output 2 : [somethingelse]
output 3 :
If the string doesn't have square brackets, I do not want an output. If the string has a square-bracket-delimited string that is not consecutive, it should not match either.
I tried using the regex expression
([.*])*
But it matches everything between two square brackets. As the first string shows, I do not need the part of the string that violates my rule.
Approach 1: Matching multiple consecutive [...]s at string start as a single string
You need to use the following regex:
^(\[[^]]*])+
See regex demo
The ^(\[[^]]*])+ regex matches:
^ - start of string (in the demo, it matches at line start due to the multiline modifier)
(\[[^]]*])+ - captured into Group 1 (you can access all of those values via the .Groups[1].Captures collection), one or more occurrences of:
    \[ - a literal [
    [^]]* - zero or more characters other than ]
    ] - a literal ]
C# code demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;

var txt = "[hello][qwqwe:]sdsdfsdf [note2]";
var res = Regex.Match(txt, @"^(\[[^]]*])+"); // Run the single search
Console.WriteLine(res.Value); // Display the match
var captures = res.Groups[1].Captures.Cast<Capture>().Select(p => p.Value).ToList();
Console.WriteLine(string.Join(", ", captures)); // Display captures
Approach 2: Matching multiple consecutive [...]s at string start separately
You can use the \G operator:
\G\[[^]]*]
See regex demo
It will match the [...] substrings at the start of the string and then after each successful match.
Regex explanation:
\G - a zero-width assertion (anchor) matching the location at the beginning of a string, or after each successful match
\[[^]]*] - a literal [ (\[) followed by zero or more (*) characters other than a ], followed by a closing ]
If you need to return a single string of all the [...]s found at the beginning of the string, you need to concatenate the matches:
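using System;
using System.Linq;
using System.Text.RegularExpressions;

var txt = "[hello][qwqwe:]sdsdfsdf [note2]";
var res = Regex.Matches(txt, @"\G\[[^]]*]").Cast<Match>().Select(p => p.Value).ToList();
Console.WriteLine(string.Join("", res)); // Concatenate the separate matches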
See IDEONE demo
You can use this regular expression:
^(\[(.*?)\])*
C# code for matching:
using System;
using System.Linq;
using System.Text.RegularExpressions;

var regex = new Regex(@"^(\[(.*?)\])*");
var inputTexts = new string[] { "[abcd]xyz[pqrst]", "abcd[xyz][pqr]", "[asdf][abcd][qwer]sds[qwert]" };
foreach (var match in inputTexts.Select(inputText => regex.Match(inputText)))
{
    Console.WriteLine(match.Value);
}
//result1 - [abcd]
//result2 -
//result3 - [asdf][abcd][qwer]
You can adjust four things in your original regex to make it work: 1) use the non-greedy match .*?, 2) add ^ to match from the beginning of the string, 3) escape the square brackets, and 4) change the final * to + to require at least one group of square brackets:
^(\[.*?\])+
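As a quick check, here is a minimal sketch that runs this pattern against the three strings from the question. The class name BracketPrefixDemo and the helper TryGetLeadingBrackets are made up for illustration and are not part of any library; only Regex.Match is the real .NET API.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketPrefixDemo
{
    // Hypothetical helper (not a library API): extracts the run of [...] groups
    // at the very start of the input. Returns false when there is no such run.
    public static bool TryGetLeadingBrackets(string input, out string prefix)
    {
        Match m = Regex.Match(input, @"^(\[.*?\])+");
        prefix = m.Success ? m.Value : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        string[] inputs =
        {
            "[hello][qwqwe:]sdsdfsdf [note2]",
            "[somethingelse]sdfsdf [note 1]",
            "aasdad[note 3]"
        };

        foreach (string input in inputs)
        {
            // Because the pattern ends in +, a string without a leading [...]
            // produces no match at all rather than an empty match.
            Console.WriteLine(TryGetLeadingBrackets(input, out string prefix)
                ? prefix
                : "(no match)");
        }
    }
}
```

With the question's strings this should print [hello][qwqwe:], then [somethingelse], then (no match).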
Try this; it works for me with your test strings.
^(\[[^\]]*(\]|\]\[))*
Explanation generated by https://regex101.com/ :
1st Capturing group (\[[^\]]*(\]|\]\[))*
    Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
    Note: A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
    \[ matches the character [ literally
    [^\]]* matches a single character not present in the list below
        Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        \] matches the character ] literally
    2nd Capturing group (\]|\]\[)
        1st Alternative: \]
            \] matches the character ] literally
        2nd Alternative: \]\[
            \] matches the character ] literally
            \[ matches the character [ literally
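Two details of this pattern are easy to miss in C#, so here is a hedged sketch of both. First, because the pattern ends in *, Regex.Match reports success with an empty value when the string has no leading [...], so the code checks the match length. Second, the "last iteration" note describes Group.Value, while .NET also keeps every iteration in Group.Captures. The class name RepeatedGroupDemo and the helper LeadingGroups are illustrative names, not library members.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RepeatedGroupDemo
{
    private static readonly Regex Pattern = new Regex(@"^(\[[^\]]*(\]|\]\[))*");

    // Hypothetical helper (not a library API): returns each leading [...] group
    // separately, or an empty array when the string does not start with one.
    public static string[] LeadingGroups(string input)
    {
        Match m = Pattern.Match(input);

        // The trailing * lets the pattern succeed on zero iterations, so
        // m.Success alone cannot tell "no leading brackets" apart from a hit.
        if (m.Length == 0)
        {
            return new string[0];
        }

        // Groups[1].Value would hold only the last iteration;
        // Groups[1].Captures holds all of them in order.
        return m.Groups[1].Captures.Cast<Capture>().Select(c => c.Value).ToArray();
    }

    public static void Main()
    {
        string[] inputs = { "[hello][qwqwe:]sdsdfsdf [note2]", "aasdad[note 3]" };
        foreach (string input in inputs)
        {
            string[] groups = LeadingGroups(input);
            Console.WriteLine(groups.Length == 0 ? "(no output)" : string.Join(", ", groups));
        }
    }
}
```

For the first input this should list [hello] and [qwqwe:] as separate captures; for the second it should report no output, which matches the requirement in the question.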