C# Find all substrings in a large string with a pattern

Suppose I have the following string:

jkasdlue as 12&sdaj__3982[source=saj_/29]sj*2&7^;'asj[source=-js/.2]_jsld+=[source=283]

I'd like to get the following array of strings as output:

{"saj_/29","-js/.2","283"}

Any help would be appreciated. Thanks.

UPDATE

Okay. Pardon me if my question is too broad or seems to show no effort on my part. I need to refine the pattern so that it accepts only alphanumeric characters, "-", "_", ".", "/", ":" and " ". Following a suggestion below, I am using a regex.

For now this regex seems to work:

\[source=[A-Za-z0-9-_ \\\/.:]+\]

The next step is to take a substring of each match to strip the opening tag "[source=" and the closing tag "]".

Is there a better way to shorten this process?

You just need \[source=([A-Za-z0-9-_ \\/.:]+)\] (if you do not need to match a backslash, remove \\ ) and then read the value, without the leading [source= and the trailing ], from match.Groups[1].Value .

var res = Regex.Matches(str, @"\[source=([A-Za-z0-9-_ \\/.:]+)\]").Cast<Match>().Select(match => match.Groups[1].Value).ToList();

See the C# demo:

using System;
using System.Linq;
using System.Text.RegularExpressions;
var str = "jkasdlue as 12&sdaj__3982[source=saj_/29]sj*2&7^;'asj[source=-js/.2]_jsld+=[source=283]";
var res = Regex.Matches(str, @"\[source=([A-Za-z0-9-_ \\/.:]+)\]").Cast<Match>().Select(match => match.Groups[1].Value).ToList();
Console.WriteLine(String.Join("\n", res));

Result:

saj_/29
-js/.2
283
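
If the extraction is needed in more than one place, the same pattern could be wrapped in a small helper. This is only a sketch: the class name SourceTagExtractor, the method name ExtractSources and the named group "value" are illustrative choices, not part of the answer above. The hyphen is moved to the end of the character class, which should match the same set of characters while making it obvious that it is a literal and not part of a range.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; the names here are assumptions made for illustration.
public static class SourceTagExtractor
{
    // Same character set as the pattern above, with the hyphen placed last
    // so that it is unambiguously a literal character.
    private static readonly Regex SourcePattern =
        new Regex(@"\[source=(?<value>[A-Za-z0-9_ \\/.:-]+)\]", RegexOptions.Compiled);

    // Returns the text between "[source=" and "]" for every tag found in the input.
    public static List<string> ExtractSources(string input)
    {
        return SourcePattern.Matches(input)
            .Cast<Match>()
            .Select(m => m.Groups["value"].Value)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var str = "jkasdlue as 12&sdaj__3982[source=saj_/29]sj*2&7^;'asj[source=-js/.2]_jsld+=[source=283]";
        foreach (var value in SourceTagExtractor.ExtractSources(str))
        {
            Console.WriteLine(value);
        }
    }
}
```

A tag whose content contains a character outside the allowed set, for example [source=a*b], is skipped entirely instead of being partially matched, because the closing ] must directly follow the allowed characters.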

Note that it is also possible to get the results using look-arounds, but as they are comparatively "expensive" and simply not necessary here, I would not advise using them. Here is the pattern from the regex demo:

(?<=\[source=)[A-Za-z0-9-_ \\/.:]+(?=\])
^^^lookbehind^                    ^^^^^^ - lookahead          

And in C#:

var res = Regex.Matches(str, @"(?<=\[source=)[A-Za-z0-9-_ \\/.:]+(?=\])").Cast<Match>().Select(match => match.Value).ToList();
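
As a quick sanity check, both patterns can be run against the sample string from the question to confirm that they return the same values. This is a minimal sketch; the class name LookaroundComparison and the method names WithGroup and WithLookarounds are made up for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical comparison harness; names are assumptions for illustration.
public static class LookaroundComparison
{
    // Capturing-group version: the tag is matched, the value is read from group 1.
    public static string[] WithGroup(string s)
    {
        return Regex.Matches(s, @"\[source=([A-Za-z0-9-_ \\/.:]+)\]")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToArray();
    }

    // Look-around version: the whole match is already the value.
    public static string[] WithLookarounds(string s)
    {
        return Regex.Matches(s, @"(?<=\[source=)[A-Za-z0-9-_ \\/.:]+(?=\])")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        var str = "jkasdlue as 12&sdaj__3982[source=saj_/29]sj*2&7^;'asj[source=-js/.2]_jsld+=[source=283]";
        var grouped = WithGroup(str);
        var lookarounds = WithLookarounds(str);

        // Compare the two result sequences element by element.
        Console.WriteLine(grouped.SequenceEqual(lookarounds)); // prints True
    }
}
```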
