Regex: extract a value from a string between delimiters
I have a large string and I need to extract a value from it. The value is located between the delimiters
category = '
and
';
This is my regex, but I need to avoid outputting the delimiters.
String productCategory = Regex.Match(html, @"category = '(.*?)';").Value;
This is an example: category = 'Video Cards';
and I need to extract Video Cards.
You can use the lookahead and lookbehind operators, so you end up with something like:
// Lazy .*? stops at the first '; after the opening delimiter.
string pattern = @"(?<=category = ').*?(?=';)";
string productCategory = Regex.Match(html, pattern).Value;
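One thing to watch with the lookaround approach is the quantifier between the two assertions. A minimal sketch of the difference, assuming the input can contain more than one '; on the same line (the class name, method names, and sample input are made up for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Greedy: .* runs as far as it can, so it ends at the LAST '; on the line.
    public static string Greedy(string input) =>
        Regex.Match(input, @"(?<=category = ').*(?=';)").Value;

    // Lazy: .*? stops at the FIRST '; after the opening delimiter.
    public static string Lazy(string input) =>
        Regex.Match(input, @"(?<=category = ').*?(?=';)").Value;
}
```

For the hypothetical input category = 'Video Cards'; brand = 'Acme'; the greedy version returns Video Cards'; brand = 'Acme while the lazy version returns Video Cards. If the value never contains a quote, [^']* between the assertions should work as well.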
It's also worth mentioning that parsing HTML with regexes is a bad idea. You should use an HTML parser to parse HTML.
Have you considered using the MatchObj.Groups property? If you test your current regex at a testing site like Derek Slager's, you'll notice that exactly what you want is the first capture group. You should simply be able to invoke that group and get what you need. Note that Groups[0] is the entire match, delimiters included, so the first capture group is Groups[1]. Here productCategory is the Match object returned by Regex.Match, not its Value string:
productCategory.Groups[1].Value
You want to extract the group:
String productCategory = Regex.Match(html, @"category = '(.*?)';").Groups[1].Value;
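To put the capture-group approach in context, here is a small self-contained sketch that also handles the case where the pattern is not found. The class name, method name, and null-on-failure behaviour are illustrative choices, not something from the original posts:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CategoryExtractor
{
    // Returns the text between "category = '" and "';", or null when nothing matches.
    public static string ExtractCategory(string html)
    {
        Match match = Regex.Match(html, @"category = '(.*?)';");

        // Groups[0] is the whole match (delimiters included);
        // Groups[1] is the first capture group, i.e. the value itself.
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Calling CategoryExtractor.ExtractCategory("category = 'Video Cards';") returns Video Cards. Checking match.Success first lets the caller tell a missing category apart from an empty one, since Groups[1].Value on a failed match is just an empty string.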