
Regex to select a pattern of SubString

What is the syntax for finding and selecting part of a string with Regex in C#?

The string could be:

string tdInnerHtml = "<strong> You gained  230 Points </strong> "
                   + "there is going to be more text and some html code part of this "
                   + "string <a href=http://google.com>Google it here </a>";

// I want to extract 230 from this string using Regex.
// The digits (230) vary for each tdInnerHtml.
// So the code should look for digits, followed by a space, ending with Points

If the space and the </strong> tag are consistent, you can use the following to get the match. Because the pattern is anchored on the closing tag, it also works with strings like: "Points are between 230-240 Points and You gained 230 Points"

        var match = Regex.Match(tdInnerHtml, @"(?<pts>\d+) Points ?</strong>");
        if (match.Success) {
            int points = Convert.ToInt32(match.Groups["pts"].Value);
            Console.WriteLine("Points: {0}", points);
        }
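
A quick way to check that claim is to run the anchored pattern next to an unanchored one on the range string. This is only a sketch: the `AnchorDemo` class and `FirstPoints` helper are made-up names, and the sample string assumes the text is closed by a `</strong>` tag, which the anchored pattern requires.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // Hypothetical helper: returns the first number captured by the "pts" group,
    // or null when the pattern does not match at all.
    public static int? FirstPoints(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? int.Parse(match.Groups["pts"].Value) : (int?)null;
    }

    static void Main()
    {
        // Assumed sample: the range text from above, closed by </strong>.
        string html = "<strong> Points are between 230-240 Points and You gained 230 Points </strong>";

        // Unanchored: stops at the first "<digits> Points", which is the end of the range.
        Console.WriteLine(FirstPoints(html, @"(?<pts>\d+) Points"));

        // Anchored on the closing tag: only the number right before </strong> qualifies.
        Console.WriteLine(FirstPoints(html, @"(?<pts>\d+) Points ?</strong>"));
    }
}
```

With this input the unanchored pattern should report 240 and the anchored one 230, which is the difference the answer relies on.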

I think your regex pattern might be \b[0-9]+\b \bPoints\b.

You might test this at regexpal.
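
If an online tester isn't handy, the same check can be done in a few lines of C#. The sketch below assumes the pattern is written as a verbatim string, so each backslash is typed once, and it adds a capture group around the digits, which the pattern above does not have. `WordBoundaryDemo` and `ExtractPoints` are illustrative names only.

```csharp
using System;
using System.Text.RegularExpressions;

static class WordBoundaryDemo
{
    // Digits bounded by \b, one space, then the whole word "Points".
    // The parentheses are an addition so the digits can be read back on their own.
    const string Pattern = @"\b([0-9]+)\b \bPoints\b";

    // Returns the captured digits, or an empty string when there is no match.
    public static string ExtractPoints(string input)
    {
        Match match = Regex.Match(input, Pattern);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    static void Main()
    {
        Console.WriteLine(ExtractPoints("<strong> You gained  230 Points </strong>"));

        // \bPoints\b rejects longer words such as "Pointsman", so this yields "".
        Console.WriteLine(ExtractPoints("230 Pointsman").Length);
    }
}
```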

As long as you're only going for a set of numbers followed by the text Points, Regex can work:

Match match = Regex.Match(tdInnerHtml, @"(?<![\d-])(\d+) Points");
if (match.Success){
  // fetch result
  String pointsString = match.Groups[1].Value;

  // optional: parse to integer
  Int32 points;
  if (Int32.TryParse(pointsString, out points)){
    // you now have an integer value
  }
}

However, if this is in any way related to where the information resides on the page, the formatting it's surrounded by, or anything else HTML related, heed others' warnings and use an HTML parser.
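
To see what the negative lookbehind `(?<![\d-])` in the pattern above contributes, the sketch below runs it on text that contains a range such as `230-240 Points`. The sample text and the `LookbehindDemo` and `AllPoints` names are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LookbehindDemo
{
    // Collects every number captured by group 1 of the given pattern.
    public static List<int> AllPoints(string input, string pattern)
    {
        var results = new List<int>();
        foreach (Match match in Regex.Matches(input, pattern))
        {
            results.Add(int.Parse(match.Groups[1].Value));
        }
        return results;
    }

    static void Main()
    {
        // Assumed sample text containing both a range and a plain score.
        string text = "Points are between 230-240 Points and You gained 230 Points";

        // Without the lookbehind, the "240" inside the range is picked up as well.
        Console.WriteLine(string.Join(", ", AllPoints(text, @"(\d+) Points")));

        // With it, a number preceded by a digit or a hyphen is skipped.
        Console.WriteLine(string.Join(", ", AllPoints(text, @"(?<![\d-])(\d+) Points")));
    }
}
```

The lookbehind is what keeps the tail of a range from being mistaken for a score; it does nothing about where in the markup the number sits, which is why the parser advice still applies.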

The regex is very easy: \d+ Points. Here it is in C#, with a named group capture:

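        // Verbatim string (@"...") so that \d reaches the regex engine unchanged
        var match = Regex.Match(tdInnerHtml, @"(?<pts>\d+) Points");
        if (match.Success) {
            // Groups[...].Value is a string, so parse it rather than cast it
            int points = int.Parse(match.Groups["pts"].Value);
            // do something..
        }

string test = "<strong> You gained 230 Points </strong>";
string pattern = @"(\d+)\sPoints";
Regex regex = new Regex(pattern);
Match match = regex.Match(test);
// Group 1 holds the digits; fall back to an empty string when nothing matches
string result = match.Success ? match.Groups[1].Value : "";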
