
How do I get the information inside a tag using C# and HtmlAgilityPack?

I want to get some information from inside a tag using C# and HtmlAgilityPack. (Example: given <a href="https://hashcode.co.kr" description="" ...>, I want to get the href value.)

How do I do it?

HtmlAgilityPack is a real HTML parser, so it copes with markup variations that simpler approaches miss. If you do this with a regular expression instead, it may trip over some of the oddities of HTML.
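Since the question asks about HtmlAgilityPack, here is a minimal sketch of the parser-based route. It assumes the HtmlAgilityPack NuGet package is installed; the class name HrefExtractor and the method GetFirstHref are made up for this illustration, and the XPath //a[@href] simply picks the first link that has an href attribute.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

static class HrefExtractor
{
    // Hypothetical helper: returns the href of the first <a> element
    // that has one, or null if the document contains no such element.
    public static string GetFirstHref(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath: any <a> element, anywhere in the document, with an href attribute.
        HtmlNode link = doc.DocumentNode.SelectSingleNode("//a[@href]");

        // The second argument is the default returned when the attribute is missing.
        return link?.GetAttributeValue("href", string.Empty);
    }
}

class Program
{
    static void Main()
    {
        string html = "An extraordinary day <a href=\"https://hashcode.co.kr\" description=\"example\">dawns</a> with each new day.";
        Console.WriteLine(HrefExtractor.GetFirstHref(html));
    }
}
```

To read every link rather than the first, doc.DocumentNode.SelectNodes("//a[@href]") can be looped over instead; it returns null when nothing matches, so check for that before iterating. A regular expression, by contrast, only sees raw text.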

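However, if you want to use a regular expression anyway, you can use the expression: href="([^"]*)" (the character class [^"]* stops at the first closing quote, whereas a greedy .* would run on to the last quote on the line).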

Notes:

  1. This won't work if there are spaces around the equals sign, i.e. href = "url"
  2. This won't work if single quotes are used, i.e. href='url'
  3. This won't work for a number of other possible HTML variations: no quotes, tabs rather than spaces, missing spaces, etc.

Here's a C# example:

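using System;
using System.Text.RegularExpressions;

class Program {
    static void Main(string[] args) {
        // Capture everything between the double quotes that follow href=.
        // [^""]* stops at the first closing quote, so later attributes are not swallowed.
        string pattern = @"href=""([^""]*)""";
        string input = "An extraordinary day <a href=\"https://hashcode.co.kr\" description=\"example\">dawns</a> with each new day.";
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        if (m.Success)
            // Groups[1] holds only the URL; m.Value would be the whole href="..." match.
            Console.WriteLine("Found '{0}' at position {1}.", m.Groups[1].Value, m.Groups[1].Index);
    }
}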
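The first two notes above (spaces around the equals sign, single quotes) can be covered by a slightly more tolerant pattern. The following is only a sketch: the names HrefRegexSketch and ExtractHref are invented for this example, unquoted values are still not handled, and the pattern will also match attributes whose names merely end in href.

```csharp
using System;
using System.Text.RegularExpressions;

static class HrefRegexSketch
{
    // \s* allows whitespace (spaces or tabs) on either side of '='.
    // The alternation accepts a "double-quoted" value (group 1)
    // or a 'single-quoted' value (group 2).
    private static readonly Regex HrefPattern = new Regex(
        @"href\s*=\s*(?:""([^""]*)""|'([^']*)')",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the first href value found, or null if none matches.
    public static string ExtractHref(string html)
    {
        Match m = HrefPattern.Match(html);
        if (!m.Success)
            return null;

        // Exactly one of the two groups takes part in a successful match.
        return m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(HrefRegexSketch.ExtractHref("<a href = 'https://hashcode.co.kr'>dawns</a>"));
        Console.WriteLine(HrefRegexSketch.ExtractHref("<a href=\"https://hashcode.co.kr\" description=\"example\">dawns</a>"));
    }
}
```

Each extra case makes the pattern harder to read, which is the practical argument for letting an HTML parser do this work.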

