
How to use Html Agility Pack for HTML validations

I am using HTML Agility Pack to validate my HTML. This is the code I currently use:

using System.Collections.Generic;

// One parse error reported by HTML Agility Pack.
public class MarkupErrors
{
    public string ErrorCode { get; set; }
    public string ErrorReason { get; set; }
}

public static class MarkupValidator
{
    // Returns every parse error found in the markup; an empty list means none were reported.
    public static List<MarkupErrors> IsMarkupValid(string html)
    {
        var document = new HtmlAgilityPack.HtmlDocument();
        document.OptionFixNestedTags = true;
        document.LoadHtml(html);

        var parserErrors = new List<MarkupErrors>();
        foreach (var error in document.ParseErrors)
        {
            parserErrors.Add(new MarkupErrors
            {
                ErrorCode = error.Code.ToString(),
                ErrorReason = error.Reason
            });
        }

        return parserErrors;
    }
}

Say my input is the following:

<h1>Test</h1> 
Hello World</h2> 
<h3>Missing close h3 tag

My current function returns the following list of errors:

- Start tag <h2> was not found
- End tag </h3> was not found

which is fine...

My problem is that I want the entire HTML document to be valid, that is, to have proper <head> and <body> tags, because this HTML will later be available for preview and for download as .html files.

So I was wondering whether I could check for this using HTML Agility Pack?

Any ideas or other options will be appreciated. Thanks

You can check that there is a HEAD element and a BODY element directly under the HTML element, for example like this:

bool hasHead = doc.DocumentNode.SelectSingleNode("html/head") != null;
bool hasBody = doc.DocumentNode.SelectSingleNode("html/body") != null;

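Both expressions evaluate to false if there is no HTML element. hasHead is also false if there is no HEAD element directly under the HTML element, and hasBody is false if there is no BODY element directly under it.

Note that I don't use an XPath expression such as "//head", because it would return a result even if the HEAD element was not directly under the HTML element.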
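The checks above could be collected into one helper that reports each missing element, alongside the parse errors the question already gathers. This is only a minimal sketch: the class name DocumentStructureCheck, the method name FindMissingStructure, and the message strings are hypothetical and not part of HTML Agility Pack; only HtmlDocument, LoadHtml, DocumentNode and SelectSingleNode come from the library.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of HTML Agility Pack.
public static class DocumentStructureCheck
{
    // Returns one message per missing structural element; an empty list
    // means <html>, <html>/<head> and <html>/<body> were all found.
    public static List<string> FindMissingStructure(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var problems = new List<string>();

        // Relative paths are used on purpose: they only match direct
        // children, unlike "//head", which matches at any depth.
        if (doc.DocumentNode.SelectSingleNode("html") == null)
            problems.Add("Missing <html> element");
        if (doc.DocumentNode.SelectSingleNode("html/head") == null)
            problems.Add("Missing <head> element directly under <html>");
        if (doc.DocumentNode.SelectSingleNode("html/body") == null)
            problems.Add("Missing <body> element directly under <html>");

        return problems;
    }
}
```

Calling FindMissingStructure on the fragment from the question should report all three elements as missing, since the parser does not add them itself. The returned messages can be shown next to the ParseErrors list before the file is offered for preview or download.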
