Validate HTML5 in C#

We are currently building a greenfield app in C#. We have extensive UI tests which use Selenium WebDriver. These tests (as well as unit tests) are run by our CI server.

Selenium exposes a .PageSource property, and it makes sense (to me) to run that source through an HTML5 validator as another part of each UI test.

I want to pick up on the same sorts of things that http://validator.w3.org/ picks up on. As a bonus, I would also like to pick up on Section 508 accessibility issues.

My problem is that I can't find anything that does this locally and is easy to integrate into my UI tests. The W3C site exposes a SOAP API, but I don't want to hit their site as part of the CI process. They also don't appear to support getting SOAP responses back. I would like to avoid installing a full W3C server locally.

The closest thing that I see is http://www.totalvalidator.com/ ; using it would require writing temp files and parsing reports.

I thought I'd see if anyone knows of another way before I go down this track. Preferably a .NET assembly that I can call.

After spending an entire weekend on this problem, the only solution I can see is a commercial library called CSE HTML Validator.

It is located here: http://www.htmlvalidator.com/htmldownload.html

I wrote a simple wrapper for it. Here is the code:

using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

[assembly: CLSCompliant(true)]
namespace HtmlValidator
{

public class Validator
{
    #region Constructors...

    public Validator(string htmlToValidate)
    {
        HtmlToValidate = htmlToValidate;
        HasExecuted = false;
        Errors = new List<ValidationResult>();
        Warnings = new List<ValidationResult>();
        OtherMessages = new List<ValidationResult>();

    }

    #endregion



    #region Properties...
    public IList<ValidationResult> Errors { get; private set; }
    public bool HasExecuted { get; private set; }
    public string HtmlToValidate { get; private set; }
    public IList<ValidationResult> OtherMessages { get; private set; }
    public string ResultsString { get; private set; }
    public string TempFilePath { get; private set; }
    public IList<ValidationResult> Warnings { get; private set; }
    #endregion



    #region Public methods...
    public void ValidateHtmlFile()
    {

        WriteTempFile();

        ExecuteValidator();

        DeleteTempFile();

        ParseResults();

        HasExecuted = true;
    }

    #endregion



    #region Private methods...
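    private void DeleteTempFile()
    {
        // Delete the file that WriteTempFile created. Calling
        // Path.GetTempFileName() here would create (and then delete)
        // a new file and leave the original behind.
        File.Delete(TempFilePath);
    }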


    private void ExecuteValidator()
    {
        var psi = new ProcessStartInfo(GetHTMLValidatorPath())
        {
            RedirectStandardInput = false,
            RedirectStandardOutput = true,
            RedirectStandardError = false,
            UseShellExecute = false,
            Arguments = String.Format(@"-e,(stdout),0,16 ""{0}""", TempFilePath)
        };

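        // Dispose the process handle, and wait for the validator to exit
        // once its output has been read to the end.
        using (var p = new Process { StartInfo = psi })
        {
            p.Start();
            ResultsString = p.StandardOutput.ReadToEnd();
            p.WaitForExit();
        }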
    }


    private static string GetHTMLValidatorPath()
    {
        return @"C:\Program Files (x86)\HTMLValidator120\cmdlineprocessor.exe";
    }


    private void ParseResults()
    {
        var results = JsonConvert.DeserializeObject<dynamic>(ResultsString);
        IList<InternalValidationResult> messages = results.messages.ToObject<List<InternalValidationResult>>();


        foreach (InternalValidationResult internalValidationResult in messages)
        {
            ValidationResult result = new ValidationResult()
            {
                Message = internalValidationResult.message,
                LineNumber = internalValidationResult.linenumber,
                MessageCategory = internalValidationResult.messagecategory,
                MessageType = internalValidationResult.messagetype,
                CharLocation = internalValidationResult.charlocation
            };

            switch (internalValidationResult.messagetype)
            {
                case "ERROR":
                    Errors.Add(result);
                    break;

                case "WARNING":
                    Warnings.Add(result);
                    break;

                default:
                    OtherMessages.Add(result);
                    break;
            }
        }
    }


    private void WriteTempFile()
    {
        TempFilePath = Path.GetTempFileName();
        StreamWriter streamWriter = File.AppendText(TempFilePath);
        streamWriter.WriteLine(HtmlToValidate);
        streamWriter.Flush();
        streamWriter.Close();
    }
    #endregion
}
}




public class ValidationResult
{
    public string MessageType { get; set; }
    public string MessageCategory { get; set; }
    public string Message { get; set; }
    public int LineNumber { get; set; }
    public int CharLocation { get; set; }


    public override string ToString()
    {
        return String.Format("{0} Line {1} Char {2}:: {3}", this.MessageType, this.LineNumber, this.CharLocation, this.Message);

    }

}


public class InternalValidationResult
{
    /*
     * DA: this class is used as an intermediate store of messages that come back from the underlying validator. The properties must be cased as per the underlying JSON object.
     * That is why the naming warnings are suppressed.
     */
    #region Properties...
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1709:IdentifiersShouldBeCasedCorrectly", MessageId = "charlocation"), System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1704:IdentifiersShouldBeSpelledCorrectly", MessageId = "charlocation")]
    public int charlocation { get; set; }
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1709:IdentifiersShouldBeCasedCorrectly", MessageId = "linenumber"), System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1704:IdentifiersShouldBeSpelledCorrectly", MessageId = "linenumber")]

    public int linenumber { get; set; }
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1709:IdentifiersShouldBeCasedCorrectly", MessageId = "message"), System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1704:IdentifiersShouldBeSpelledCorrectly", MessageId = "message")]

    public string message { get; set; }
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1704:IdentifiersShouldBeSpelledCorrectly", MessageId = "messagecategory"), System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1709:IdentifiersShouldBeCasedCorrectly", MessageId = "messagecategory")]
    public string messagecategory { get; set; }
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1709:IdentifiersShouldBeCasedCorrectly", MessageId = "messagetype"), System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1704:IdentifiersShouldBeSpelledCorrectly", MessageId = "messagetype")]

    public string messagetype { get; set; }
    #endregion
}

Usage/Testing

   private const string ValidHtml = "<!DOCType html><html><head></head><body><p>Hello World</p></body></html>";
    private const string BrokenHtml = "<!DOCType html><html><head></head><body><p>Hello World</p></body>";

    [TestMethod]
    public void CanValidHtmlStringReturnNoErrors()
    {
        Validator subject = new Validator(ValidHtml);
        subject.ValidateHtmlFile();
        Assert.IsTrue(subject.HasExecuted);
        Assert.IsTrue(subject.Errors.Count == 0);
    }


    [TestMethod]
    public void CanInvalidHtmlStringReturnErrors()
    {
        Validator subject = new Validator(BrokenHtml);
        subject.ValidateHtmlFile();
        Assert.IsTrue(subject.HasExecuted);
        Assert.IsTrue(subject.Errors.Count > 0);
        Assert.IsTrue(subject.Errors[0].ToString().Contains("ERROR"));
    }
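
To tie this back to the Selenium tests in the question, something like the following could turn the validator's errors into a single assertion message. This is only a sketch: `PageSourceChecks` and `DescribeErrors` are hypothetical names, not part of the wrapper above, and the helper takes plain strings so it does not depend on Selenium or on CSE HTML Validator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the wrapper above).
public static class PageSourceChecks
{
    // Builds one failure message from a list of validator errors.
    // Returns null when there are no errors, so a test can assert on null.
    public static string DescribeErrors(string url, IEnumerable<string> errors)
    {
        List<string> list = errors.ToList();
        if (list.Count == 0)
        {
            return null;
        }

        return string.Format(
            "{0} HTML error(s) on {1}:{2}{3}",
            list.Count,
            url,
            Environment.NewLine,
            string.Join(Environment.NewLine, list));
    }
}
```

Inside a UI test, assuming `driver` is the Selenium `IWebDriver` already in use, the call might look like `var v = new Validator(driver.PageSource); v.ValidateHtmlFile();` followed by `var failure = PageSourceChecks.DescribeErrors(driver.Url, v.Errors.Select(e => e.ToString())); Assert.IsNull(failure, failure);`.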

It looks like this link may have what you want: Automated W3C Validation

You can download a markup validator from the accepted answer there and pass your HTML to it. Sorry, they're not .NET assemblies :/, but you could wrap one in a DLL if you really wanted to.

Also, one of the answers on this question suggests that the W3C service actually exposes a RESTful API that can return a SOAP response: How might I use the W3C Markup Validator API in my .NET application?
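
If an HTTP API is acceptable but hitting the public W3C site is not, one possibility is to run your own copy of the Nu Html Checker and post the page source to it. The sketch below assumes such an instance is listening at `http://localhost:8888/` (an illustrative address) and that it returns the checker's JSON format when called with `out=json`. `NuCheckerClient` and its method names are made up for this example, and Newtonsoft.Json is used only because the wrapper above already depends on it.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

// Hypothetical client for a locally hosted checker instance.
public static class NuCheckerClient
{
    // Assumption: address of your own checker instance, not a public endpoint.
    private const string CheckerUrl = "http://localhost:8888/?out=json";
    private static readonly HttpClient Http = new HttpClient();

    // Posts the document as text/html and returns the reported errors.
    public static async Task<IList<string>> GetErrorsAsync(string html)
    {
        using (var content = new StringContent(html, Encoding.UTF8, "text/html"))
        using (var response = await Http.PostAsync(CheckerUrl, content))
        {
            response.EnsureSuccessStatusCode();
            return ExtractErrors(await response.Content.ReadAsStringAsync());
        }
    }

    // Keeps only the messages whose "type" is "error".
    public static IList<string> ExtractErrors(string json)
    {
        JToken messages = JObject.Parse(json)["messages"] ?? new JArray();
        return messages
            .Where(m => (string)m["type"] == "error")
            .Select(m => string.Format("Line {0}: {1}", (int?)m["lastLine"], (string)m["message"]))
            .ToList();
    }
}
```

Keeping the JSON parsing in `ExtractErrors` separate from the HTTP call means the parsing can be unit tested without a running checker.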

The best HTML5 validator, the Nu checker, is written in Java and hard to interface with from .NET. But libtidy can be wrapped in a C++/CLI DLL and called from managed code. The sample program they've posted did a good job for me, with a little adapting.

LibTidy.h:

public ref class LibTidy
{
public:
    System::String^ __clrcall Test(System::String^ input);
};

LibTidy.cpp:

System::String^ __clrcall LibTidy::Test(System::String^ input)
{
    CStringW cstring(input);
  
    const size_t newsizew = (cstring.GetLength() + 1) * 2;
    char* nstringw = new char[newsizew];
    size_t convertedCharsw = 0;
    wcstombs_s(&convertedCharsw, nstringw, newsizew, cstring, _TRUNCATE);

        TidyBuffer errbuf = { 0 };
        int rc = -1;
        Bool ok;

        TidyDoc tdoc = tidyCreate();                     // Initialize "document"
                
        ok = tidyOptSetBool(tdoc, TidyShowInfo, no);
        ok = tidyOptSetBool(tdoc, TidyQuiet, yes);
        ok = tidyOptSetBool(tdoc, TidyEmacs, yes);
        if (ok)
            rc = tidySetErrorBuffer(tdoc, &errbuf);      // Capture diagnostics
        if (rc >= 0)
            rc = tidyParseString(tdoc, nstringw);           // Parse the input
        if (rc >= 0)
            rc = tidyCleanAndRepair(tdoc);               // Tidy it up!
        if (rc >= 0)
            rc = tidyRunDiagnostics(tdoc);               // Kvetch
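        // Copy the diagnostics into a managed string BEFORE freeing the
        // buffer; reading errbuf.bp after tidyBufFree is a use-after-free.
        System::String^ output = errbuf.bp != NULL
            ? gcnew System::String((char*)errbuf.bp)
            : System::String::Empty;

        if (errbuf.allocator != NULL) tidyBufFree(&errbuf);
        tidyRelease(tdoc);
        delete[] nstringw;                               // Release the converted input

        return output;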
    }
