
ASP.NET: parsing HTML to make it safe. Is this OK?

I'm sure this has been asked a number of times, but I'm having trouble finding something that matches what I want. I want to be able to safely render HTML in my web page but only allow links, <br/> and <p> tags.

I've come up with the following, but I want to make sure I've not missed anything. If there is a better way, please let me know.

Code:

    private string RemoveEvilTags(string value)
    {
        string[] allowed = { "<br/>", "<p>", "</p>", "</a>", "<a href" };
        string anchorPattern = @"<a[\s]+[^>]*?href[\s]?=[\s\""\']+(?<href>.*?)[\""\']+.*?>(?<fileName>[^<]+|.*?)?<\/a>";
        string safeText = value;

        System.Text.RegularExpressions.MatchCollection matches = Regex.Matches(value, anchorPattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Compiled);
        if (matches.Count > 0)
        {
            foreach (Match m in matches)
            {
                string url = m.Groups["href"].Value;
                string linkText = m.Groups["fileName"].Value;                    

                Uri testUri = null;
                if (Uri.TryCreate(url, UriKind.Absolute, out testUri) && testUri.AbsoluteUri.StartsWith("http"))
                {
                    safeText = safeText.Replace(m.Groups[0].Value, string.Format("<a href=\"{0}\" >{1}</a>", testUri.AbsoluteUri, linkText));
                }
                else
                {
                    safeText = safeText.Replace(m.Groups[0].Value, linkText);
                }
            }
        }

        // Strip every remaining tag unless it is on the allowed list or starts with "<a href".
        safeText = System.Text.RegularExpressions.Regex.Replace(safeText, @"<[a-zA-Z\/][^>]*>", m => m != null && allowed.Contains(m.Value) || m.Value.StartsWith("<a href") ? m.Value : String.Empty);

        // Return the filtered text.
        return safeText;
    }

Tests:

    [Test]
    public void EvilTagTest()
    {
        var safeText = RemoveEvilTags("this is a test <p>ok</p>");
        Assert.AreEqual("this is a test <p>ok</p>", safeText);

        safeText = RemoveEvilTags("this is a test <script>ok</script>");
        Assert.AreEqual("this is a test ok", safeText);

        safeText = RemoveEvilTags("this is a test <script><script>ok</script></script>");
        Assert.AreEqual("this is a test ok", safeText);

        // Check relative link
        safeText = RemoveEvilTags("this is a test <a href=\"bob\" >click here</a>");
        Assert.AreEqual("this is a test click here", safeText);

        //Check full link
        safeText = RemoveEvilTags("this is a test <a href=\"http://test.com/\" >click here</a>");
        Assert.AreEqual("this is a test <a href=\"http://test.com/\" >click here</a>", safeText);

        // Check full https link
        safeText = RemoveEvilTags("this is a test <a href=\"https://test.com/\" >click here</a>");
        Assert.AreEqual("this is a test <a href=\"https://test.com/\" >click here</a>", safeText);

        //javascript link
        safeText = RemoveEvilTags("this is a test <a href=\"javascript:evil()\" >click here</a>");
        Assert.AreEqual("this is a test click here", safeText);

        safeText = RemoveEvilTags("this is a test <a href=\"https://test.com/\" ><script>evil();</script>click here</a>");
        Assert.AreEqual("this is a test <a href=\"https://test.com/\" >click here</a>", safeText);
    }

All tests pass, but what have I missed?

Thank you.

As a matter of best practice, you should not write your own "RemoveEvilTags" library. There are plenty of techniques malicious users can use to perform an XSS attack. Microsoft already provides an Anti-XSS Library for ASP.NET:

http://msdn.microsoft.com/en-us/library/aa973813.aspx

Since you're using ASP.NET, Pluralsight has a good video on XSS. It is focused more on MVC, but it is still valid in this context.

http://www.pluralsight-training.net/microsoft/players/PSODPlayer?author=scott-allen&name=mvc3-building-security&mode=live&clip=0&course=aspdotnet-mvc3-intro
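To make the point about hand-rolled filters concrete, here is a small sketch that copies the two regular expressions from the question and feeds them two inputs the tests do not cover. The class name FilterBypassDemo and its helper methods are made up for this illustration; only the patterns and the allowed list come from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class; the patterns below are copied from the question.
public static class FilterBypassDemo
{
    const string AnchorPattern =
        @"<a[\s]+[^>]*?href[\s]?=[\s\""\']+(?<href>.*?)[\""\']+.*?>(?<fileName>[^<]+|.*?)?<\/a>";
    const string TagPattern = @"<[a-zA-Z\/][^>]*>";
    static readonly string[] Allowed = { "<br/>", "<p>", "</p>", "</a>", "<a href" };

    // True when the anchor pattern never matches, so the href is never validated.
    public static bool SkipsHrefValidation(string input) =>
        !Regex.IsMatch(input, AnchorPattern,
            RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Mirrors the final Regex.Replace step of RemoveEvilTags.
    public static string StripTags(string input) =>
        Regex.Replace(input, TagPattern, m =>
            (Allowed.Contains(m.Value) || m.Value.StartsWith("<a href", StringComparison.Ordinal))
                ? m.Value
                : string.Empty);

    public static void Main()
    {
        string[] inputs =
        {
            "<a href=javascript:evil()>click here</a>",  // unquoted attribute value
            "<a href=\"javascript:evil()\">click here",  // anchor that is never closed
        };

        foreach (string input in inputs)
        {
            Console.WriteLine("{0} | {1}", SkipsHrefValidation(input), StripTags(input));
        }
    }
}
```

Traced by hand, neither input matches the anchor pattern (the first has no quote or whitespace after the equals sign, the second has no closing tag), and the tag filter then keeps both opening tags because they start with "<a href". The javascript: links therefore come out unchanged. These are only two examples, not a complete list of gaps.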

Instead of writing such code, I would suggest using an HTML parser such as Html Agility Pack.

Your hand-written parsing code may run into a lot of unhandled corner cases; hopefully, a parser would handle most of them. Once the input is parsed, you can reject invalid input or allow only valid tags (as per your needs).
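A minimal sketch of that parse-then-whitelist idea might look like the following. It assumes the HtmlAgilityPack NuGet package is installed; the class name HtmlWhitelist, the tag lists, and the choice to drop script and style elements together with their content are assumptions made for this example, not something the question or the library prescribes. Treat it as a starting point to review, not as a vetted sanitizer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack" (assumed to be installed)

// Hypothetical helper: keeps <a>, <p>, <br> and unwraps or drops everything else.
public static class HtmlWhitelist
{
    private static readonly HashSet<string> AllowedTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "a", "p", "br" };

    // Assumption: content of these elements should never be shown as text.
    private static readonly HashSet<string> DroppedWithContent =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "script", "style" };

    public static string Sanitize(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html ?? string.Empty);
        Clean(doc.DocumentNode);
        return doc.DocumentNode.InnerHtml;
    }

    private static void Clean(HtmlNode parent)
    {
        // Copy the child list first because the loop removes nodes from it.
        foreach (HtmlNode node in parent.ChildNodes.ToList())
        {
            if (node.NodeType == HtmlNodeType.Comment) { node.Remove(); continue; }
            if (node.NodeType != HtmlNodeType.Element) continue; // keep text nodes

            if (DroppedWithContent.Contains(node.Name)) { node.Remove(); continue; }

            Clean(node); // sanitize descendants before deciding about this node

            string href = node.GetAttributeValue("href", string.Empty);
            node.Attributes.RemoveAll(); // drops onclick, style, and so on

            if (!AllowedTags.Contains(node.Name))
            {
                parent.RemoveChild(node, true); // unwrap: keep children, drop the tag
            }
            else if (node.Name == "a")
            {
                Uri uri;
                if (TryGetHttpUri(href, out uri))
                    node.SetAttributeValue("href", uri.AbsoluteUri);
                else
                    parent.RemoveChild(node, true); // unsafe link: keep only its text
            }
        }
    }

    // Accepts only absolute http/https URLs, after decoding HTML entities.
    private static bool TryGetHttpUri(string rawHref, out Uri uri)
    {
        string decoded = HtmlEntity.DeEntitize(rawHref);
        return Uri.TryCreate(decoded, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }
}
```

Because the decision is made on parsed nodes rather than on raw text, quoting style, attribute order, and missing closing tags no longer affect whether a link is checked. Note one behavioural difference from the question's tests: this sketch removes the text inside a script element instead of keeping it.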
