Inserting Custom Element with AngleSharp

I'm trying to update a site that uses a sanitizer based on AngleSharp to process user-generated HTML content. The site's users need to be able to embed iframes, and I am trying to use a whitelist to control which domains a frame can load. I'd like to rewrite the blocked iframes to a new custom element, "blocked-iframe", which will then be stripped out by the sanitizer, so we can review whether other domains need to be added to the whitelist.

I'm trying to use a solution based on this answer: https://stackoverflow.com/a/55276825/794

It looks like this:

    string BlockIFrames(string content)
    {
        var parser = new HtmlParser(new HtmlParserOptions { });

        var doc = parser.Parse(content);

        foreach (var element in doc.QuerySelectorAll("iframe"))
        {
            var src = element.GetAttribute("src");

            if (string.IsNullOrEmpty(src) || !Settings.Sanitization.IFrameWhitelist.Any(wls => src.StartsWith(wls)))
            {
                var newElement = doc.CreateElement("blocked-iframe");
                foreach (var attr in element.Attributes)
                {
                    newElement.SetAttribute(attr.Name, attr.Value);
                }

                element.Insert(AdjacentPosition.BeforeBegin, newElement.OuterHtml);

                element.Remove();
            }
        }

        return doc.FirstElementChild.OuterHtml;
    }

It ostensibly works, but I notice that the angle brackets in the new element's tag are escaped on insertion, so the result just gets written into the page as text. I think I could build a map of replacements and run them against the string before sending it back, but I'm wondering if there's a way to do it using AngleSharp's API. The site currently uses 0.9.9, and I'm not sure how far ahead we'll be able to update given some of the other dependencies in play.

Digging around in the source, I found the ReplaceChild method in INode, which works when called on the parent of element:

    string BlockIFrames(string content)
    {
        var parser = new HtmlParser(new HtmlParserOptions { });

        var doc = parser.Parse(content);

        foreach (var element in doc.QuerySelectorAll("iframe"))
        {
            var src = element.GetAttribute("src");

            if (string.IsNullOrEmpty(src) ||
                !Settings.Sanitization.IFrameWhitelist.Any(wls => src.StartsWith(wls)))
            {
                var newElement = doc.CreateElement("blocked-iframe");
                foreach (var attr in element.Attributes)
                {
                    newElement.SetAttribute(attr.Name, attr.Value);
                }

                element.Parent.ReplaceChild(newElement, element);
            }
        }

        return doc.FirstElementChild.OuterHtml;
    }
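As far as I can tell, the same swap can also be written without going through the parent, because elements implement IChildNode, which has a Replace method. The following is only a sketch under a few assumptions: it targets the 0.9.x namespaces, the `IFrameBlocker` class name is made up, and the whitelist is passed in as a parameter instead of being read from Settings.Sanitization.IFrameWhitelist.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using AngleSharp.Dom;          // IElement, IChildNode.Replace
using AngleSharp.Parser.Html;  // HtmlParser (namespace as of 0.9.x)

public static class IFrameBlocker
{
    // Hypothetical variant of BlockIFrames: the whitelist is a parameter
    // rather than a value read from the site's settings.
    public static string BlockIFrames(string content, IEnumerable<string> whitelist)
    {
        var parser = new HtmlParser();
        var doc = parser.Parse(content);

        // Snapshot the matches so the DOM can be changed while iterating.
        foreach (var element in doc.QuerySelectorAll("iframe").ToList())
        {
            var src = element.GetAttribute("src");

            if (!string.IsNullOrEmpty(src) &&
                whitelist.Any(wls => src.StartsWith(wls, StringComparison.Ordinal)))
            {
                continue; // whitelisted, leave the iframe alone
            }

            var newElement = doc.CreateElement("blocked-iframe");
            foreach (var attr in element.Attributes)
            {
                newElement.SetAttribute(attr.Name, attr.Value);
            }

            // Swap the node itself; no HTML string is passed around here.
            element.Replace(newElement);
        }

        return doc.FirstElementChild.OuterHtml;
    }
}
```

Calling `IFrameBlocker.BlockIFrames(html, new[] { "https://www.youtube.com/" })` should behave like the ReplaceChild version above; the only difference is which node the call is made on.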

I will keep testing, but this seems decent enough to me. If there is a better way, I'd love to hear it.
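One thing worth checking while testing is the whitelist comparison itself. A plain StartsWith on the raw src is a prefix match, so an entry like "https://example.com" would also accept "https://example.com.evil.test/". Below is a minimal sketch of a host-based check using System.Uri. The `IFrameSourceCheck` class, the `IsAllowed` method, and the idea of storing bare host names in the whitelist are all assumptions for illustration, not how the site is set up today.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IFrameSourceCheck
{
    // Hypothetical helper: true only when src is an absolute https URL whose
    // host equals an allowed host or is a subdomain of one.
    public static bool IsAllowed(string src, IEnumerable<string> allowedHosts)
    {
        Uri uri;
        if (string.IsNullOrEmpty(src) ||
            !Uri.TryCreate(src, UriKind.Absolute, out uri))
        {
            return false; // missing, relative, or unparsable src
        }

        if (uri.Scheme != Uri.UriSchemeHttps)
        {
            return false; // rejects http:, javascript:, data:, file:, ...
        }

        // Compare whole host names instead of string prefixes.
        return allowedHosts.Any(host =>
            uri.Host.Equals(host, StringComparison.OrdinalIgnoreCase) ||
            uri.Host.EndsWith("." + host, StringComparison.OrdinalIgnoreCase));
    }
}
```

With a whitelist of `new[] { "example.com" }`, this accepts `https://example.com/embed/1` and `https://player.example.com/v`, and rejects `https://example.com.evil.test/` as well as an empty src. Whether subdomains should be accepted at all is a policy choice; dropping the EndsWith clause makes the match exact.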
