DOM Parser Chrome extension memory leak

The problem

I have developed an extension that intercepts web requests, gets the HTML of the page each request originated from, and processes it. I used DOMParser to parse the HTML and realised that it is causing a massive memory leak, which eventually crashes the Chrome extension.

This is the code that causes the issue: https://gist.github.com/uche1/20929b6ece7d647250828c63e4a2ffd4

What I've tried

Dev Tools recorded performance

I recorded the Chrome extension while it was intercepting requests and noticed that each call to DOMParser.parseFromString created more nodes and documents, none of which were destroyed.

Dev tools screenshot: https://i.imgur.com/pMY50kR.png

Task Manager memory footprint

I looked at Chrome's task manager and saw that the extension had a huge memory footprint that did not decrease over time, even though garbage collection should kick in after a while. When the memory footprint gets too large, the extension crashes.

Task manager memory footprint screenshot: https://i.imgur.com/c8fLWCy.png

Heap snapshots

I took before and after snapshots of the heap, and the issue seems to originate from HTMLDocument objects that are allocated but never garbage collected.

Snapshot (before): https://i.imgur.com/Rg2CRi6.png

Snapshot (after): https://i.imgur.com/UQgLuT1.png

Expected outcome

I would like to understand why DOMParser causes such memory issues, why the parsed documents are not cleaned up by the garbage collector, and what I can do to resolve it.

Thanks

You are basically replicating the entire DOM in memory and then never releasing it.

We get away with this in a client-side app because when we navigate away, the memory used by the scripts on that page is recovered.

In a background script that doesn't happen, so releasing the memory is now your responsibility.

So set both parser and document to null when you are done using them.

chrome.webRequest.onCompleted.addListener(async request => {
    if (request.tabId !== -1) {
        let html = await getHtmlForTab(request.tabId);
        let parser = new DOMParser();
        let document = parser.parseFromString(html, "text/html");
        let title = document.querySelector("title").textContent;
        console.log(title);
        parser = null; // <----- DO THIS
        document = null; // <----- DO THIS
    }
}, requestFilter);
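
The same idea can be expressed by confining the parser and the parsed document to a small helper that returns only a string, so no reference to the document outlives the call. This is a minimal sketch, not the code from the gist: the helper name `extractTitle` is hypothetical, and the `ParserCtor` parameter is an assumption added only so the function can be exercised outside a browser. Whether this alone stops the growth in a Chromium background page is not established in this thread.

```javascript
// Hypothetical helper: parse, extract, and return only a primitive string.
// ParserCtor defaults to the browser's DOMParser; it is a parameter purely
// so a stand-in can be supplied where no DOM is available.
function extractTitle(html, ParserCtor = globalThis.DOMParser) {
    const parser = new ParserCtor();
    const doc = parser.parseFromString(html, "text/html");
    const titleEl = doc.querySelector("title");
    // Pages without a <title> would make the original one-liner throw.
    return titleEl ? titleEl.textContent : null;
}
```

Inside the listener this would be called as `console.log(extractTitle(await getHtmlForTab(request.tabId)));`, leaving no `parser` or `document` variables in the listener's own scope.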

I have resolved the problem. It seems that the DOMParser class, for some reason, kept references to the HTML documents it parsed in memory and never released them. Because my extension is a Chrome extension that runs in the background, the problem is amplified.

The solution was to parse the HTML document another way, using a template element:

let parseHtml = (html) => {
    let template = document.createElement('template');
    template.innerHTML = html;
    return template; 
}

This helped resolve the issue.
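
For anyone adapting this: the parsed nodes live in the template's `content` fragment, so queries go through `template.content` rather than the template element itself. Below is a minimal sketch of reading the title that way. The function name `titleFromHtml` and the injectable `doc` parameter are assumptions for illustration, and the approach relies on a `document` being available, as it is in a background page.

```javascript
// parseHtml as in the answer above, with the owning document made injectable
// (an assumption, so the code can be exercised without a real DOM).
function parseHtml(html, doc = globalThis.document) {
    const template = doc.createElement("template");
    template.innerHTML = html;
    return template;
}

// Hypothetical helper: query the template's content fragment for the title.
function titleFromHtml(html, doc = globalThis.document) {
    const template = parseHtml(html, doc);
    const titleEl = template.content.querySelector("title");
    return titleEl ? titleEl.textContent : null;
}
```

As far as I understand the HTML parsing rules, markup parsed inside a template loses its `<html>`, `<head>`, and `<body>` wrapper elements, so selectors that depend on those (for example `body > div`) will not match, while `title` and other ordinary elements are still found.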

I cannot point to a confirmed bug report in Chromium, but we were also hit by the memory leak. If you are developing an extension, DOMParser will leak in background scripts on Chromium-based browsers, but not on Firefox.

None of the workarounds mentioned here solved the leak for us, so we ended up replacing the native DOMParser with the linkedom library, which provides a drop-in replacement and works in the browser (not only in Node.js). It solves the leak, so you might consider it, but there are some aspects you need to be aware of:

  • It will not leak, but its initial memory footprint is higher than that of the native parser.
  • Performance is most likely slower (but I have not benchmarked it).
  • The DOM generated by its HTML parser might differ slightly from what Firefox or Chrome would produce. The effect is most visible with broken HTML, which the browsers attempt to error-correct.

We tried jsdom first, which aims to be more compatible with the major browsers at the cost of a more complex codebase. Unfortunately, we found it difficult to make jsdom work in the browser (although it works well on Node.js).
