C# Parallel.ForEach() memory usage keeps growing

public string SavePath { get; set; } = @"I:\files\";

public void DownloadList(List<string> list)
{
    var rest = ExcludeDownloaded(list);
    var result = Parallel.ForEach(rest, link=>
    {
        Download(link);
    });
}

private void Download(string link)
{
    using(var net = new System.Net.WebClient())
    {
        var data = net.DownloadData(link);

        var fileName = code to generate unique fileName;
        if (File.Exists(fileName))
            return;

        File.WriteAllBytes(fileName, data);
    }
}

var downloader = new DownloaderService();
var links = downloader.GetLinks();
downloader.DownloadList(links);

I observed that the RAM usage of the project keeps growing.

I guess there is something wrong with the Parallel.ForEach(), but I cannot figure it out.

Is there a memory leak, or what is happening?


Update 1

After changing to the new code:

private void Download(string link)
{
    using(var net = new System.Net.WebClient())
    {
        var fileName = code to generate unique fileName;
        if (File.Exists(fileName))
            return;
        net.DownloadFile(link, fileName); // DownloadFile returns void, so there is nothing to assign
        Track theTrack = new Track(fileName);
        theTrack.Title = GetCDName();
        theTrack.Save();
    }
}

I still observed increasing memory use after running for 9 hours, although the usage now grows much more slowly.

Just wondering, is it because I didn't free the memory used by theTrack?

Btw, I use the ALT package to update file metadata; unfortunately, it doesn't implement the IDisposable interface.

Use WebClient.DownloadFile() to download directly to a file, so that you don't hold the whole file in memory.

The Parallel.ForEach method is intended for parallelizing CPU-bound workloads. Downloading a file is an I/O-bound workload, so Parallel.ForEach is not ideal for this case because it needlessly blocks ThreadPool threads. The correct way to do it is asynchronously, with async/await. The recommended class for making asynchronous web requests is HttpClient, and an excellent option for controlling the level of concurrency is the TPL Dataflow library. For this case it is enough to use the simplest component of this library, the ActionBlock class:

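using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // TPL Dataflow (ActionBlock)

async Task DownloadListAsync(List<string> list)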
{
    using (var httpClient = new HttpClient())
    {
        var rest = ExcludeDownloaded(list);
        var block = new ActionBlock<string>(async link =>
        {
            await DownloadFileAsync(httpClient, link);
        }, new ExecutionDataflowBlockOptions()
        {
            MaxDegreeOfParallelism = 10
        });
        foreach (var link in rest)
        {
            await block.SendAsync(link);
        }
        block.Complete();
        await block.Completion;
    }
}

async Task DownloadFileAsync(HttpClient httpClient, string link)
{
    var fileName = Guid.NewGuid().ToString(); // code to generate unique fileName;
    var filePath = Path.Combine(SavePath, fileName);
    if (File.Exists(filePath)) return;
    var response = await httpClient.GetAsync(link);
    response.EnsureSuccessStatusCode();
    using (var contentStream = await response.Content.ReadAsStreamAsync())
    using (var fileStream = new FileStream(filePath, FileMode.Create,
        FileAccess.Write, FileShare.None, 32768, FileOptions.Asynchronous))
    {
        await contentStream.CopyToAsync(fileStream);
    }
}

The code for downloading a file with HttpClient is not as simple as WebClient.DownloadFile(), but it's what you have to do in order to keep the whole process asynchronous (both reading from the web and writing to the disk).
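One detail that matters for memory use: when no HttpCompletionOption is given, GetAsync completes only after the whole response body has been buffered. The sketch below is a possible variant, not part of the original answer, that passes HttpCompletionOption.ResponseHeadersRead so the body is copied to disk as it arrives, and that also disposes the response. The class name StreamingDownload and the method name DownloadToFileAsync are hypothetical.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class StreamingDownload
{
    // Hypothetical helper (an assumption, not from the original answer):
    // streams the response body to a file instead of buffering it first.
    public static async Task DownloadToFileAsync(
        HttpClient httpClient, string link, string filePath)
    {
        // ResponseHeadersRead makes GetAsync complete once the headers are
        // available; the body is then read incrementally by CopyToAsync.
        using (var response = await httpClient.GetAsync(
            link, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (var contentStream = await response.Content.ReadAsStreamAsync())
            using (var fileStream = new FileStream(filePath, FileMode.Create,
                FileAccess.Write, FileShare.None, 32768, FileOptions.Asynchronous))
            {
                await contentStream.CopyToAsync(fileStream);
            }
        }
    }
}
```

Inside the ActionBlock delegate this would be awaited in place of DownloadFileAsync, with the file path built from SavePath as before.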


Caveat: Asynchronous filesystem operations are currently not implemented efficiently in .NET. For maximum efficiency it may be preferable to avoid using the FileOptions.Asynchronous option in the FileStream constructor.
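The caveat above could be applied roughly as follows. This is only a sketch: the SaveStreamAsync helper and its useAsyncFileIO flag are hypothetical names introduced for illustration. It opens the FileStream with FileOptions.None unless the flag is set, while the calling code still awaits CopyToAsync. Whether this is actually faster depends on the runtime version and the workload, so it is worth measuring both settings.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class FileWriting
{
    // Hypothetical helper (an assumption, not from the original answer).
    // useAsyncFileIO = false opens the file without FileOptions.Asynchronous.
    public static async Task SaveStreamAsync(
        Stream source, string filePath, bool useAsyncFileIO = false)
    {
        var options = useAsyncFileIO ? FileOptions.Asynchronous : FileOptions.None;
        using (var fileStream = new FileStream(filePath, FileMode.Create,
            FileAccess.Write, FileShare.None, 32768, options))
        {
            // The copy is awaited in both modes; only the way the file
            // handle is opened differs.
            await source.CopyToAsync(fileStream);
        }
    }
}
```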

