
Wait for all threads to finish

I would like to process a filesystem folder, including its subdirectories and files, in C#. I'm using Tasks from the TPL. The idea is to do it recursively and create a task for every folder. The main thread should wait for the child threads to finish and then print some info; in fact, I just want to know when scanning is finished. I started with the thread pool, then switched to the TPL, and worked through some simple examples. After several attempts, going from simple code to increasingly bloated code, I'm stuck here:

private Logger log = LogManager.GetCurrentClassLogger();

public MediaObjectFolder MediaObjectFolder { get; set; }
private Queue<MediaObjectFolder> Queue { get; set; }

private object quelock, tasklock;
private List<Task> scanTasks;

public IsoTagger()
{
    quelock = new object();
    tasklock = new object();
    scanTasks = new List<Task>();

    MediaObjectFolder = new MediaObjectFolder(@"D:\Users\Roman\Music\Rock\temp");
    Queue = new Queue<MediaObjectFolder>();
}

public MediaObject RescanFile(string fullpath, string filename)
{
    return new MediaObject(fullpath);
}

public void Rescan()
{
    Queue.Clear();

    lock (tasklock)
    {
        Task scanFolderTask = Task.Factory.StartNew(ScanFolder, MediaObjectFolder);
        scanTasks.Add(scanFolderTask);
    }

    Task.Factory.ContinueWhenAll(scanTasks.ToArray(), (ant) =>
        {
            if (log != null)
            {
                log.Debug("scan finished");
                log.Debug("number of folders: {0}", Queue.Count);
            }

        });
}

private void ScanFolder(object o)
{
    List<Task> subTasks = new List<Task>();

    MediaObjectFolder mof = o as MediaObjectFolder;
    log.Debug("thread - " + mof.Folder);

    string[] subdirs = Directory.GetDirectories(mof.Folder);
    string[] files = Directory.GetFiles(mof.Folder, "*.mp3");


    foreach(string dir in subdirs)
    {
        log.Debug(dir);

        MediaObjectFolder tmp = new MediaObjectFolder(dir);
        lock (tasklock)
        {
            Task tmpTask = new Task(ScanFolder, tmp);
            subTasks.Add(tmpTask);
        }
    }

    foreach (Task tsk in subTasks)
    {
        tsk.Start();
    }

    foreach (string file in files)
    {
        log.Debug(file);

        MediaObject tmp = new MediaObject(file);
        MediaObjectFolder.MediaObjects.Add(tmp);
    }

    lock (quelock)
    {
        Queue.Enqueue(mof);
    }

    if (subTasks != null)
        Task.Factory.ContinueWhenAll(subTasks.ToArray(), logTask => log.Debug("thread release - " + mof.Folder));
}

The main thread still sometimes continues too early, before all the other threads have finished. (I'm relatively new to C# and not an expert in parallel programming either, so there may be some serious conceptual errors.)

The general approach that you're taking inherently makes this a fairly hard problem to solve. Instead, you can simply let the file system methods traverse the hierarchy for you, and then use PLINQ to process the resulting directories in parallel:

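// requires: using System.IO; using System.Linq;
// note: AllDirectories yields every subdirectory, but not path itself
var directories = Directory.EnumerateDirectories(path, "*"
    , SearchOption.AllDirectories);

var query = directories.AsParallel().Select(dir =>
{
    var files = Directory.EnumerateFiles(dir, "*.mp3"
        , SearchOption.TopDirectoryOnly);
    //TODO create custom object and add files
    // a Select lambda must return a value; this placeholder returns the folder with its files
    return new { Folder = dir, Files = files.ToList() };
});

// the query is lazy; enumerating it (e.g. ToList) runs it and blocks until it is done
var results = query.ToList();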
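To make the PLINQ idea above concrete, here is a minimal self-contained sketch. The names `PlinqScan` and `Scan` are made up for illustration, and plain strings stand in for the `MediaObjectFolder` and `MediaObject` types from the question, whose definitions are not shown. It assumes the root folder itself should be scanned too, so it is added by hand.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PlinqScan
{
    // Hypothetical helper: maps every directory under root (root included)
    // to the list of .mp3 files located directly inside it.
    public static Dictionary<string, List<string>> Scan(string root)
    {
        // SearchOption.AllDirectories does not yield the root itself, so prepend it.
        IEnumerable<string> directories = new[] { root }.Concat(
            Directory.EnumerateDirectories(root, "*", SearchOption.AllDirectories));

        // ToDictionary forces the lazy parallel query and returns only after
        // every directory has been processed, so no manual waiting is needed.
        return directories
            .AsParallel()
            .Select(dir => new
            {
                Folder = dir,
                Files = Directory.EnumerateFiles(dir, "*.mp3", SearchOption.TopDirectoryOnly).ToList()
            })
            .ToDictionary(x => x.Folder, x => x.Files);
    }
}
```

Because the call returns only once the query has been fully evaluated, the caller knows the scan is finished as soon as `PlinqScan.Scan(path)` returns.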

You'll want to research the Task.WaitAll and Task.WaitAny methods. There is example code here: msdn.microsoft.com

For the quick answer:

Task.WaitAll(subTasks.ToArray());

should work for you. (subTasks is a List<Task>, and Task.WaitAll expects a Task array, hence the ToArray call.)
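As a rough sketch of how Task.WaitAll could fit the recursive design from the question: each folder's task blocks until its own child tasks are done, so the root task completes only after the whole subtree has been scanned. The names `RecursiveScan`, `ScanFolder` and `Scan` are hypothetical, and a `ConcurrentQueue<string>` of folder paths stands in for the question's locked `Queue<MediaObjectFolder>`.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class RecursiveScan
{
    // Scans one folder and starts a task per subfolder. Returns only after
    // this folder and every folder below it has been recorded.
    public static void ScanFolder(string folder, ConcurrentQueue<string> visited)
    {
        var subTasks = new List<Task>();

        foreach (string dir in Directory.GetDirectories(folder))
        {
            string captured = dir; // copy so each lambda gets its own value
            subTasks.Add(Task.Factory.StartNew(() => ScanFolder(captured, visited)));
        }

        visited.Enqueue(folder); // thread-safe, no lock required

        // Block until all children (and, transitively, their children) are done.
        Task.WaitAll(subTasks.ToArray());
    }

    // Entry point: starts the root task and waits for the complete scan.
    public static ConcurrentQueue<string> Scan(string root)
    {
        var visited = new ConcurrentQueue<string>();
        Task rootTask = Task.Factory.StartNew(() => ScanFolder(root, visited));
        rootTask.Wait();
        return visited;
    }
}
```

In the question's code, ContinueWhenAll only schedules a continuation and returns immediately, so a parent task can complete while its children are still running. Blocking inside each task is simple but ties up a thread per waiting folder, so this is a sketch of the idea rather than a tuned implementation.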

After good suggestions from Servy and further research into parallelism in C#, I came up with an answer to my question. I don't really need LINQ for this simple task; I just want to enumerate my filesystem and process the folders in parallel.

public void Scan()
{
    // ...
    // enumerate all directories under one root folder (mof.Folder)
    var directories = Directory.EnumerateDirectories(mof.Folder, "*", SearchOption.AllDirectories);
    // use parallel foreach from TPL to process folders
    Parallel.ForEach(directories, ProcessFolder);
    // ...
}

private void ProcessFolder(string folder)
{
    if (!Directory.Exists(folder))
    {
        throw new ArgumentException("root folder does not exist!");
    }
    MediaObjectFolder mof = new MediaObjectFolder(folder);
    IEnumerable<string> files = Directory.EnumerateFiles(folder, "*.mp3", SearchOption.TopDirectoryOnly);
    foreach (string file in files)
    {
        MediaObject mo = new MediaObject(file);
        mof.MediaObjects.Add(mo);
    }
    lock (quelock)
    {
        // add object to global queue (the Queue property declared in the question)
        Queue.Enqueue(mof);
    }
}

After quite intensive research I found this to be the easiest solution. Please note: I haven't tested whether this approach is faster, as I work on a temporary file base that is not very big. This is also the way the MSDN library describes for processing the filesystem in parallel.

PS: there is also a lot of room for performance improvement.
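One possible variation on the Parallel.ForEach approach, shown here only as an untested-for-speed sketch: a `ConcurrentQueue<T>` can replace the `lock (quelock)` block, since its Enqueue method is thread-safe. The names `ParallelScan` and `Scan` are hypothetical, and key/value pairs of folder path and file list stand in for `MediaObjectFolder`.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelScan
{
    // Hypothetical helper: collects, for every subdirectory of root, the .mp3
    // files directly inside it. The root folder itself is not included, because
    // SearchOption.AllDirectories yields subdirectories only.
    public static ConcurrentQueue<KeyValuePair<string, List<string>>> Scan(string root)
    {
        var results = new ConcurrentQueue<KeyValuePair<string, List<string>>>();

        IEnumerable<string> directories =
            Directory.EnumerateDirectories(root, "*", SearchOption.AllDirectories);

        // Parallel.ForEach returns only after every folder has been processed.
        Parallel.ForEach(directories, folder =>
        {
            List<string> files = Directory
                .EnumerateFiles(folder, "*.mp3", SearchOption.TopDirectoryOnly)
                .ToList();

            // Thread-safe enqueue, so no explicit lock is needed.
            results.Enqueue(new KeyValuePair<string, List<string>>(folder, files));
        });

        return results;
    }
}
```

Parallel.ForEach blocks the calling thread until the loop has finished, which is what answers the original question: when `Scan` returns, the scan is complete.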

