
List of files in all the directories using std::copy and async tasks in C++

This is a program that lists a directory tree using asynchronous tasks in C++.

My problem: in each function call, the variable 'vect' is created as a local variable, and each call holds the list of files for a single directory. Yet at the end, all the files in all the directories are returned to main. How is that possible?

I mean, how does 'vect', which is local to each function call, end up holding the file names of every directory, each produced by a separate function call? 'vect' acts as if it were a global variable. Is it because of std::copy? I don't understand it!

#include <algorithm>
#include <filesystem>
#include <future>
#include <iostream>
#include <iterator>
#include <vector>

typedef std::vector<std::filesystem::directory_entry> vectDirEntry;

vectDirEntry ListDirectory2(std::filesystem::directory_entry&& dirPath)
{
    std::vector<std::future<std::vector<std::filesystem::directory_entry>>> finalVect;
    vectDirEntry vect;

    for (const std::filesystem::directory_entry& entry : std::filesystem::directory_iterator(dirPath))
    {
        if (entry.is_directory())
        {
            std::future<vectDirEntry> fut = std::async(std::launch::async, &ListDirectory2, entry);
            finalVect.push_back(std::move(fut));
        }
        else if (entry.is_regular_file())
        {
            vect.push_back(entry);
        }
    }

    std::for_each(finalVect.begin(), finalVect.end(), [&](std::future<std::vector<std::filesystem::directory_entry>>& fut)
        {
            vectDirEntry lst = fut.get();
            std::copy(lst.begin(), lst.end(), std::back_inserter(vect));
        }
    );
    return vect;
}


int main()
{

    const std::filesystem::directory_entry root = std::filesystem::directory_entry("C:/Test");
    std::future<std::vector<std::filesystem::directory_entry>> fut = std::async(std::launch::async, &ListDirectory2, root);
    auto result = fut.get();

    for (std::filesystem::directory_entry& item : result)
    {

        std::cout << item << '\n';

    }
}

There is a separate vect for each recursive call. But you return it, and the future generated by std::async provides the vect from each call. When you do:

        vectDirEntry lst = fut.get();
        std::copy(lst.begin(), lst.end(), std::back_inserter(vect));

for each of the futures dispatched by std::async, you consume their vects to populate the parent's vect (which the parent in turn returns).
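As a minimal sketch of that hand-off, outside the directory-listing context: each asynchronous call below builds its own local vector, returns it by value, and the caller retrieves it through the future. The names make_values and gather are hypothetical and exist only for this illustration.

```cpp
#include <future>
#include <vector>

// Hypothetical worker: builds a vector that is local to this call
// and returns it by value.
std::vector<int> make_values(int start)
{
    std::vector<int> local{start, start + 1};
    return local;
}

// Hypothetical caller: launches two tasks, then merges what each returned.
std::vector<int> gather()
{
    std::future<std::vector<int>> f1 = std::async(std::launch::async, make_values, 1);
    std::future<std::vector<int>> f2 = std::async(std::launch::async, make_values, 10);

    // get() blocks until the task finishes and hands over its returned vector.
    std::vector<int> result = f1.get();
    std::vector<int> second = f2.get();
    result.insert(result.end(), second.begin(), second.end());
    return result;
}
```

Nothing is shared between the tasks; the only data that crosses from one call to another is the returned value.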

The lst in that code is the vect returned by one of your recursive calls. The vect in that std::copy is the vect of the current ListDirectory2 call, implicitly captured by reference (because you began the lambda definition with [&], any variable used inside the lambda that is not declared there is a reference to the variable in the enclosing scope).
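The effect of the [&] capture can be sketched in isolation. In the example below, merge_children is a hypothetical helper, not part of the question's code; it shows the lambda appending to the enclosing function's vector rather than to a private copy.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: appends every element of each child vector to `parent`
// and returns the combined vector.
std::vector<int> merge_children(std::vector<int> parent,
                                const std::vector<std::vector<int>>& children)
{
    // [&] captures `parent` by reference, so the std::copy inside the lambda
    // writes into this function's `parent`.
    std::for_each(children.begin(), children.end(),
        [&](const std::vector<int>& child)
        {
            std::copy(child.begin(), child.end(), std::back_inserter(parent));
        });
    return parent;
}
```

Had the lambda used [=] instead, parent would be captured as a const copy and the std::back_inserter call would not compile.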

There's nothing unusual here; each time, before returning, you explicitly copy from the sub-vects into the parent vect, eventually building up a final vect in the top-most ListDirectory2 call that contains the results of every recursive call.

As a side note, you're performing a number of copies that aren't strictly necessary. You could avoid at least some of them by replacing std::copy with std::move (in addition to the single-argument version that makes an r-value reference from an l-value, there's a three-argument version, equivalent to std::copy, that moves from the source; since lst is destroyed at the end of each lambda call, there's no harm in emptying it). A similar change could be made using the insert method of vect together with std::make_move_iterator (which might be slightly faster, because the vector can resize once up front for each bulk move), but simply swapping std::copy for std::move is the minimalist solution and should be fast enough.
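Both alternatives can be sketched as follows. The function names append_by_move and append_by_insert are hypothetical, and std::string stands in for std::filesystem::directory_entry to keep the example short.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Option 1: the three-argument std::move from <algorithm>.
// Same shape as std::copy, but each element is moved out of `src`.
void append_by_move(std::vector<std::string>& dst, std::vector<std::string>&& src)
{
    std::move(src.begin(), src.end(), std::back_inserter(dst));
}

// Option 2: vector::insert with move iterators.
// insert sees the whole range at once, so it can grow `dst` a single time.
void append_by_insert(std::vector<std::string>& dst, std::vector<std::string>&& src)
{
    dst.insert(dst.end(),
               std::make_move_iterator(src.begin()),
               std::make_move_iterator(src.end()));
}
```

In the question's lambda, the first option would amount to changing only the algorithm name on the std::copy line, with lst as the source.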

What you observe has nothing to do with async calls but is due to recursion.

Here's a flowchart describing it for 3 directory levels. Each vect is given a unique name here (and they are unique instances in the program).

ListDirectory2(dir)
vect <- file1.1   // put all files in dir in the local vect
        file1.2
dir1 ---------------> ListDirectory2(dir1) // call ListDirectory2 for each dir
                      vect1 <- file1.1 // put all files in dir1 in the local vect
                               file1.2
                      dir1.1 ---------------> ListDirectory2(dir1.1)
                                              ...
                      vect1 <- std::copy <--- return vect1.1
                      dir1.2 ---------------> ListDirectory2(dir1.2)
                                              ...
                      vect1 <- std::copy <--- return vect1.2
vect <- std::copy <-- return vect1

dir2 ---------------> ListDirectory2(dir2)
                      vect2 <- file2.1 // put all files in dir2 in the local vect
                               file2.2
                      dir2.1 ---------------> ListDirectory2(dir2.1)
                                              ...
                      vect2 <- std::copy <--- return vect2.1
                      dir2.2 ---------------> ListDirectory2(dir2.2)
                                              ...
                      vect2 <- std::copy <--- return vect2.2
vect <- std::copy <-- return vect2
return vect

When the call returns to main, vect will therefore be populated with all the files encountered from the starting directory on down.
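The same accumulation can be reproduced without any async calls. The sketch below assumes a hypothetical in-memory Dir type in place of the real filesystem, so that the recursion is the only thing on display; list_files is likewise a made-up name.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a directory:
// its own file names plus its subdirectories.
struct Dir
{
    std::vector<std::string> files;
    std::vector<Dir> subdirs;
};

std::vector<std::string> list_files(const Dir& dir)
{
    // A fresh vect for this call only, seeded with this directory's files.
    std::vector<std::string> vect = dir.files;

    for (const Dir& sub : dir.subdirs)
    {
        // The child call fills its own vect and returns it by value...
        std::vector<std::string> lst = list_files(sub);
        // ...and this call copies the child's result into its own vect.
        vect.insert(vect.end(), lst.begin(), lst.end());
    }
    return vect;
}
```

Each level returns its own files plus everything its children returned, so the top-level call ends up with the whole tree even though no vector is ever shared.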
