List of files in all the directories using std::copy and async tasks in C++
This is a program that lists a directory tree using asynchronous tasks in C++.
My problem is that 'vect' is created as a local variable in each function call, so each call only holds the list of files in one directory, yet at the end all the files in all the directories are returned to main. How is that possible?
I mean, how come 'vect', which is local to each function call, keeps the file names of every directory when those were generated by separate function calls? This 'vect' acts as if it were a global variable. Is it because of "std::copy"? I don't understand it!
#include <algorithm>
#include <filesystem>
#include <future>
#include <iostream>
#include <iterator>   // std::back_inserter
#include <vector>

typedef std::vector<std::filesystem::directory_entry> vectDirEntry;

vectDirEntry ListDirectory2(std::filesystem::directory_entry&& dirPath)
{
    std::vector<std::future<std::vector<std::filesystem::directory_entry>>> finalVect;
    vectDirEntry vect;   // local to this call

    for (const std::filesystem::directory_entry& entry : std::filesystem::directory_iterator(dirPath))
    {
        if (entry.is_directory())
        {
            // One asynchronous task per subdirectory
            std::future<vectDirEntry> fut = std::async(std::launch::async, &ListDirectory2, entry);
            finalVect.push_back(std::move(fut));
        }
        else if (entry.is_regular_file())
        {
            vect.push_back(entry);
        }
    }

    // Append the result of every subdirectory task to this call's vect
    std::for_each(finalVect.begin(), finalVect.end(), [&](std::future<std::vector<std::filesystem::directory_entry>>& fut)
        {
            vectDirEntry lst = fut.get();
            std::copy(lst.begin(), lst.end(), std::back_inserter(vect));
        }
    );
    return vect;
}

int main()
{
    const std::filesystem::directory_entry root = std::filesystem::directory_entry("C:/Test");
    std::future<std::vector<std::filesystem::directory_entry>> fut = std::async(std::launch::async, &ListDirectory2, root);
    auto result = fut.get();
    for (std::filesystem::directory_entry& item : result)
    {
        std::cout << item << '\n';
    }
}
There is a separate vect for each recursive call. But you return it, and the future generated from std::async provides the vect from each call. When you do:
vectDirEntry lst = fut.get();
std::copy(lst.begin(), lst.end(), std::back_inserter(vect));
for each of the std::async dispatched futures, you consume their vects to populate the parent's vect (which it in turn returns).
The lst in that code is the vect returned by one of your recursive calls. The vect in that std::copy is the vect from the current ListDirectory2 call, implicitly captured by reference (because you began the lambda definition with [&], which means any variable used in the lambda that is not declared within it implicitly refers to the variable in the enclosing scope).
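To make the [&] capture concrete, here is a minimal sketch using plain ints instead of directory entries. The function name Flatten and its parameter are hypothetical, chosen only for illustration; the pattern inside the lambda is the same one used in ListDirectory2.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: appends every inner vector to one local vector.
std::vector<int> Flatten(const std::vector<std::vector<int>>& parts)
{
    std::vector<int> vect;   // local to this call of Flatten

    std::for_each(parts.begin(), parts.end(), [&](const std::vector<int>& lst) {
        // `vect` is not declared inside the lambda, so [&] makes it refer to
        // the enclosing function's `vect`; the copy modifies that object directly.
        std::copy(lst.begin(), lst.end(), std::back_inserter(vect));
    });

    return vect;
}
```

Had the lambda started with [=] instead, vect would be a const copy inside the lambda and the std::copy line would not compile.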
There's nothing unusual here; you explicitly copied from the sub-vects into the parent vect before returning each time, eventually building up a final vect in the top-most ListDirectory2 call that contains the results from every recursive call.
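The hand-off from a task's local vect to its caller can be seen without any recursion or filesystem access. The sketch below is only an illustration: MakeRange and MergeRanges are hypothetical names, and ints stand in for directory entries.

```cpp
#include <algorithm>
#include <future>
#include <iterator>
#include <vector>

// Hypothetical task: fills a vector that is local to this call and returns it.
std::vector<int> MakeRange(int first, int count)
{
    std::vector<int> vect;
    for (int i = 0; i < count; ++i)
        vect.push_back(first + i);
    return vect;   // the value leaves the call through the future
}

std::vector<int> MergeRanges()
{
    std::vector<std::future<std::vector<int>>> pending;
    pending.push_back(std::async(std::launch::async, &MakeRange, 0, 3));
    pending.push_back(std::async(std::launch::async, &MakeRange, 10, 2));

    std::vector<int> vect;   // a different object from the `vect` in each task
    for (std::future<std::vector<int>>& fut : pending) {
        std::vector<int> lst = fut.get();   // the task's returned vector
        std::copy(lst.begin(), lst.end(), std::back_inserter(vect));
    }
    return vect;
}
```

Each task's vect is destroyed when the task finishes, but its contents have already been handed to the future, and get() passes them on to the caller, which copies them into its own vect.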
As a side note, you're performing a number of copies that aren't strictly necessary. You could avoid at least some of them by replacing your use of std::copy with std::move (in addition to the single-argument version that makes an r-value reference from an l-value, there's a three-argument version equivalent to std::copy that moves from the source; since lst is destroyed at the end of each call of the lambda, there's no harm in emptying it). A similar change could be made using the insert method of vect and std::make_move_iterator (and might be slightly faster by allowing the vector to resize in bulk up-front for each bulk move), but the simple swap from std::copy to std::move is the minimalist solution and it should be fast enough.
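Both alternatives might look roughly like the sketch below. The helper names AppendByMove and AppendByInsert are hypothetical, and std::string elements stand in for directory entries so that the snippet does not depend on the filesystem.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Variant 1: the three-argument std::move from <algorithm>, a drop-in
// replacement for std::copy that moves each element out of the source.
void AppendByMove(std::vector<std::string>& dst, std::vector<std::string>&& src)
{
    std::move(src.begin(), src.end(), std::back_inserter(dst));
}

// Variant 2: range insert with move iterators. The vector sees the whole
// range at once, so it can grow once instead of once per push_back.
void AppendByInsert(std::vector<std::string>& dst, std::vector<std::string>&& src)
{
    dst.insert(dst.end(),
               std::make_move_iterator(src.begin()),
               std::make_move_iterator(src.end()));
}
```

In ListDirectory2, either body would take the place of the std::copy line inside the lambda, with lst as the source and vect as the destination. After either call the source still holds the same number of elements, but they are in a moved-from state and should not be read.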
What you observe has nothing to do with async calls but is due to recursion.
Here's a flowchart describing it for 3 directory levels. Each vect is here given a unique name (and they are unique instances in the program).
ListDirectory2(dir)
    vect <- file1.1                               // put all files in dir in the local vect
            file1.2
    dir1 ---------------> ListDirectory2(dir1)    // call ListDirectory2 for each dir
                              vect1 <- file1.1    // put all files in dir1 in the local vect
                                       file1.2
                              dir1.1 ---------------> ListDirectory2(dir1.1)
                                                          ...
                              vect1 <- std::copy <--- return vect1.1
                              dir1.2 ---------------> ListDirectory2(dir1.2)
                                                          ...
                              vect1 <- std::copy <--- return vect1.2
    vect <- std::copy <-- return vect1
    dir2 ---------------> ListDirectory2(dir2)
                              vect2 <- file2.1    // put all files in dir2 in the local vect
                                       file2.2
                              dir2.1 ---------------> ListDirectory2(dir2.1)
                                                          ...
                              vect2 <- std::copy <--- return vect2.1
                              dir2.2 ---------------> ListDirectory2(dir2.2)
                                                          ...
                              vect2 <- std::copy <--- return vect2.2
    vect <- std::copy <-- return vect2
    return vect
When the call returns to main, vect will therefore be populated with all the files encountered from the starting directory and down.
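To check that recursion alone produces this behaviour, here is a sketch of the same merge with ordinary function calls and no std::async. Node and Collect are hypothetical names, and the in-memory tree is an assumed stand-in for a real directory tree.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a directory.
struct Node {
    std::vector<std::string> files;
    std::vector<Node> children;
};

// Plain recursion: every call has its own `vect` and returns it by value.
std::vector<std::string> Collect(const Node& node)
{
    std::vector<std::string> vect = node.files;          // this level's files
    for (const Node& child : node.children) {
        std::vector<std::string> lst = Collect(child);   // the child's vect
        vect.insert(vect.end(), lst.begin(), lst.end()); // merged into ours
    }
    return vect;
}
```

Calling Collect on the root of a tree returns the files of every level, even though no call ever touches another call's vect; the only difference in the async version is that the child results arrive through futures.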