
How do I get a HANDLE to the containing directory from a file HANDLE?

Given a HANDLE to a file (e.g. C:\FolderA\file.txt), I want a function which will return a HANDLE to the containing directory (in the previous example, it would be a HANDLE to C:\FolderA). For example:

HANDLE hFile = CreateFileA(
                  "C:\\FolderA\\file.txt",
                  GENERIC_READ,
                  FILE_SHARE_READ,
                  NULL,
                  OPEN_EXISTING,
                  FILE_ATTRIBUTE_NORMAL,
                  NULL);
HANDLE hDirectory = someFunc(hFile);

Possible implementation for someFunc:

HANDLE someFunc(HANDLE h)
{
    char *path = getPath(h);             // "C:\\FolderA\\file.txt"
    char *parent = getParentPath(path);  // "C:\\FolderA"
    HANDLE hFile = CreateFileA(
              parent,
              GENERIC_READ,
              FILE_SHARE_READ,
              NULL,
              OPEN_EXISTING,
              FILE_ATTRIBUTE_NORMAL,
              NULL);
    free(parent);
    free(path);
    return hFile;
}

But is there a way to implement someFunc without getParentPath, that is, without making it look at the string and remove everything after the last directory separator (because this is terrible from a performance point of view)?

I don't know what getParentPath is. I assume it's a function that searches for the trailing backslash in the string and uses that to strip off the file specification. You don't have to define such a function yourself; Windows already provides one for you: PathCchRemoveFileSpec. (Note that this assumes the specified path actually contains a file name to remove. If the path doesn't contain a file name, it will remove the trailing directory name. There are other functions you can use to verify whether a path contains a file specification.)

The older version of this function is PathRemoveFileSpec, which is what you would use on downlevel operating systems where the newer, safer function is not available.

Outside of the Windows API, there are other ways of doing the same thing. If you're targeting C++17, there is the std::filesystem::path class. Boost provides something similar. Or you could write it yourself with the find_last_of member function of the std::string class, if you absolutely have to. (But prefer not to re-invent the wheel. There are lots of edge cases when it comes to path manipulation that you probably won't think of, and that your testing probably won't reveal.)

You express concerns about the performance of this approach. This is nonsense. Stripping some characters from a string is not a slow operation. It wouldn't even be slow if you started searching from the beginning of the string and then, once you found the file specification, made a second copy of the string, again starting from the beginning of the string. It's a simple loop searching through the characters of a reasonable-length string, and then a simple memcpy. There is absolutely no way that this operation could be a performance bottleneck in code that does file I/O.

But the implementation probably isn't even going to be that naïve. You can optimize it by starting the search from the end of the path string, reducing the number of characters that you have to iterate through, and you can avoid any type of memory copy altogether if you're allowed to manipulate the original string. With a C-style string, you just replace the trailing path separator (the one that demarcates the beginning of the file specification) with a NUL character (\0). With a C++-style string, you just call the erase member function.
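A minimal sketch of the in-place approach for a C-style string, assuming backslash or forward-slash separators. The name strip_file_spec is hypothetical, not a Windows API, and unlike PathCchRemoveFileSpec it makes no attempt to handle roots such as C:\ or UNC prefixes:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Windows API.
   Scans backward from the end of `path` and overwrites the last path
   separator with NUL, so "C:\\FolderA\\file.txt" becomes "C:\\FolderA".
   Returns 1 if a separator was found, 0 if the string was left unchanged.
   Assumption: roots ("C:\\") and UNC prefixes are not treated specially. */
int strip_file_spec(char *path)
{
    size_t i = strlen(path);   /* skip this call if the length is already known */
    while (i > 0) {
        --i;
        if (path[i] == '\\' || path[i] == '/') {
            path[i] = '\0';    /* truncate in place: no copy, no allocation */
            return 1;
        }
    }
    return 0;
}
```

The loop touches only the characters of the file name itself, which is why it is hard to imagine it showing up in a profile next to any file I/O.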

In fact, if you really care about performance, this is virtually guaranteed to be faster than making a system call to retrieve the containing folder from a file object. System calls are a lot slower than some compiler-generated, inlinable code to iterate through a string and strip out a sub-string.

Once you have the path to the directory, you can obtain a HANDLE to it by calling the CreateFile function with the FILE_FLAG_BACKUP_SEMANTICS flag. (It is necessary to pass that flag if you want to retrieve a handle to a directory.)


I have measured that this is slow and am looking for a faster way.

Your measurements are wrong. Either you've made the common mistake of benchmarking a debug build, where the standard library functionality (e.g., std::string) is not optimized, or the real performance bottleneck is the file I/O, or both. CreateFile is not a speedy function by any stretch of the imagination. I can almost guarantee that is going to be your hotspot.


Note that if you don't already have the path, it is straightforward to obtain the path from a HANDLE to a file. As was pointed out in the comments, on Windows Vista and later, you simply need to call the GetFinalPathNameByHandle function. More details are available in this article on MSDN, including sample code and an alternative for use on downlevel versions of Windows.

As was mentioned already in the comments to the question, you can optimize this further by allocating a buffer of length MAX_PATH (or perhaps even larger) on the stack. That compiles to a single instruction to adjust the stack pointer, so it won't be a performance bottleneck, either. (Okay, I lied: you actually will need two instructions, one to create space on the stack and the other to free the allocated space on the stack. Still not a performance problem.) That way, you don't even have to do any dynamic memory allocation.

Note that for maximum robustness, especially on Windows 10, you want to handle the case that a path is longer than MAX_PATH. In such cases, your stack-allocated buffer will be too small, and the function you call to fill it will return an error. Handle that error, and allocate a larger buffer on the free store. That will be slower, but this is an edge case and probably not one that is worth optimizing. The 99% common case will use the stack-allocated buffer.
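Putting the pieces described so far together, a rough sketch might look like the following. TrimToParent and OpenContainingDirectory are hypothetical names, not Windows APIs. The sketch assumes the \\?\-prefixed DOS path that GetFinalPathNameByHandleW returns by default, requests GENERIC_READ as in the question, and uses a permissive share mode that you may want to tighten. The Windows-only part is guarded by _WIN32 so that the string helper can be compiled and tested on its own:

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: cuts `path` (of length `len`) at its last backslash
   and returns the new length, or 0 if nothing usable remains. A backslash
   directly after a drive colon is kept, so "\\?\C:\file.txt" becomes
   "\\?\C:\" (the root directory) rather than "\\?\C:". */
size_t TrimToParent(wchar_t *path, size_t len)
{
    size_t i = len;
    while (i > 0 && path[i - 1] != L'\\')
        --i;                          /* scan backward from the end */
    if (i == 0)
        return 0;                     /* no separator found */
    --i;                              /* index of the last backslash */
    if (i > 0 && path[i - 1] == L':')
        ++i;                          /* keep the root's trailing backslash */
    path[i] = L'\0';
    return i;
}

#ifdef _WIN32
/* Hypothetical helper: returns a handle to the directory containing the file
   behind hFile, or INVALID_HANDLE_VALUE on failure. */
HANDLE OpenContainingDirectory(HANDLE hFile)
{
    WCHAR stackBuf[MAX_PATH];         /* common case: no heap allocation */
    WCHAR *path = stackBuf;
    HANDLE hDir = INVALID_HANDLE_VALUE;

    DWORD len = GetFinalPathNameByHandleW(hFile, path, MAX_PATH,
                                          FILE_NAME_NORMALIZED);
    if (len >= MAX_PATH) {
        /* Buffer too small: len is the required size in WCHARs,
           including the terminating null. Retry on the heap. */
        DWORD needed = len;
        path = (WCHAR *)malloc(needed * sizeof(WCHAR));
        if (path == NULL)
            return INVALID_HANDLE_VALUE;
        len = GetFinalPathNameByHandleW(hFile, path, needed,
                                        FILE_NAME_NORMALIZED);
        if (len >= needed)
            len = 0;                  /* still does not fit: treat as failure */
    }

    if (len != 0 && TrimToParent(path, len) != 0) {
        /* FILE_FLAG_BACKUP_SEMANTICS is required to open a directory. */
        hDir = CreateFileW(path, GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                           NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
    }

    if (path != stackBuf)
        free(path);
    return hDir;
}
#endif
```

The caller closes the returned handle with CloseHandle. UNC paths and other unusual roots are not given any special treatment here, which is exactly the kind of edge case that the library functions handle for you.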

Furthermore, eryksun points out (in comments to this answer) that, although it is convenient, GetFinalPathNameByHandle requires multiple system calls to map the file object between the NT and DOS namespaces and to normalize the path. I haven't disassembled this function, so I can't confirm his claims, but I have no reason to doubt them. Under normal circumstances, you wouldn't worry about this sort of overhead or possible performance costs, but since this seems to be a big concern for your application, you can use eryksun's alternative suggestion of calling GetFileInformationByHandleEx and requesting the FileNameInfo class. GetFileInformationByHandleEx is a general, multi-purpose function that can retrieve all different sorts of information about a file, including the path. Its implementation is simpler, calling directly down to the native NtQueryInformationFile function. I would have thought GetFinalPathNameByHandle was just a user-mode wrapper providing exactly this service, but eryksun's research suggests it is doing extra work that you might want to avoid if this is truly a performance hot-spot. I have to qualify this slightly by noting that GetFileInformationByHandleEx, in order to retrieve the FileNameInfo, is going to have to create an I/O Request Packet (IRP) and call down to the underlying device driver. That's not a cheap operation, so I'm not sure that the additional overhead of normalizing the path is really going to matter. But in this case, there's no real harm in using the GetFileInformationByHandleEx approach, since it's a documented function.


If you've written the code as described but are still having measurable performance problems, then please post that code for someone to review and help you optimize. The Code Review Stack Exchange site is a great place to get help like that on working code. Feel free to leave me a link to such a question in a comment under this answer so that I don't miss it.

Whatever you do, please stop calling the ANSI versions of the Windows API functions (the ones that end with an A suffix). You want the wide-character (Unicode) versions. These end with a W suffix, and work with strings composed of WCHAR (== wchar_t) characters. The ANSI versions have been deprecated for decades now because they do not provide Unicode support (it is not optional for any application written after the year 2000 to support Unicode characters in paths). Beyond that, since you care so much about performance, you should be aware that all A-suffixed API functions are just stubs that convert the passed-in ANSI string to a Unicode string and then delegate to the W-suffixed version. If the function returns a string, a second conversion also must be done by the A-suffixed version, since all native APIs work with Unicode strings. Performance isn't the real reason why you should avoid calling ANSI functions, but perhaps it's one that you'll find more convincing.

There might be a way to do what you want (map a file object via a HANDLE to its containing directory), but it would require undocumented usage of the NT native API. I don't see anything at all in the documented functions that would allow you to obtain this information. It certainly isn't accessible via the GetFileInformationByHandleEx function. For better or worse, the user-mode file system API is almost entirely path-based. Presumably, it is tracked internally, but even the documented NT native API functions that take a root directory HANDLE (e.g., NtDeleteFile via the OBJECT_ATTRIBUTES structure) allow this field to be NULL, in which case the full path string is used.

As always, if you had provided more details on the bigger picture, we could probably provide a more appropriate solution. This is what the commenters were driving at when they mentioned an XY problem. Yes, people are questioning your motives because that's how we provide the most appropriate help.
