
How do I get a HANDLE to the containing directory from a file HANDLE?

Given a HANDLE to a file (e.g., C:\FolderA\file.txt), I want a function which will return a HANDLE to the containing directory (in the previous example, it would be a HANDLE to C:\FolderA). For example:

HANDLE hFile = CreateFileA(
                  "C:\\FolderA\\file.txt",
                  GENERIC_READ,
                  FILE_SHARE_READ,
                  NULL,
                  OPEN_EXISTING,
                  FILE_ATTRIBUTE_NORMAL,
                  NULL);
HANDLE hDirectory = someFunc(hFile);

Possible implementation for someFunc :

HANDLE someFunc(HANDLE h)
{
    char *path = getPath(h);             // "C:\\FolderA\\file.txt"
    char *parent = getParentPath(path);  // "C:\\FolderA"
    HANDLE hFile = CreateFileA(
              parent,
              GENERIC_READ,
              FILE_SHARE_READ,
              NULL,
              OPEN_EXISTING,
              FILE_ATTRIBUTE_NORMAL,
              NULL);
    free(parent);
    free(path);
    return hFile;
}

But is there a way to implement someFunc without getParentPath or without making it look at the string and removing everything after the last directory separator (because this is terrible from a performance point of view)?

I don't know what getParentPath is. I assume it's a function that searches for the trailing backslash in the string and uses that to strip off the file specification. You don't have to define such a function yourself; Windows already provides one for you: PathCchRemoveFileSpec. (Note that this assumes the specified path actually contains a file name to remove. If the path doesn't contain a file name, it will remove the trailing directory name. There are other functions you can use to verify whether a path contains a file specification.)

The older version of this function is PathRemoveFileSpec, which is what you would use on downlevel operating systems where the newer, safer function is not available.

Outside of the Windows API, there are other ways of doing the same thing. If you're targeting C++17, there is the std::filesystem::path class, whose parent_path member function returns the containing directory. Boost provides something similar. Or you could write it yourself with the find_last_of member function of the std::string class, if you absolutely have to. (But prefer not to re-invent the wheel. There are lots of edge cases when it comes to path manipulation that you probably won't think of, and that your testing probably won't reveal.)

You express concerns about the performance of this approach. This is nonsense. Stripping some characters from a string is not a slow operation. It wouldn't even be slow if you started searching from the beginning of the string and then, once you found the file specification, made a second copy of the string, again starting from the beginning of the string. It's a simple loop searching through the characters of a reasonable-length string, and then a simple memcpy. There is absolutely no way that this operation could be a performance bottleneck in code that does file I/O.

But the implementation probably isn't even going to be that naïve. You can optimize it by starting the search from the end of the path string, reducing the number of characters that you have to iterate through, and you can avoid any type of memory copy altogether if you're allowed to manipulate the original string. With a C-style string, you just replace the trailing path separator (the one that demarcates the beginning of the file specification) with a NUL character ('\0'). With a C++-style string, you just call the erase member function.
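As a rough illustration of the in-place approach just described, here is a minimal sketch in portable C. The name strip_file_spec is made up for this example, and it deliberately ignores the edge cases (drive roots such as C:\, UNC shares, trailing separators) that PathCchRemoveFileSpec exists to handle.

```c
#include <string.h>

/* Hypothetical helper (not a Windows API): truncates `path` in place at its
   last separator. Returns 1 if a separator was found, 0 otherwise.
   strlen still walks the string once to find its end; if the caller already
   knows the length, that pass can be skipped. */
int strip_file_spec(char *path)
{
    size_t i = strlen(path);

    /* Walk backwards from the end until a separator is found. */
    while (i > 0) {
        --i;
        if (path[i] == '\\' || path[i] == '/') {
            path[i] = '\0';   /* cut the string here; nothing is copied */
            return 1;
        }
    }
    return 0;                 /* no separator: leave the string untouched */
}
```

Called on a writable buffer holding C:\FolderA\file.txt, this leaves C:\FolderA in the same buffer.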

In fact, if you really care about performance, this is virtually guaranteed to be faster than making a system call to retrieve the containing folder from a file object. System calls are a lot slower than some compiler-generated, inlinable code to iterate through a string and strip out a sub-string.

Once you have the path to the directory, you can obtain a HANDLE to it by calling the CreateFile function with the FILE_FLAG_BACKUP_SEMANTICS flag. (It is necessary to pass that flag if you want to retrieve a handle to a directory.)


"I have measured that this is slow and am looking for a faster way."

Your measurements are wrong. Either you've made the common mistake of benchmarking a debugging build, where the standard library functionality (e.g., std::string) is not optimized, or the real performance bottleneck is the file I/O (or both). CreateFile is not a speedy function by any stretch of the imagination. I can almost guarantee that is going to be your hotspot.


Note that if you don't already have the path, it is straightforward to obtain the path from a HANDLE to a file. As was pointed out in the comments, on Windows Vista and later, you simply need to call the GetFinalPathNameByHandle function. More details are available in this article on MSDN, including sample code and an alternative for use on downlevel versions of Windows.

As was mentioned already in the comments to the question, you can optimize this further by allocating a buffer of length MAX_PATH (or perhaps even larger) on the stack. That compiles to a single instruction to adjust the stack pointer, so it won't be a performance bottleneck, either. (Okay, I lied: you actually will need two instructions—one to create space on the stack, and the other to free the allocated space on the stack. Still not a performance problem.) That way, you don't even have to do any dynamic memory allocation.

Note that for maximum robustness, especially on Windows 10, you want to handle the case that a path is longer than MAX_PATH. In such cases, your stack-allocated buffer will be too small, and the function you call to fill it will report that (GetFinalPathNameByHandle, for example, returns the required buffer size instead of the length written). Handle that case, and allocate a larger buffer on the free store. That will be slower, but this is an edge case and probably not one that is worth optimizing. The 99% common case will use the stack-allocated buffer.

Furthermore, eryksun points out (in comments to this answer) that, although it is convenient, GetFinalPathNameByHandle requires multiple system calls to map the file object between the NT and DOS namespaces and to normalize the path. I haven't disassembled this function, so I can't confirm his claims, but I have no reason to doubt them.

Under normal circumstances, you wouldn't worry about this sort of overhead or possible performance costs, but since this seems to be a big concern for your application, you can use eryksun's alternative suggestion of calling GetFileInformationByHandleEx and requesting the FileNameInfo class. GetFileInformationByHandleEx is a general, multi-purpose function that can retrieve all different sorts of information about a file, including the path. Its implementation is simpler, calling directly down to the native NtQueryInformationFile function. I would have thought GetFinalPathNameByHandle was just a user-mode wrapper providing exactly this service, but eryksun's research suggests it is doing extra work that you might want to avoid if this is truly a performance hot-spot.

I have to qualify this slightly by noting that GetFileInformationByHandleEx, in order to retrieve the FileNameInfo, is going to have to create an I/O Request Packet (IRP) and call down to the underlying device driver. That's not a cheap operation, so I'm not sure that the additional overhead of normalizing the path is really going to matter. But in this case, there's no real harm in using the GetFileInformationByHandleEx approach, since it's a documented function.


If you've written the code as described but are still having measurable performance problems, then please post that code for someone to review and help you optimize. The Code Review Stack Exchange site is a great place to get help like that on working code. Feel free to leave me a link to such a question in a comment under this answer so that I don't miss it.

Whatever you do, please stop calling the ANSI versions of the Windows API functions (the ones that end with an A suffix). You want the wide-character (Unicode) versions. These end with a W suffix, and work with strings composed of WCHAR (== wchar_t) characters. The ANSI versions have been deprecated for decades now because they do not provide Unicode support (it is not optional for any application written after the year 2000 to support Unicode characters in paths). Beyond that, since you care so much about performance, you should be aware that all A-suffixed API functions are just stubs that convert the passed-in ANSI string to a Unicode string and then delegate to the W-suffixed version. If the function returns a string, a second conversion also must be done by the A-suffixed version, since all native APIs work with Unicode strings. Performance isn't the real reason why you should avoid calling ANSI functions, but perhaps it's one that you'll find more convincing.

There might be a way to do what you want (map a file object via a HANDLE to its containing directory), but it would require undocumented usage of the NT native API. I don't see anything at all in the documented functions that would allow you to obtain this information. It certainly isn't accessible via the GetFileInformationByHandleEx function. For better or worse, the user-mode file system API is almost entirely path-based. Presumably, the containing directory is tracked internally, but even the documented NT native API functions that take a root directory HANDLE (e.g., NtDeleteFile via the OBJECT_ATTRIBUTES structure) allow this field to be NULL, in which case the full path string is used.

As always, if you had provided more details on the bigger picture, we could probably provide a more appropriate solution. This is what the commenters were driving at when they mentioned an XY problem. Yes, people are questioning your motives because that's how we provide the most appropriate help.
