
How to get “valid data length” of a file?

There is a function to set the "valid data length" value, SetFileValidData, but I didn't find a way to get the "valid data length" value.

For a given file, I want to know whether the EOF differs from the VDL, because writing past the VDL when VDL < EOF incurs a performance penalty, as described here.

I found this page, which claims that:

there is no mechanism to query the value of the VDL

So the answer is "you can't".

If you care about performance you can set the VDL to the EOF, but note that this may expose old garbage on your disk: the part between those two pointers, which would read as zeros if you accessed the file without setting the VDL to the EOF.

I think you are confused as to what "valid data length" actually means. Check this answer.

Basically, while SetEndOfFile lets you increase the length of a file quickly, and allocates the disk space, if you skip to the (new) end-of-file to write there, all the additionally allocated disk space would need to be overwritten with zeroes, which is kind of slow.

SetFileValidData lets you skip that zeroing-out. You're telling the system, "I am OK with whatever is in those disk blocks, get on with it". (This is why you need the SE_MANAGE_VOLUME_NAME privilege, as it could reveal privileged data to unprivileged users if you don't overwrite the data. Users with this privilege can access the raw drive data anyway.)

In either case, you have set the new effective size of the file. (Which you can read back.) What, exactly, should a separate "read file valid data" report back? SetFileValidData told the system that whatever is in those disk blocks is "valid"...
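The trade-off described above can be sketched as a toy model. This is not the Windows API and not how NTFS is implemented; FileModel, extend_file, set_valid_data and write_at are hypothetical names introduced only for illustration, and the model assumes the gap between the old VDL and the write offset is zero-filled in full at write time.

```cpp
#include <algorithm>
#include <cstdint>

// Toy model of one file: only the two lengths discussed above are tracked.
struct FileModel {
    std::uint64_t eof = 0;  // end of file (logical size)
    std::uint64_t vdl = 0;  // valid data length, never larger than eof
};

// Stands in for growing a file with SetEndOfFile: the size changes,
// the valid data length does not.
inline void extend_file(FileModel& f, std::uint64_t new_eof) {
    f.eof = std::max(f.eof, new_eof);
}

// Stands in for SetFileValidData: everything below new_vdl is declared valid
// without being written. Rejects values outside (vdl, eof].
inline bool set_valid_data(FileModel& f, std::uint64_t new_vdl) {
    if (new_vdl <= f.vdl || new_vdl > f.eof) return false;
    f.vdl = new_vdl;
    return true;
}

// Stands in for a write of len bytes at offset. Returns the number of bytes
// between the old VDL and offset that would have to be zero-filled first.
inline std::uint64_t write_at(FileModel& f, std::uint64_t offset, std::uint64_t len) {
    const std::uint64_t zeroed = offset > f.vdl ? offset - f.vdl : 0;
    const std::uint64_t end = offset + len;
    f.eof = std::max(f.eof, end);
    f.vdl = std::max(f.vdl, end);
    return zeroed;
}
```

In this model, extending an empty file to 1000 bytes and then writing 100 bytes at offset 900 reports 900 bytes to zero-fill; calling set_valid_data(f, 900) first brings that down to 0, at the price of leaving whatever was in those blocks readable.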


A different way of explaining it:

The documentation mentions that the "valid data length" is being tracked; the purpose of this is for the system to know which range (from end-of-valid-data to end-of-file) it still needs to zero out, in the context of SetEndOfFile, when necessary (e.g. when you close the file). You don't need to read back this value, because the only way it could differ from the actual file size is that you yourself changed it via the aforementioned functions...

According to MSDN, SetFileValidData can be used, for example, to create a large file without having to write to the file. For a database this will allocate a (contiguous) storage area.

As a result, it seems the file size on disk will have changed without any data having been written to the file.

By implication, a hypothetical GetValidData (which does not exist) would just return the size of the file, so you can use GetFileSize, which returns the "valid" file size.

I looked into this. There is no way to get this information through any API, not even NtQueryInformationFile (FileEndOfFileInformation only worked with NtSetInformationFile). So in the end I read it by parsing the NTFS file record manually. If anyone has a better way, please tell! This obviously only works with full system access (and on NTFS), and the on-disk value might be out of sync with the in-memory information Windows uses.

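#include <windows.h>
#include <winioctl.h>
#include <string>
#include <vector>

// uint64/int64 are not standard types; these typedefs are an assumption
// about what the original code base defined.
typedef unsigned long long uint64;
typedef long long int64;

// On-disk NTFS structures are byte-packed.
#pragma pack(push, 1)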
struct NTFSFileRecord
{
    char magic[4];
    unsigned short sequence_offset;
    unsigned short sequence_size;
    uint64 lsn;
    unsigned short sequence_number;
    unsigned short hardlink_count;
    unsigned short attribute_offset;
    unsigned short flags;
    unsigned int real_size;
    unsigned int allocated_size;
    uint64 base_record;
    unsigned short next_id;
    //char padding[470];
};

struct MFTAttribute
{
    unsigned int type;
    unsigned int length;
    unsigned char nonresident;
    unsigned char name_length;
    unsigned short name_offset;
    unsigned short flags;
    unsigned short attribute_id;
    unsigned int attribute_length;
    unsigned short attribute_offset;
    unsigned char indexed_flag;
    unsigned char padding1;
    //char padding2[488];
};

struct MFTAttributeNonResident
{
    unsigned int type;
    unsigned int length;
    unsigned char nonresident;
    unsigned char name_length;
    unsigned short name_offset;
    unsigned short flags;
    unsigned short attribute_id;
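    uint64 starting_vcn;  // first virtual cluster number covered by this attribute
    uint64 last_vcn;      // last virtual cluster number covered by this attribute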
    unsigned short run_offset;
    unsigned short compression_size;
    unsigned int padding;
    uint64 allocated_size;
    uint64 real_size;
    uint64 initial_size;
};
#pragma pack(pop)

HANDLE GetVolumeData(const std::wstring& volfn, NTFS_VOLUME_DATA_BUFFER& vol_data)
{
    HANDLE vol = CreateFileW(volfn.c_str(), GENERIC_WRITE | GENERIC_READ, 
        FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

    if (vol == INVALID_HANDLE_VALUE)
        return vol;

    DWORD ret_bytes;
    BOOL b = DeviceIoControl(vol, FSCTL_GET_NTFS_VOLUME_DATA,
        NULL, 0, &vol_data, sizeof(vol_data), &ret_bytes, NULL);

    if (!b)
    {
        CloseHandle(vol);
        return INVALID_HANDLE_VALUE;
    }

    return vol;
}


int64 GetFileValidData(HANDLE file, HANDLE vol, const NTFS_VOLUME_DATA_BUFFER& vol_data)
{
    BY_HANDLE_FILE_INFORMATION hfi;
    BOOL b = GetFileInformationByHandle(file, &hfi);
    if (!b)
        return -1;

    NTFS_FILE_RECORD_INPUT_BUFFER record_in;
    record_in.FileReferenceNumber.HighPart = hfi.nFileIndexHigh;
    record_in.FileReferenceNumber.LowPart = hfi.nFileIndexLow;
    std::vector<BYTE> buf;
    buf.resize(sizeof(NTFS_FILE_RECORD_OUTPUT_BUFFER) + vol_data.BytesPerFileRecordSegment - 1);
    NTFS_FILE_RECORD_OUTPUT_BUFFER* record_out = reinterpret_cast<NTFS_FILE_RECORD_OUTPUT_BUFFER*>(buf.data());
    DWORD bout;
    // Pass the real size of the output buffer instead of a hard-coded 4096.
    b = DeviceIoControl(vol, FSCTL_GET_NTFS_FILE_RECORD, &record_in,
        sizeof(record_in), record_out, static_cast<DWORD>(buf.size()), &bout, NULL);

    if (!b)
        return -1;

    NTFSFileRecord* record = reinterpret_cast<NTFSFileRecord*>(record_out->FileRecordBuffer);

    unsigned int currpos = record->attribute_offset;
    MFTAttribute* attr = nullptr;
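    // Walk the attributes until the end marker (type 0xFFFFFFFF) or the end of the returned data.
    while ( (attr==nullptr ||
        attr->type != 0xFFFFFFFF  )
        && record_out->FileRecordBuffer + currpos +sizeof(MFTAttribute)<buf.data() + bout)
    {
        attr = reinterpret_cast<MFTAttribute*>(record_out->FileRecordBuffer + currpos);
        // 0x80 is the $DATA attribute; the whole non-resident header must lie inside the buffer.
        if (attr->type == 0x80
            && record_out->FileRecordBuffer + currpos + sizeof(MFTAttributeNonResident)
                <= buf.data()+ bout)
        {
            // Resident data is stored inside the MFT record and has no separate size fields.
            if (attr->nonresident == 0)
                return -1;

            // The non-resident header overlays the attribute from its first byte,
            // so it starts at currpos, not at currpos + attribute_offset.
            MFTAttributeNonResident* dataattr = reinterpret_cast<MFTAttributeNonResident*>(record_out->FileRecordBuffer
                + currpos);
            return dataattr->initial_size;
        }
        // A zero length would loop forever on a damaged record.
        if (attr->length == 0)
            break;
        currpos += attr->length;
    }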

    return -1;
}

[...]
    NTFS_VOLUME_DATA_BUFFER vol_data;
    HANDLE vol = GetVolumeData(L"\\??\\D:", vol_data);
    if (vol != INVALID_HANDLE_VALUE)
    {
        int64 vdl = GetFileValidData(alloc_test->getOsHandle(), vol, vol_data);
        if(vdl>=0) { [...] }
        [...]
    }
[...]
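The value the code above returns is the last of three 64-bit sizes in the non-resident attribute header. As a cross-check of the struct layout, here is a minimal sketch that reads the same fields by byte offset from a raw buffer, without packed structs or Windows headers. DataSizes, read_le64 and parse_nonresident_sizes are hypothetical names, and the offsets (40, 48, 56) are derived from the MFTAttributeNonResident struct above rather than from an official specification.

```cpp
#include <cstddef>
#include <cstdint>

// The three sizes stored in a non-resident attribute header.
struct DataSizes {
    std::uint64_t allocated;    // bytes of disk space reserved
    std::uint64_t real;         // logical file size (EOF)
    std::uint64_t initialized;  // valid data length (VDL)
};

// Reads a little-endian 64-bit integer without relying on alignment.
inline std::uint64_t read_le64(const unsigned char* p) {
    std::uint64_t v = 0;
    for (int i = 7; i >= 0; --i)
        v = (v << 8) | p[i];
    return v;
}

// attr points at the first byte of an attribute, avail is the number of
// readable bytes from there. Returns false if the header does not fit or
// the attribute is resident (the byte at offset 8 is 0).
inline bool parse_nonresident_sizes(const unsigned char* attr, std::size_t avail,
                                    DataSizes& out) {
    const std::size_t kHeaderSize = 64;  // 16 common + 24 run info + 3 * 8 sizes
    if (attr == nullptr || avail < kHeaderSize)
        return false;
    if (attr[8] == 0)
        return false;
    out.allocated   = read_le64(attr + 40);
    out.real        = read_le64(attr + 48);
    out.initialized = read_le64(attr + 56);
    return true;
}
```

Comparing out.initialized with out.real then answers the original question for that file record: if the two differ, the VDL is below the EOF.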
