Why does GetPrivateProfileSection retrieve each character as a two-byte value, padding it with a NULL character?

Given this piece of code:

Private Declare Auto Function GetPrivateProfileSection Lib "kernel32" _
        (ByVal lpAppName As String, _
         ByVal lpszReturnBuffer As Byte(), _
         ByVal nSize As Integer, ByVal lpFileName As String) As Integer

Public Class IniClassReader
    Public Function readWholeSection(iniFile as String, section as String) as String()
        Dim buffer As Byte() = New Byte(SECTIONLENGTH) {}
        GetPrivateProfileSection(section, buffer, SECTIONLENGTH, iniFile)
        Dim sectionContent As String = Encoding.Default.GetString(buffer)
        ' Skipped code embedded in the function below, not the point of the question
        return processSectionContent(sectionContent)
    End Function
End Class

I found that buffer contains a sequence of bytes interspersed with NULL characters ( \0 ). As a result, the debugger's watch window shows the value of sectionContent as 'ent r ie 1 = value 1 ent r ie 2 = value 2' . Each key/value pair is terminated, as expected, but by two NULL characters instead of one.

I don't see why each character is stored as a two-byte value. Replacing Default with UTF8 gives the same result. I tried with an INI file encoded in UTF-8 and with one encoded in Windows-1252 (what Microsoft calls "ANSI").

I know how to get rid of those extra bytes:

Dim sectionContent As String = Encoding.Default.GetString(buffer)
sectionContent = sectionContent.Replace(Chr(0) & Chr(0), vbNewLine).Replace(Chr(0), "")

But I want to understand what's going on here so that I can apply the best solution instead of a sloppy hack that works only in some cases.

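The bytes are UTF-16 (little-endian) encoded text. It only looks like NULL character padding because all of your text consists of characters whose UTF-16 code units fit in the low byte, which leaves the high byte as zero. For the same reason, the "two NULL characters" after each key/value pair are really a single two-byte NULL terminator.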
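To make the effect visible without touching the Windows API, here is a minimal sketch that builds a buffer by hand and decodes it as UTF-16. The module name Utf16Demo and the sample entries are made up for illustration; the buffer only imitates the layout the API call would produce.

```vbnet
Imports System
Imports System.Text

Module Utf16Demo
    Sub Main()
        ' Hypothetical section content: each entry ends with a NULL character,
        ' and one extra NULL marks the end of the whole list.
        Dim sample As String = "entry1=value1" & ChrW(0) & "entry2=value2" & ChrW(0) & ChrW(0)

        ' Encoding.Unicode is UTF-16 little-endian: two bytes per character here.
        Dim buffer As Byte() = Encoding.Unicode.GetBytes(sample)

        ' Raw bytes of the first three characters ("e", "n", "t"):
        ' every second byte is zero because the characters are plain ASCII.
        Console.WriteLine(BitConverter.ToString(buffer, 0, 6)) ' 65-00-6E-00-74-00

        ' Decoding with the matching encoding gives the text back, with a
        ' single NULL character between entries.
        Dim decoded As String = Encoding.Unicode.GetString(buffer)
        Dim entries As String() = decoded.Split(New Char() {ChrW(0)}, StringSplitOptions.RemoveEmptyEntries)
        For Each entry As String In entries
            Console.WriteLine(entry)
        Next
    End Sub
End Module
```

Decoding the same buffer with Encoding.Default or Encoding.UTF8 instead would turn every zero byte into its own NULL character, which is the pattern described in the question.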

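The Windows API exposes both an "A" and a "W" version of the function: the "A" version works with narrow strings and the "W" version works with wide strings. Because the function is declared with Auto, the runtime binds to the "W" version on the Windows NT family (which includes every Windows release since XP), where UCS-2/UTF-16 is the "native" character encoding. The Byte() buffer is passed through without any conversion, so it receives raw UTF-16 bytes and should be decoded with Encoding.Unicode rather than Encoding.Default or Encoding.UTF8.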
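One possible way to apply this is to bind to the wide entry point explicitly and decode the buffer as UTF-16. This is a sketch, not a drop-in replacement: the class name IniSectionReader and the MaxChars constant are assumptions standing in for the question's IniClassReader and SECTIONLENGTH, and the function returns the raw "key=value" entries rather than calling processSectionContent.

```vbnet
Imports System
Imports System.Text

Public Class IniSectionReader
    ' Declared under its real export name with the Unicode modifier, so the
    ' wide version is always the one called and strings are passed as UTF-16.
    Private Declare Unicode Function GetPrivateProfileSectionW Lib "kernel32" _
            (ByVal lpAppName As String, _
             ByVal lpReturnedString As Byte(), _
             ByVal nSize As Integer, _
             ByVal lpFileName As String) As Integer

    ' Assumed buffer size, counted in characters (not bytes).
    Private Const MaxChars As Integer = 32767

    Public Function ReadWholeSection(iniFile As String, section As String) As String()
        ' Two bytes per UTF-16 character; VB array bounds are inclusive,
        ' hence the "- 1".
        Dim buffer As Byte() = New Byte(MaxChars * 2 - 1) {}

        ' For the wide version, nSize and the return value are both counted
        ' in characters. The return value excludes the final NULL terminator.
        Dim charsCopied As Integer = GetPrivateProfileSectionW(section, buffer, MaxChars, iniFile)

        ' Decode only the part of the buffer that was actually filled.
        Dim content As String = Encoding.Unicode.GetString(buffer, 0, charsCopied * 2)

        ' Entries are separated by a single NULL character.
        Return content.Split(New Char() {ChrW(0)}, StringSplitOptions.RemoveEmptyEntries)
    End Function
End Class
```

Splitting on the NULL character replaces the chained Replace calls from the question, and it keeps working for values that contain characters outside the ASCII range.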
