Get the size, in bytes, that a string will occupy when written to a file?

Question

I've been reading answers that explain how to get the size of a string, either in memory or in a file.

My intention is to determine the number of bytes that a string will occupy, in a specified encoding, when written to a file.

However, my function does not return the expected result when I check the size of a string for Encoding.UTF8, Encoding.Unicode (UTF-16) or Encoding.UTF32.

This is what I'm doing:

''' ----------------------------------------------------------------------
''' <summary>
''' Gets the size, in bytes, of how much a string will occupy when written to a file.
''' </summary>
''' ----------------------------------------------------------------------
<DebuggerStepThrough>
<Extension>
Public Function SizeInFile(ByVal sender As String,
                           Optional ByVal encoding As Encoding = Nothing) As Integer

    If (encoding Is Nothing) Then
        encoding = System.Text.Encoding.Default
    End If

    Return encoding.GetByteCount(sender)

End Function

This is how I'm testing it. In the code below, the function reports a string size of 2 bytes, but when the string is written to a file, the file size is 4 bytes:

Dim str As String = "Ñ"
Console.WriteLine(String.Format("Size of String : {0}", str.SizeInFile(Encoding.Unicode)))

File.WriteAllText(".\Test.txt", str, Encoding.Unicode)
Console.WriteLine(String.Format("Size of txtfile: {0}", New FileInfo(".\Test.txt").Length))

What am I missing to evaluate the string size correctly?

In C# or VB.NET.

Answer 1

A file may begin with a byte order mark (BOM) that helps the reader detect which encoding was used.

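The BOM for UTF-8 is 3 bytes: EF BB BF.

For UTF-16 (Encoding.Unicode) it is 2 bytes: the code point FEFF, stored as FE FF (big endian) or FF FE (little endian) depending on the encoding. Encoding.Unicode is little endian.

For UTF-32 it is 4 bytes: the code point 0000FEFF, stored as 00 00 FE FF (big endian) or FF FE 00 00 (little endian).

Encoding.GetByteCount counts only the encoded characters, while File.WriteAllText with an explicit encoding also writes that encoding's BOM. That accounts for the file in the question: 2 bytes of BOM plus 2 bytes for the character gives 4 bytes.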
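To make the difference concrete, here is a minimal sketch that adds the length of the encoding's preamble (the BOM returned by Encoding.GetPreamble) to the byte count of the text. The helper name SizeInFileWithBom and the temporary file are assumptions for illustration, not part of the original post, and the sketch assumes the file is written with File.WriteAllText using the same Encoding instance.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BomSizeDemo

    ' Hypothetical helper (not from the original question): the number of
    ' bytes of the encoded text plus the encoding's preamble (BOM), if any.
    Function SizeInFileWithBom(ByVal text As String, ByVal enc As Encoding) As Integer
        Return enc.GetPreamble().Length + enc.GetByteCount(text)
    End Function

    Sub Main()
        Dim str As String = "Ñ"
        Dim tempFile As String = Path.GetTempFileName()

        ' File.WriteAllText emits the preamble of the encoding it is given.
        File.WriteAllText(tempFile, str, Encoding.Unicode)

        Console.WriteLine("Text only      : {0}", Encoding.Unicode.GetByteCount(str))
        Console.WriteLine("Text + preamble: {0}", SizeInFileWithBom(str, Encoding.Unicode))
        Console.WriteLine("File on disk   : {0}", New FileInfo(tempFile).Length)

        File.Delete(tempFile)
    End Sub

End Module
```

With Encoding.Unicode the last two values should agree with the 4 bytes reported in the question. If the BOM is not wanted in the file at all, one option is to pass an encoding instance constructed without it, such as New UTF8Encoding(False) or New UnicodeEncoding(False, False), whose GetPreamble returns an empty array.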

Score 4, accepted, 2015-10-26 11:06:05
