
How to determine how many bytes a BinaryWriter will take to write a string into memory?

I read: How many bytes will a string take up? and How to know the size of the string in bytes? and several others, but I can't figure out the exact number of bytes a string will take when written with a BinaryWriter over a MemoryMappedViewStream over a MemoryMappedFile.

Sometimes the space taken is the string length + 1, sometimes it is the string length + 2.

I tried both:

  • System.Text.ASCIIEncoding.Default.GetByteCount(str)
  • System.Text.ASCIIEncoding.Unicode.GetByteCount(str)

But neither of them works. I also tried the string length plus a fixed amount, but that does not work either.

If I check the difference between BinaryWriter.BaseStream.Position before and after a write (position after - position before), I can measure what was written, but I still can't find a way to predict in advance the exact number of bytes a string will take. It seems like there is some alignment or something else going on that I can't figure out.
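For what it's worth, here is a minimal sketch of that before/after measurement, using a plain MemoryStream instead of the MemoryMappedViewStream for simplicity (the Write(string) behaviour should be the same):

    using System;
    using System.IO;
    using System.Text;

    using (var stream = new MemoryStream())
    using (var writer = new BinaryWriter(stream, Encoding.UTF8))
    {
        long before = writer.BaseStream.Position;
        writer.Write("some string");
        long after = writer.BaseStream.Position;
        Console.WriteLine(after - before);   // length prefix + encoded string bytes
    }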

How can I compute the exact number of bytes that will be written each time?

Update

I first tried Encoding.UTF8.GetByteCount(str) + 1, which gives me almost the right size most of the time, but not always.

I found my answer in the Microsoft reference source for BinaryWriter at https://referencesource.microsoft.com/#mscorlib/system/io/binarywriter.cs,08f2e8c389fd32df

Note: the length of the string is written before the string itself as a 7-bit encoded integer, so the size of that length prefix varies with the encoded byte count of the string: 1 byte below 128 (2^7), 2 bytes from 128, 3 bytes from 2^14, 4 bytes from 2^21, and so on. For example, a string that encodes to 200 bytes gets a 2-byte prefix, for 202 bytes in total; that is why a fixed + 1 is sometimes off.

Code:

    public static int GetBinaryWriterSizeRequired(this string str, Encoding encoding)
    {
        encoding = encoding ?? Encoding.Default;

        int byteCount = encoding.GetByteCount(str);
        int byteCountRequiredToWriteTheSize = 1;

        // EO: This code is based on the Microsoft Source Code of the BinaryWriter at:
        // https://referencesource.microsoft.com/#mscorlib/system/io/binarywriter.cs,2daa1d14ff1877bd
        // Mirror BinaryWriter.Write7BitEncodedInt: each additional 7 bits of the
        // length value needs one more prefix byte.
        uint v = (uint)byteCount;
        while (v >= 0x80)
        {
            v >>= 7;
            byteCountRequiredToWriteTheSize++;
        }

        return byteCountRequiredToWriteTheSize + byteCount;
    }
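A quick sanity check of the method above (just a sketch; the 200-character test string is an arbitrary example chosen so that the 7-bit encoded length needs two bytes):

    using (var stream = new MemoryStream())
    using (var writer = new BinaryWriter(stream, Encoding.UTF8))
    {
        string s = new string('x', 200);                         // 200 bytes in UTF-8
        int sizeRequired = s.GetBinaryWriterSizeRequired(Encoding.UTF8);

        long before = writer.BaseStream.Position;
        writer.Write(s);
        long written = writer.BaseStream.Position - before;

        Console.WriteLine("{0} == {1}", sizeRequired, written);  // 202 == 202 (2-byte prefix + 200 bytes)
    }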

The call:

    ...
    _writer = new BinaryWriter(_stream);
    _writerEncoding = _writer.GetPrivateFieldValue<Encoding>("_encoding");
    ...
    int sizeRequired = name.GetBinaryWriterSizeRequired(_writerEncoding);

The helper used above (I know we should not read private fields, but I did it anyway):

    public static T GetPrivateFieldValue<T>(this object obj, string propName)
    {
        if (obj == null)
            throw new ArgumentNullException("obj");

        Type t = obj.GetType();
        FieldInfo fi = null;
        while (fi == null && t != null)
        {
            fi = t.GetField(propName, BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
            t = t.BaseType;
        }

        if (fi == null)
            throw new ArgumentOutOfRangeException("propName", string.Format("Field {0} was not found in Type {1}", propName, obj.GetType().FullName));

        return (T)fi.GetValue(obj);
    }
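If you control how the BinaryWriter is created, a simpler alternative (a sketch, not what I originally did) is to pass the encoding yourself instead of reading the private _encoding field back through reflection; the BinaryWriter(Stream) constructor uses new UTF8Encoding(false, true) internally, so this matches the default behaviour:

    // Sketch: create the writer with an explicit encoding so no reflection is needed.
    Encoding writerEncoding = new UTF8Encoding(false, true);   // what BinaryWriter(Stream) uses by default
    _writer = new BinaryWriter(_stream, writerEncoding);
    int sizeRequired = name.GetBinaryWriterSizeRequired(writerEncoding);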

Just for reference, these all give the wrong count:

    return Encoding.UTF8.GetByteCount(str) + 1;
    return System.Text.ASCIIEncoding.Default.GetByteCount(str) + 1;
    return System.Text.ASCIIEncoding.Unicode.GetByteCount(str) + sizeof(int); // sizeof(int) to hold the length
