
Why would C# mark all methods and types as System.Runtime.InteropServices.CharSet.Ansi?

I was playing around with Visual Studio today, using C# to P/Invoke some Win32 APIs. That is when I noticed this tooltip (please open in a new tab to view full size):

(screenshot: Visual Studio tooltip)

It reads:

Although the common language runtime default is System.Runtime.InteropServices.CharSet.Auto, languages may override this default. For example, by default C# marks all methods and types as System.Runtime.InteropServices.CharSet.Ansi.

Why ANSI? In Mark Russinovich's Windows Internals, I read:

Because many applications deal with 8-bit (single-byte) ANSI character strings, many Windows functions that accept string parameters have two entry points: a Unicode (wide, 16-bit) version and an ANSI (narrow, 8-bit) version. If you call the narrow version of a Windows function, there is a slight performance impact as input string parameters are converted to Unicode before being processed by the system and output parameters are converted from Unicode to ANSI before being returned to the application.

So, am I understanding correctly that C#'s default when PInvoking unmanaged code is to accept that performance impact?

Edit:

So if I do something like:

[DllImport("kernel32.dll", CharSet = CharSet.Auto)]
public static extern bool Foo(IntPtr hHandle);

And let's say that inside kernel32.dll, there exists a FooA and a FooW ... how does C# know which entry point to use? The help text in Visual Studio makes me think that it'll choose the ANSI entry point by default, but we would prefer the wide version if a performance impact (however negligible) can be avoided.

Pinvoke isn't only used to call winapi functions. In fact, that is the lesser use, since the .NET Framework already wraps a large chunk of the winapi. Much more common is calling legacy custom C code, as is plainly visible in the majority of pinvoke questions on this site. The default of CharSet.Ansi simply matches the default character type in the C language: char is an 8-bit type.

And yes, if you do use pinvoke to call an unwrapped winapi function, then using CharSet.Auto is rather important to avoid data corruption and conversion overhead. The pinvoke marshaller is otherwise completely ignorant of it being a winapi function; the Windows DLLs that contain these functions are indistinguishable from a custom DLL. Do note that Auto itself stopped being relevant a while ago: it only mattered on the Windows 9x versions that lacked the Unicode entry points, and the odds that your code will ever run on a machine that boots Windows 98 or ME today are vanishingly small.
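As a sketch of what that looks like in practice (FindWindow is a real user32 export that ships as FindWindowA and FindWindowW; the wrapper class name here is my own):

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeMethods
{
    // CharSet.Auto resolves to the Unicode ("W") entry point on any
    // NT-based Windows; CharSet.Unicode states that choice explicitly,
    // so no ANSI round-trip conversion of the string arguments occurs.
    [DllImport("user32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    public static extern IntPtr FindWindow(string lpClassName, string lpWindowName);
}
```

Without the CharSet property the marshaller would fall back to CharSet.Ansi, bind to FindWindowA, and convert both strings on every call.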


Beware that your pinvoke declaration is not very meaningful and in general unwise. Only winapi functions that take a string argument, or a pointer to a structure that contains a string, require the CharSet property. And you should almost always declare the argument as string, StringBuilder, or the struct type to let the pinvoke marshaller get it right. If you use IntPtr, the burden is on you to generate the proper string; you'll have to call Marshal.StringToHGlobalAnsi/Auto/Uni explicitly, or Marshal.StructureToPtr() with an appropriate [StructLayout] if it is a structure, which also has a CharSet property.
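Both approaches might look like this (GetWindowText is a real user32 function that fills a caller-supplied buffer; the helper names are mine):

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class NativeText
{
    // Preferred: declare the buffer as StringBuilder and let the
    // marshaller handle allocation and Unicode conversion.
    [DllImport("user32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    public static extern int GetWindowText(IntPtr hWnd, StringBuilder lpString, int nMaxCount);

    // Manual alternative when the parameter really is an IntPtr:
    // you allocate, convert, and free the native string yourself.
    public static IntPtr ToNativeUni(string s) => Marshal.StringToHGlobalUni(s);
    public static void FreeNative(IntPtr p) => Marshal.FreeHGlobal(p);
}
```

With the StringBuilder form you'd call it as `var sb = new StringBuilder(256); GetWindowText(hWnd, sb, sb.Capacity);` — no manual memory management required.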

The pinvoke marshaller has built-in knowledge of winapi functions having an extra A or W after their name. With CharSet.Ansi it first looks for the function without the extra letter and then tries the A version; with CharSet.Unicode it tries the W version first, then the unsuffixed name. The ExactSpelling property is available to disable that probing, but the lookup only happens once, so there's not much point in using it.
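If you did want to name the export explicitly and skip the probing, a sketch (GetModuleFileName genuinely exists in kernel32 only as GetModuleFileNameA/GetModuleFileNameW):

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class Kernel32
{
    // EntryPoint names the exact export; ExactSpelling = true forbids
    // the marshaller from appending an A or W suffix while probing.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode,
               EntryPoint = "GetModuleFileNameW", ExactSpelling = true)]
    public static extern uint GetModuleFileName(IntPtr hModule, StringBuilder lpFilename, uint nSize);
}
```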
