Passing a string from C# to cpp with COM

I have a C# COM server which is consumed by a cpp client.

One of the C# methods returns a string.

In cpp the returned string is represented in Unicode (UTF-16), at least according to the memory view.

  1. Is this always the case with COM strings?
  2. Is there a way to use UTF-8 instead?
  3. I saw some code where strings were passed between cpp and C# as byte arrays. Is there any benefit in this?

  1. Yes. The standard COM string type is BSTR. It is a Unicode string encoded in UTF-16, just like Windows' native string type.
  2. No, a COM method isn't going to understand a UTF-8 string; it will read the bytes as UTF-16 and turn the text into gibberish that usually looks like Chinese. UTF-8 is a good encoding for a text file, not for programs manipulating strings in memory. UTF-8 requires anywhere between 1 and 4 bytes to encode a Unicode code point, which is very incompatible with basic string manipulations like getting the size or indexing a character.
  3. C and C++ programs tend to use 8-bit encodings, compatible with the "char" type. That's an old practice, dating back to an era before Unicode was around. There's nothing attractive about it: there are many 8-bit encodings. The typical problem is that data entered as text can only be interpreted correctly if it is read by a program that uses the same 8-bit encoding. In other words, when the computers are less than 1000 miles apart. Less in Europe.
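
To make the two points about UTF-8 above concrete, here is a minimal C++ sketch. The function names (`utf8_length`, `utf8_code_points`, `misread_as_utf16le`) are made up for illustration and are not part of COM or any library. It shows how many bytes a code point needs in UTF-8, why the byte size of a UTF-8 string differs from its character count, and what happens when UTF-8 bytes are read as little-endian UTF-16 code units, assuming well-formed input.

```cpp
#include <cstddef>
#include <string>

// Number of bytes UTF-8 needs to encode one Unicode code point (1 to 4).
std::size_t utf8_length(char32_t cp) {
    if (cp < 0x80) return 1;      // ASCII
    if (cp < 0x800) return 2;     // e.g. accented Latin letters
    if (cp < 0x10000) return 3;   // rest of the Basic Multilingual Plane
    return 4;                     // supplementary planes, e.g. emoji
}

// Count code points in a UTF-8 string by skipping continuation bytes
// (bit pattern 10xxxxxx). Assumes the input is well-formed UTF-8.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++count;
    }
    return count;
}

// Reinterpret raw bytes as little-endian UTF-16 code units. This mimics
// what happens when UTF-8 data is handed to code that expects UTF-16.
// A trailing odd byte is dropped.
std::u16string misread_as_utf16le(const std::string& bytes) {
    std::u16string out;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        unsigned lo = static_cast<unsigned char>(bytes[i]);
        unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        out.push_back(static_cast<char16_t>(lo | (hi << 8)));
    }
    return out;
}
```

For example, the two ASCII bytes of "He" (0x48, 0x65) read as one UTF-16LE code unit give U+6548, which falls in the CJK Unified Ideographs block. That is the "turn it into Chinese" effect described in the answer.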

  1. No.
  2. Yes. Put the attribute [return: MarshalAs(UnmanagedType.LPStr)] before the method definition in C# if you'd like to return the string as an ANSI string instead of Unicode. Note that LPStr marshals through the ANSI code page, which is an 8-bit encoding but generally not UTF-8.
  3. Yeah, the author may have done that to maintain very fine-grained control over the encoding of the string's contents by side-stepping the default marshalling behavior.
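
If the cpp client needs UTF-8 internally, one option is to leave the COM interface alone and convert the UTF-16 data on the client side. The following is a portable sketch of that conversion, not the marshaller's own behavior. The name `utf16_to_utf8` is hypothetical, and the input is assumed to be the code units copied out of the returned BSTR into a `std::u16string`. Unpaired surrogates are replaced with U+FFFD, which is a design choice of this sketch.

```cpp
#include <cstddef>
#include <string>

// Convert UTF-16 code units (e.g. the payload of a BSTR) to UTF-8.
// Unpaired surrogates are replaced with U+FFFD (replacement character).
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }

        // Emit 1 to 4 UTF-8 bytes depending on the code point's magnitude.
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

On Windows, the usual way to do the same thing is the Win32 function WideCharToMultiByte with the CP_UTF8 code page; the hand-written version here only serves to show what the conversion involves.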
