Correctly Convert const char* in C++ dll to string in C#

I'm writing a program that passes a const char* from my C++ dll into my C# code as a string. Certain characters don't pass the way I intend, which interferes with processing the string afterward.

For example, "ß.\x3" in C++ becomes "ß®\x3" when it reaches my C# program. In another case, "(\x2\x2" becomes "Ȩ\x2". I believe this may be a marshaling issue, but I am not entirely sure.

Below is some relevant code:

C++ code:

typedef void (__stdcall * OnPlayerTextMessageReceivedCallback)(const char* entityId, const char* textMessage);

void
ProcessTextMessage(
    const std::string& sender,
    const std::string& message
    )
{
    m_onPlayerTextMessageReceivedCallback(sender.c_str(), message.c_str());
}

C# code:

private delegate void OnPlayerTextMessageReceivedCallback(
            [MarshalAs(UnmanagedType.LPStr)] string senderEntityId,
            [MarshalAs(UnmanagedType.LPStr)] string message
            );

I tried marshaling the values with both LPStr and LPWStr, but I still run into the same issues.

I appreciate any insight on what's happening here.

The c_str() function returns a plain pointer to the char data, so that is not the problem. I assume the two sides use different encodings. I would recommend using UTF-8 on both. For LPStr, the .NET marshaller converts the string from and to the default system code page (e.g. cp1252), not UTF-8. It would be best to receive the raw pointer and decode it yourself, without relying on the .NET marshaller's automatic string conversion.

Sample C# code:

using System;

OnPlayerTextMessageReceivedCallback del = new Receiver().Receive;

// Emulates the call that the C++ side would make
del("Hello".ToUtf8(), "World".ToUtf8());

public delegate void OnPlayerTextMessageReceivedCallback(
    IntPtr senderEntityId,
    IntPtr message
);

class Receiver
{
    public void Receive(IntPtr senderEntityId, IntPtr message)
    {
        Console.WriteLine(senderEntityId.FromUtf8());
        Console.WriteLine(message.FromUtf8());
    }
}

public static class Utf8Util
{
    public static unsafe string FromUtf8(this IntPtr p)
    {
        int len = 0;
        Span<byte> sourceBytes = new(p.ToPointer(), int.MaxValue);
        while (true)
        {
            var b = sourceBytes[len];
            if (b == 0)
            {
                break;
            }
            else
            {
                len++;
            }
        }
        sourceBytes = sourceBytes.Slice(0, len);
        return System.Text.Encoding.UTF8.GetString(sourceBytes);
    }

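    public static IntPtr ToUtf8(this string s)
    {
        var data = System.Text.Encoding.UTF8.GetBytes(s);
        // Copy into unmanaged memory with a trailing NUL: a managed array can be moved by the GC
        // and has no terminator. The caller must release the result with Marshal.FreeHGlobal.
        var p = System.Runtime.InteropServices.Marshal.AllocHGlobal(data.Length + 1);
        System.Runtime.InteropServices.Marshal.Copy(data, 0, p, data.Length);
        System.Runtime.InteropServices.Marshal.WriteByte(p, data.Length, 0);
        return p;
    }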
}

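In C++ you should use a single encoding for all strings, e.g. UTF-8, rather than relying on a "default code page".

To do so, you can write the following in C++ (through C++17; since C++20 a u8 literal has type const char8_t[] and can no longer initialize a std::string directly):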

std::string myText = u8"This is a string in Utf8 encoding!";

For external string data you should convert it to your internal encoding.
