
MFC CEdit converts non-ascii characters to ascii

Original post: 2019-05-22 11:04:43. Tags: c++, visual-c++, unicode, mfc, mbcs

We have an MFC Windows application, originally written in VC++ 6 and updated over the years for newer IDEs; it is currently developed in VS2017.

The application is built with MBCS (not Unicode). Trying to switch to Unicode causes 3806 compile errors, and that is probably just the tip of the iceberg.

However, we want to be able to run the application with a different code page, namely 1250 (Central European).

I built a small test application and managed to get it to work with special characters (čćšđž). I did this by setting the dialog font to Microsoft Sans Serif with code page 1250. The same approach does not work in our application. Note: dialogs in our application are created dynamically, and the font is set using SetFont.

There is a difference in how special characters are treated in these two applications.

In the test application, the special characters are displayed in the edit control, and GetWindowText retrieves the right bytes. However, typing characters from other languages renders them as "????".
In our application, all special characters are rendered properly, but GetWindowText (or WM_GETTEXT) converts the special characters to their closest ASCII counterparts (čćđ -> ccd).

I believe that the edit control in our application displays Unicode text, but GetWindowText converts it to ASCII.

Does anyone have any idea what is happening here, and how I might solve it?

Note: I know how to convert the project to Unicode. We are choosing not to commit resources to it at the moment, as it would probably take weeks or months to implement. The question is how I might get it to work with MBCS, and why the edit control is converting Č to C.

2 answers

I believe it is absolutely possible to port the application to other languages/code pages. You only need to modify the .rc (resource) files, basically having one resource file per language, which you may want to do anyway, as strings in menus and/or string tables would be in a different language. As far as the application part is concerned, this is actually the only change needed.

The other part is the system you are running it on. A window can be Unicode or non-Unicode. You can see this with the Spy++ (Spyxx) utility, which tells you whether a window (procedure) is Unicode or not (Window Properties, General tab). While Unicode windows work properly, non-Unicode ones have to convert the text between Unicode and MBCS when getting or setting it. The conversion is based on the system (default) code page. This can only be set globally (for the whole machine), not per application or window. And of course, setting the font's code page is not enough (and IMO it's not needed at all if you are running the application on a machine with the "correct" code page). That is, for non-Unicode applications, only one code page will work properly; the others won't.

I can see two options:

If you only need to update a small number of controls, it may be possible to change only these controls to Unicode and use the "wide" versions of the get/set window-text functions or messages. You will have to convert the text between Unicode and your desired code page. It requires writing some code, but has the advantage that the conversion is independent of the system default code page; e.g. you can keep the code page in a configuration file, in the registry, or pass it as a command-line option (in the application's shortcut). Some control types can be changed to Unicode and others cannot, so please check the documentation. I used this technique successfully for an MBCS application displaying/editing translated strings in many different languages, but I only had one control, a List-View, which by the way offers the LVM_SETUNICODEFORMAT message, thus allowing Unicode text even in an MBCS application.
The easiest method is to simply run the application as is, but it will only work on machines with the proper default code page, as is the case for most non-Unicode applications.

The system default code page can be changed by setting the "Language for non-Unicode programs" option, available in the regional settings on the Administrative tab; the change requires a reboot. Changing the Windows UI language will change this option as well, but by setting this option directly you don't need to change the UI language, e.g. you can have an English UI and an Eastern European code page.

See a very similar post here.

Late to the party:

In our application, all special characters are rendered properly, but GetWindowText (or WM_GETTEXT) convert the special characters to the similar ascii counterpart (čćđ -> ccd).

That sounds like the ES_OEMCONVERT flag has been set for the control:

Converts text entered in the edit control. The text is converted from the Windows character set to the OEM character set and then back to the Windows character set. This ensures proper character conversion when the application calls the CharToOem function to convert a Windows string in the edit control to OEM characters. This style is most useful for edit controls that contain file names that will be used on file systems that do not support Unicode.
To change this style after the control has been created, use SetWindowLong.