Encoding issue when upgrading from WebSphere 7 to WebSphere 8.5.5

Question

We recently moved an application from WAS 7.0 (on AIX) to WAS 8.5.5 (on Linux). It interfaces with a couple of applications that send data in the form of XML.

The XML is retrieved from the request body using:

while ((i = request.getReader().read(buf, 0, buf.length)) != -1) {
    sb.append(buf, 0, i);
}

However, after the transition we noticed that the application was not handling special characters like è or © correctly: they come out garbled.

This looks to me like an encoding issue. Can anyone point out what needs to be checked to understand the root cause?
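To make the symptom concrete, here is a minimal sketch of how this kind of garbling typically arises: the bytes are written in one charset and decoded in another. The class and method names are hypothetical and not part of the application; the characters are written as Unicode escapes so that the source file encoding does not affect the result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode text with one charset and decode the resulting bytes with another.
    static String roundTrip(String text, Charset written, Charset read) {
        byte[] wire = text.getBytes(written);
        return new String(wire, read);
    }

    public static void main(String[] args) {
        String original = "\u00E8 \u00A9"; // "è ©"
        // Sender writes UTF-8, receiver assumes ISO-8859-1: each character turns into two.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
        // Sender writes ISO-8859-1, receiver assumes UTF-8: invalid bytes are replaced by U+FFFD.
        System.out.println(roundTrip(original, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
        // Matching charsets round-trip cleanly.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
    }
}
```

Comparing the garbled output in the application with these two patterns may hint at which side is using which charset.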

I was reading further on this and I see that I can set the JVM argument

-Dclient.encoding.override=UTF-8

to always use UTF-8. Is this a good practice?

Edit:

Locale output on Linux

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Locale output on AIX

LANG=en_US
LC_COLLATE="en_US"
LC_CTYPE="en_US"
LC_MONETARY="en_US"
LC_NUMERIC="en_US"
LC_TIME="en_US"
LC_MESSAGES="en_US"
LC_ALL=

One application sends the XML as <?xml version="1.0" encoding="ISO-8859-1"?> and the other sends it as <?xml version="1.0">.

After applying the JVM setting mentioned above, the <?xml version="1.0"> document is handled correctly, but the one with the encoding set to ISO-8859-1 is not. I am totally lost here.
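One possible way to handle senders that use different encodings is to hand the raw bytes to an XML parser instead of decoding them to characters first. A standard JAXP parser reads the encoding from the XML declaration and falls back to UTF-8 when none is declared. The sketch below assumes that each sender really encodes its bytes as declared and that the declarations are well-formed (ending in ?>); the class name and sample documents are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlBytesDemo {
    // Parse raw bytes; the parser picks the charset from the XML declaration,
    // or assumes UTF-8 when the declaration names no encoding.
    static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical payloads: same text, different byte encodings.
        byte[] latin1 = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><msg>\u00E8 \u00A9</msg>"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = "<?xml version=\"1.0\"?><msg>\u00E8 \u00A9</msg>"
                .getBytes(StandardCharsets.UTF_8);
        // Both documents should yield the same text content.
        System.out.println(rootText(latin1).equals(rootText(utf8)));
    }
}
```

In a servlet the bytes would come from request.getInputStream() rather than request.getReader(); whether that fits depends on how the application consumes the XML afterwards.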

Answer 1

It seems your application is not written to use a specific encoding and therefore uses the default of your session.

Check the locale on both AIX and Linux with the locale command. On Linux it might be something like LANG=en_US.UTF-8.
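Besides the shell's locale, it can help to see what the JVM itself reports. The following is a small sketch (hypothetical class name) that prints the default charset, the file.encoding system property, and the LANG environment variable as seen by the Java process. Running it with the same user and environment as the application server on each machine would show whether the two platforms differ; note that the container may apply its own rules when decoding requests, so this is only one piece of the picture.

```java
import java.nio.charset.Charset;

class DefaultEncodingReport {
    // Summarise the encoding-related settings visible to this JVM.
    static String report() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding")
                + ", LANG = " + System.getenv("LANG");
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```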

To make your application behave the same on Linux as on AIX, set the locale on Linux to the same value as on AIX.

Using Unicode-aware applications is not a bad idea in general. But there are exceptions where you need to stick to another encoding, e.g. Latin-1 for some legacy systems. In that case you need to choose this encoding explicitly in your code wherever it is needed.
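Choosing the encoding explicitly could look roughly like the sketch below, which mirrors the read loop from the question but decodes the bytes with a charset named in the code instead of a platform default. The class and method names are hypothetical, and ISO-8859-1 is used only because one of the senders declares it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetRead {
    // Read the whole stream as text, decoding with the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a request body sent by a Latin-1 legacy system.
        byte[] body = "<msg>\u00E8 \u00A9</msg>".getBytes(StandardCharsets.ISO_8859_1);
        String text = readAll(new ByteArrayInputStream(body), StandardCharsets.ISO_8859_1);
        System.out.println(text.length()); // number of decoded characters
    }
}
```

In a servlet, the stream would be request.getInputStream(). An alternative within the Servlet API is to call request.setCharacterEncoding(...) before the first call to request.getReader(); which of the two suits this application is a judgment call.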

Answer 1
0 2015-12-10 11:06:29
