Text encoding turns into junk characters in Play! 1.2.4 framework

Question

Issue: Character encoding in the Play! 1.2.4 framework produces junk characters.

Context: We are trying to store the text "《我叫MT繁體版》台港澳專屬伺服器上線！" from an input text field in MySQL using the Play! 1.2.4 framework.

Steps that we followed:

1) A UI gets the input from the user. It can be text in any language, so we tried Japanese characters. Note: the page is set to UTF-8 character encoding.

2) The form is posted to a Play! controller, which just reads the input and stores it using a Play! model. The snippet is shown below:

public static void text_create() throws UnsupportedEncodingException,
        ParseException {
    System.out.println("params :: text string value :: "    + params.get("text"));

    String oldString = params.get("text");

    // Converting the input string (which is in UTF-8 format) to Windows-1252
    String newString = new String(oldString.getBytes(), "WINDOWS-1252");        

    // 1. passing encoded text to mysql. 
    // 2. TextCheck table and the column 'text' has encoding and collation format as UTF-8.
    // 3. TextCheck > text column mentioned as String in model.
    TextCheck a = new TextCheck(newString);

    List<Object> text = TextCheck.TextList();
    render(a,text);
}

It is stored in the TEXT column as "ã€Šæˆ'å «MTç¹ é«”ç‰ˆã€‹å °æ¸¯æ¾³å°ˆå±¬ä¼ºæœ å™¨ä¸Šç·šï¼ "

The problem is that there are extra characters in between the value. When I read this raw data from MySQL using other platforms such as Java, Ruby, or some other language, it converts, but those characters come out as junk.

Note: Interestingly, when I read it back from the same Play! framework, it all looks fine; even those junk characters are read correctly.

Question: Why do those junk characters appear?

Answer 1

The problem is the following line:

String newString = new String(oldString.getBytes(), "WINDOWS-1252");

This looks like nonsense to me. Java stores all strings internally using UTF-16, so you can't adjust the encoding of a Java string in the manner you've attempted here.
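To illustrate the point about strings versus encodings: a charset only matters at the moment a String is converted to bytes or back. Here is a minimal sketch; the class name StringVsBytes and the helper byteCount are made up for illustration, and the sample character is written as a Unicode escape so the source file's own encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StringVsBytes {

    // Hypothetical helper: how many bytes does this String need in a given charset?
    static int byteCount(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    // Encoding and decoding with the SAME charset gives back an equal String.
    static String roundTrip(String s, Charset charset) {
        return new String(s.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String s = "\u6211"; // a single CJK character (U+6211)

        // The String itself has no "encoding" to change; it is one char long.
        System.out.println("chars in String: " + s.length());

        // Different charsets produce different byte sequences for the same String.
        System.out.println("UTF-8 bytes:     " + byteCount(s, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE bytes:  " + byteCount(s, StandardCharsets.UTF_16BE));

        // Matching charsets on both sides keep the text intact.
        System.out.println("round trip ok:   " + roundTrip(s, StandardCharsets.UTF_8).equals(s));
    }
}
```

The String is the same object of text throughout; only the byte representations differ.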

The getBytes() method returns the bytes of the string using the default platform encoding. You then convert these bytes into a new string using a (probably) different charset. The result is almost certain to be broken.
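The effect can be reproduced outside Play!. The sketch below assumes the platform default encoding was UTF-8 when getBytes() ran (the question does not say, so this is an assumption) and spells that charset out explicitly so the result is the same on every machine. MojibakeDemo, garble and codePoints are hypothetical names; the sample text is a shortened version of the question's string, written with Unicode escapes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Same charset name the question's controller uses.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Reproduces the damage: encode as UTF-8 (assumed platform default),
    // then decode those bytes as if they were windows-1252.
    static String garble(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, CP1252);
    }

    // Prints chars as U+XXXX so the output does not depend on the console encoding.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Left angle bracket, two CJK characters, "MT", right angle bracket.
        String original = "\u300A\u6211\u53EBMT\u300B";
        String garbled = garble(original);

        System.out.println("original: " + codePoints(original));
        System.out.println("garbled:  " + codePoints(garbled));

        // Every UTF-8 byte has become its own character, so the text grows.
        System.out.println("chars before: " + original.length());
        System.out.println("chars after:  " + garbled.length());
    }
}
```

Under that assumption, the first character 《 turns into the three characters ã€Š, which matches the start of the stored value in the question. This suggests dropping the conversion and passing params.get("text") to the model unchanged.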

Answer 1: 2013-06-06 07:25:35
