
Strings are changed strangely when I pass a string from JavaScript to a Java applet

I'm making a web IRC client using JavaScript and a Java applet for the socket connection. I used Flash before, but its strict security restrictions limit which servers it can connect to, so I switched to a Java applet. I had not used applets before, so I ran into many problems, such as using <applet>, compiling the applet, and signing the jar. Now I'm seeing a strange phenomenon.

When a string is passed from JavaScript to the Java applet (like irc.sendLine("foobar")), extra characters are sometimes appended to it: characters whose code is 65533 (U+FFFD, the replacement character) or 127, and sometimes other ASCII values such as 110 or "(". It could be an encoding problem, but I don't think so, because both Java and the HTML page use UTF-8, and it happens even when the string consists only of alphanumeric characters.

More strangely, it happens only in Google Chrome. In Firefox there is no padding, which is the expected behavior.

I modified my Java applet code to debug the problem.

Below is part of my code (traceStr prints a string to the JavaScript console):

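public void sendLine(String s) {
    // Debug version: print the code points the applet actually received.
    traceStr(dumpStr(s));
}

// Returns the value of codePointAt for each index of s, separated by spaces.
private String dumpStr(String s) {
    String result = "";
    for (int i = 0; i < s.length(); i++) {
        result += s.codePointAt(i);
        if (i < s.length() - 1) result += " ";
    }
    return result;
}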
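One caveat about the dump helper: it advances one char at a time, so a character outside the Basic Multilingual Plane (stored as a surrogate pair) would be reported twice. None of the inputs in this question contain such characters, so the outputs shown are unaffected. A minimal sketch of a variant that steps by code point follows; the class name CodePointDump is hypothetical and the non-ASCII input is written with \u escapes only to keep the source file ASCII.

```java
// Hypothetical wrapper class; only dumpStr mirrors the method in the question.
class CodePointDump {
    // Returns the decimal code points of s, separated by single spaces.
    static String dumpStr(String s) {
        StringBuilder result = new StringBuilder();
        // Advance by the width of each code point so a surrogate pair is visited once.
        for (int i = 0; i < s.length(); i += Character.charCount(s.codePointAt(i))) {
            if (i > 0) {
                result.append(' ');
            }
            result.append(s.codePointAt(i));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpStr("foobar"));        // 102 111 111 98 97 114
        System.out.println(dumpStr("\u5929\u5730"));  // 22825 22320
    }
}
```

For plain ASCII and BMP input this should produce the same text as the original helper, so it can be swapped in without changing what the trace shows for the cases in the question.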

sendLine is invoked from the JS console, like irc.sendLine("foobar").

Here is some output, adding one more 0 to the argument each time:

48 40 65533
48 48 65533 65533 65533 127
48 48 48 65533 65533 127
48 48 48 48 65533 127
48 48 48 48 48 127
48 48 48 48 48 48
48 48 48 48 48 48 48
48 48 48 48 48 48 48 48
48 48 48 48 48 48 48 48 48
48 48 48 48 48 48 48 48 48 48
48 48 48 48 48 48 48 48 48 48 48
48 48 48 48 48 48 48 48 48 48 48 48
48 48 48 48 48 48 48 48 48 48 48 48 48
48 48 48 48 48 48 48 48 48 48 48 48 48 48
48 48 48 48 48 48 48 48 48 48 48 48 48 48 48
48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 99 111 110 110 101 99 116
48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 65533 65533 65533
48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 65533 65533 65533 127
48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 65533 65533 127
48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 65533 127
48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 127

Output for The quick brown fox:

84 104 101 32 113 117 105 99 107 32 98 114 111 119 110 32 102 111 120 65533 65533 127

Output for 天, 天1, 天11, 天111, and 天地:

22825 65533 65533 127
22825 49 65533 127
22825 49 49 127
22825 49 49 49
22825 22320
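One pattern in these outputs may be worth noting: the trailing values seem to follow the UTF-8 byte length of the argument rather than its character count. 天 (3 bytes) ends with 65533 65533 127, just like 000; 天1 (4 bytes) matches 0000; 天11 (5 bytes) matches 00000; 天111 and 天地 (6 bytes each) are clean, like 000000. That would be consistent with the extra data coming from a UTF-8 byte buffer somewhere between the browser and the plugin, though this is only a guess from the data shown. The sketch below just computes those byte lengths; the class name Utf8Length is hypothetical, and StandardCharsets requires Java 7 or later (on Java 6, getBytes("UTF-8") would be the equivalent).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for comparing inputs by their UTF-8 byte length.
class Utf8Length {
    // Number of bytes s occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The five CJK inputs from the question, written with \u escapes.
        String[] inputs = {
            "\u5929", "\u5929" + "1", "\u5929" + "11", "\u5929" + "111", "\u5929\u5730"
        };
        for (String input : inputs) {
            System.out.println(input.length() + " chars, " + utf8Length(input) + " bytes");
        }
    }
}
```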

I'm using Google Chrome 17.0.932.0 and Java 1.6.0_23 on Ubuntu 11.10. This didn't happen when I used Flash. Can anyone identify what I did wrong? From the outputs I guess that something is wrong related to UTF-8, but I'm out of ideas.

BTW, many answers to similar questions I found on SO mention ISO-8859-1. Is it related to this problem?

JavaScript strings are UTF-16, so a conversion from UTF-16 to UTF-8 could be happening with unintended side effects, especially for characters above 127 decimal.
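To illustrate where the value 65533 can come from: it is U+FFFD, the character Java's UTF-8 decoder substitutes for bytes that are not valid UTF-8, while 127 is a valid ASCII byte and passes through unchanged. The following is a minimal sketch, not a reproduction of what the browser plugin does; the class name ReplacementCharDemo and the chosen byte values are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of how stray bytes show up after UTF-8 decoding.
class ReplacementCharDemo {
    // Decodes raw bytes as UTF-8; malformed input is replaced rather than rejected.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // '0', then a byte that can never occur in UTF-8, then DEL (0x7F).
        byte[] bytes = {0x30, (byte) 0xFF, 0x7F};
        String s = decodeUtf8(bytes);
        for (int i = 0; i < s.length(); i++) {
            System.out.println((int) s.charAt(i)); // prints 48, then 65533, then 127
        }
    }
}
```

If the applet receives more bytes than the string actually contains, decoding the leftover bytes this way would yield exactly this kind of mix of 65533, 127, and ordinary ASCII values.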
