
How to solve UTF-8 in Java

I currently use

<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>

in my JSP page.

When I read data from a text box using request.getParameter("..."), it returns garbled text such as öÉ?É?É?öİ. I see this problem whenever I enter non-English characters. I added URIEncoding="UTF-8" to server.xml in Tomcat, but it still returned the same garbled text (öÉ?É?É?öİ). How can I solve it?

Thank you

EDIT

Thanks for your answers. I tried a few things, but nothing has fixed the problem.

Here's what I've done:

  • I added <Connector URIEncoding="UTF-8" .../> in server.xml.

  • The <meta ... charset=utf-8> tag is in place, and I tried request.setCharacterEncoding("UTF-8");

  • I also tried a <filter> tag in web.xml.

None of these fixes the problem. I'm wondering whether something else is wrong. (A reminder: I use <form method='post'>. I click the submit button, and when I read the data with request.getParameter("..") it is not decoded correctly.)

You can try this code in your servlet, before the first call to request.getParameter:

if(request.getCharacterEncoding() == null) {
    request.setCharacterEncoding("UTF-8");
}
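
To see why the request encoding matters for a POST form, here is a small sketch that decodes the same percent-encoded value twice with the standard java.net.URLDecoder. It only imitates what a container does when it parses parameters and is not Tomcat's actual code; the class name FormDecodingDemo and the sample value are made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it imitates parameter decoding, it is not Tomcat's code.
class FormDecodingDemo {

    // Decodes one percent-encoded form value using the given charset name.
    static String decode(String encodedValue, String charsetName) {
        try {
            return URLDecoder.decode(encodedValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Two characters (o-umlaut, dotted capital I) as a browser submits them
        // from a UTF-8 page: C3 B6 and C4 B0.
        String body = "%C3%B6%C4%B0";

        String right = decode(body, "UTF-8");       // the two original characters
        String wrong = decode(body, "ISO-8859-1");  // four garbled characters

        System.out.println(right.length() + " chars when decoded as UTF-8");
        System.out.println(wrong.length() + " chars when decoded as ISO-8859-1");
    }
}
```

The bytes on the wire are identical in both cases; only the charset used to read them differs, which is what request.setCharacterEncoding controls for the request body.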

Maybe the actual character encoding is not UTF-8? If the characters themselves are encoded in some other format, we can't simply label them as UTF-8.

Try decoding them with various charsets and see which one gives the proper result. I think the input character encoding is Latin-1 (ISO-8859-1). If so, use the code below:

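String param1 = request.getParameter("...");
if (param1 != null) {
    // Re-encode as ISO-8859-1 to recover the raw bytes, then decode them as UTF-8.
    // Without the second argument the platform default charset would be used.
    param1 = new String(param1.getBytes("ISO-8859-1"), "UTF-8");
}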
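
The "try various charsets" advice can be sketched roughly as below. This is only an illustration under the assumption that the text was really sent as UTF-8 and misread by the server; CharsetProbe and probe are hypothetical names, and the candidate list is just the two charsets every JVM is guaranteed to support.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for experimenting with a garbled parameter value.
class CharsetProbe {

    // For each candidate, turn the garbled text back into bytes with that
    // charset and decode those bytes as UTF-8. The caller picks the result
    // that looks right.
    static Map<String, String> probe(String garbled, String... candidates) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String name : candidates) {
            byte[] raw = garbled.getBytes(Charset.forName(name));
            results.put(name, new String(raw, StandardCharsets.UTF_8));
        }
        return results;
    }

    public static void main(String[] args) {
        // Simulate the bug: UTF-8 bytes misread as ISO-8859-1.
        String original = "\u00f6\u0130";
        String garbled = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        probe(garbled, "ISO-8859-1", "US-ASCII")
                .forEach((name, text) -> System.out.println(name + " -> " + text));
    }
}
```

Once you know which charset the server used by mistake, it is better to fix the configuration than to keep this kind of repair code in the servlet.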

UTF-8 should be set at every layer of the application.

Do the following:

1) HTML code

 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

2) Browser setting: in IE, View -- Encoding -- Unicode (UTF-8)

3) Tomcat server.xml: in the Connector tag, add the URIEncoding attribute:

<Connector port="8080" protocol="HTTP/1.1" 
           connectionTimeout="20000" 
           redirectPort="8443" URIEncoding="UTF-8"/>

catalina.sh/catalina.bat: add the following:

set JAVA_OPTS=-Xms256m -Xmx1024m -Xss268k -server -XX:MaxPermSize=256m -XX:-UseGCOverheadLimit -Djava.awt.headless=true -Djavax.servlet.request.encoding=UTF-8 -Dfile.encoding=UTF-8

set CATALINA_OPTS=-Dfile.encoding="UTF-8"

4) The MIME type of the submitted form (the Content-Type of the POST request) should be "application/x-www-form-urlencoded"

There is another place you can check. Did you include the following declaration in your JSP file?

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

I think the problem is that the browser still sends requests using the default ISO-8859-1, which is the standard charset if none is declared.

You can also check the HTTP headers received from the server to make sure the correct charset is sent back.
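
If you want to check that header from code rather than in the browser's developer tools, a minimal sketch for pulling the charset out of a Content-Type value could look like this. ContentTypeCharset and charsetOf are hypothetical names, and the parsing is deliberately simple rather than a full implementation of the header grammar.

```java
import java.util.Locale;

// Hypothetical helper; a simplified parser, not a complete header implementation.
class ContentTypeCharset {

    // Returns the charset parameter of a Content-Type value, or null if none is present.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Locale.ROOT avoids surprises with the Turkish dotless i when lower-casing.
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8"));
        System.out.println(charsetOf("text/html"));
    }
}
```

A null result for your page's response means no charset was declared, which is the situation described above.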

Essentially, the cleanest way to do it is to use Unicode escapes in your property files and/or in your code if need be (not advised).

This way you avoid all encoding issues: since your program only has to deal with ASCII, the proper representation is handled entirely by the client side, and you do not have to worry about the default OS encoding or the environment encoding.

You can also try adding the following filter to web.xml:
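
As a rough illustration of the property-file approach, the sketch below loads an ASCII-only properties text through the standard java.util.Properties class, which turns backslash-u escapes into real characters. The key name greeting and the class EscapedPropertiesDemo are invented for the example; the text is read from a string here instead of a real file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical demo class for ASCII-only property files with Unicode escapes.
class EscapedPropertiesDemo {

    // Parses properties text; Properties.load resolves Unicode escapes itself.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The file content is pure ASCII: the two characters are written as escapes.
        String fileContent = "greeting=\\u00f6\\u0130\n";
        String value = load(fileContent).getProperty("greeting");

        System.out.println("Loaded " + value.length() + " characters");
    }
}
```

This only helps with text that lives in your own files. It does not change how the container decodes what the browser submits.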

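<filter>
 <filter-name>Character Encoding Filter</filter-name>
 <filter-class>org.apache.catalina.filters.SetCharacterEncodingFilter</filter-class>
  <init-param>
   <param-name>encoding</param-name>
   <param-value>UTF-8</param-value>
  </init-param>
</filter>
<!-- The filter only runs if it is mapped to a URL pattern -->
<filter-mapping>
 <filter-name>Character Encoding Filter</filter-name>
 <url-pattern>/*</url-pattern>
</filter-mapping>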

Hope this helps

You should try this:

String content = request.getParameter("content");
if (content != null) {
    // Name the target charset explicitly; otherwise the platform default is used.
    content = new String(content.getBytes("ISO-8859-1"), "UTF-8");
}
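
To check this kind of repair outside the servlet, here is a small self-contained round trip. It assumes the browser sent UTF-8 and the server decoded it as ISO-8859-1; MojibakeRepair and repair are hypothetical names used only for this sketch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of undoing a wrong ISO-8859-1 decode.
class MojibakeRepair {

    // Recovers the raw bytes via ISO-8859-1, then decodes them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // o-umlaut and dotted capital I, written as escapes to keep the source ASCII.
        String original = "\u00f6\u0130";

        // Simulate the bug: UTF-8 bytes read as if they were ISO-8859-1.
        String garbled = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        System.out.println(repair(garbled).equals(original)); // prints true
    }
}
```

The round trip works because ISO-8859-1 maps every byte value to exactly one character. If the garbled text already contains literal '?' replacement characters, the original bytes are gone and no re-decoding can bring them back.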
