
How to solve UTF-8 encoding problems in Java

I currently use

<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>

in my JSP page.

When I read data from a text box using request.getParameter("..."), I get garbled text like öÉ?É?É?öİ. The problem appears whenever the input contains non-English characters. I added URIEncoding="UTF-8" to server.xml in Tomcat, but the result is still the same (öÉ?É?É?öİ). How can I solve it?

Thank you

EDIT

Thanks for your answers. I tried a few things, but nothing has fixed the problem.

Here's what I've done:

  • I added <Connector URIEncoding="UTF-8" .../> in server.xml.

  • The <meta ... charset=utf-8> tag is in place, and I tried request.setCharacterEncoding("UTF-8");

  • I also tried a <filter> tag in web.xml.

None of these actions fixes the problem. I'm wondering if something else is wrong here. (Reminder: I use <form method='post'>. I click the submit button, and when I read the data with request.getParameter("..") it does not come back in the correct form.)

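You can try this code in your servlet. Call it before the first request.getParameter(...), because setting the encoding has no effect once the parameters have been read: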

if(request.getCharacterEncoding() == null) {
    request.setCharacterEncoding("UTF-8");
}

Maybe the actual character encoding is not UTF-8? If the characters themselves are encoded in some other charset, we can't simply label them as UTF-8.

Try decoding them with various charsets and see which one gives the proper result. I think the input character encoding is Latin-1 (ISO-8859-1). If so, use the code below:

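String param1 = request.getParameter("...");
if (param1 != null) {
    // Recover the raw bytes, then decode them as UTF-8 explicitly;
    // new String(bytes) without a charset would use the platform default.
    param1 = new String(param1.getBytes("ISO-8859-1"), "UTF-8");
}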
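To see why re-decoding can help, here is a minimal standalone sketch (no servlet container involved) that reproduces the corruption and then reverses it for a few candidate charsets. The class name CharsetProbe and its methods are made up for illustration, and it assumes the text was really sent as UTF-8 and only decoded with the wrong charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class CharsetProbe {

    // Simulates a container that decodes UTF-8 bytes as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // For each candidate "wrong" charset, turns the garbled text back into
    // bytes with that charset and decodes those bytes as UTF-8.
    static Map<String, String> repairWith(String garbled, String... wrongCharsets) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String name : wrongCharsets) {
            byte[] raw = garbled.getBytes(Charset.forName(name));
            results.put(name, new String(raw, StandardCharsets.UTF_8));
        }
        return results;
    }

    public static void main(String[] args) {
        String garbled = misdecode("\u00f6\u0130"); // the two characters ö and İ
        repairWith(garbled, "ISO-8859-1", "US-ASCII")
            .forEach((name, text) -> System.out.println(name + " -> " + text));
    }
}
```

Only the candidate that matches the charset actually used for the wrong decoding gives back the original text. If none of them does, the data was probably damaged earlier (for example replaced with ? characters), and re-decoding cannot recover it.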

UTF-8 should be set at every layer of the application.

Do the following:

1) HTML Code

 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

2) Browser setting: in IE, View -- Encoding -- Unicode (UTF-8)

3) Tomcat server.xml: add the "URIEncoding" attribute to the Connector tag:

<Connector port="8080" protocol="HTTP/1.1" 
           connectionTimeout="20000" 
           redirectPort="8443" URIEncoding="UTF-8"/>

In catalina.bat, add the following (in catalina.sh, use export with a quoted value instead of set):

set JAVA_OPTS=-Xms256m -Xmx1024m -Xss268k -server -XX:MaxPermSize=256m -XX:-UseGCOverheadLimit -Djava.awt.headless=true -Djavax.servlet.request.encoding=UTF-8 -Dfile.encoding=UTF-8

set CATALINA_OPTS=-Dfile.encoding="UTF-8"

4) The form data should be submitted with the Content-Type "application/x-www-form-urlencoded" (this is a property of the POST request, not of the response)
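Form fields sent as application/x-www-form-urlencoded travel as percent-encoded bytes, so the charset used to decode them decides what request.getParameter returns. The sketch below uses only the JDK's URLEncoder and URLDecoder to imitate this. The class FormDecodeDemo is hypothetical, and the real decoding inside Tomcat is of course not done by this code:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, for illustration only.
final class FormDecodeDemo {

    // Percent-encodes a value the way a form on a UTF-8 page would send it.
    static String encodeAsUtf8Form(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Decodes the percent-encoded value with the given charset.
    static String decodeAs(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String wire = encodeAsUtf8Form("\u00f6\u0130"); // the two characters ö and İ
        System.out.println("on the wire:        " + wire);
        System.out.println("decoded as UTF-8:   " + decodeAs(wire, "UTF-8"));
        System.out.println("decoded as Latin-1: " + decodeAs(wire, "ISO-8859-1"));
    }
}
```

The bytes on the wire are identical in both cases; only the charset chosen on the server side differs, which is why the request encoding has to be set before the parameters are parsed.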

There is another place you can check. Did you include the following declaration in your JSP file?

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

I think the problem is that the browser still sends requests using the default ISO-8859-1, which is the standard charset when none is declared.

You can also check the HTTP headers received from the server to make sure the correct charset is sent back.
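If you want to do that header check in code instead of in the browser's developer tools, a small helper along these lines could pull the charset out of a Content-Type header value. ContentTypeCharset and charsetOf are hypothetical names, and this is a simplified parser that only handles the common "type/subtype; charset=..." shape:

```java
// Hypothetical helper, not part of the servlet API.
final class ContentTypeCharset {

    private static final String KEY = "charset=";

    // Returns the charset declared in a Content-Type header value,
    // or null when the header declares none.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the parameter name.
            if (p.regionMatches(true, 0, KEY, 0, KEY.length())) {
                String value = p.substring(KEY.length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8"));
        System.out.println(charsetOf("text/html"));
    }
}
```

A null result means the response declares no charset at all, which is the case where the browser falls back to its own default.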

Essentially, the cleanest way is to use Unicode escapes (for example \u00f6) in your property files and, if need be, in your code (not advised).

This way you avoid all encoding issues, since your program only has to deal with ASCII. The proper representation is then handled entirely by the client side, and you do not have to worry about the default OS or environment encoding.
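As a rough illustration of the escape approach, java.util.Properties turns \uXXXX sequences into the real characters when it loads a file, so the file itself can stay pure ASCII. The class EscapedPropertiesDemo and the key "label" are made up for this sketch, and a StringReader stands in for a real .properties file:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class, for illustration only.
final class EscapedPropertiesDemo {

    // Loads properties from the given text and returns the value for key.
    static String loadValue(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The doubled backslashes keep the escapes as literal ASCII text,
        // exactly as they would appear inside a .properties file.
        String asciiOnly = "label=\\u00f6\\u0130";
        System.out.println(loadValue(asciiOnly, "label"));
    }
}
```

Note that this only covers text you ship with the application. It does not change how user input from a form is decoded.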

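You can also try adding the following filter to web.xml. The filter only runs if it is also mapped to a URL pattern, so include the <filter-mapping> element:

<filter>
 <filter-name>Character Encoding Filter</filter-name>
 <filter-class>org.apache.catalina.filters.SetCharacterEncodingFilter</filter-class>
  <init-param>
   <param-name>encoding</param-name>
   <param-value>UTF-8</param-value>
  </init-param>
</filter>
<filter-mapping>
 <filter-name>Character Encoding Filter</filter-name>
 <url-pattern>/*</url-pattern>
</filter-mapping>

Hope this helps.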

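You should try this:

String content = request.getParameter("content");
if (content != null) {
    // Decode the recovered bytes as UTF-8 explicitly, not with the platform default.
    content = new String(content.getBytes("ISO-8859-1"), "UTF-8");
}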


 