
Java program unable to print Hindi or Gujarati text from MySQL on Ubuntu

I am having trouble printing Gujarati or Hindi text using Java (Tomcat server) with MySQL on Ubuntu. I have to generate HTML from a MySQL database using Java, and that HTML is displayed in a browser. The same output is also converted to PDF using wkhtmltopdf. Although I can enter Gujarati data into the table through MySQL Workbench, Java prints it as ?????.

I have done the following:

1) Altered the text column of the corresponding MySQL table, adding

CHARACTER SET utf8 COLLATE utf8_unicode_ci;

so that it can store the Gujarati / Hindi text properly.

2) In the JDBC URL, I have added

useUnicode=true&characterEncoding=utf8

At the MySQL level I have applied

SET character_set_server=utf8mb4;

3) In the Java code I have applied

System.setProperty("file.encoding", "UTF-8");

It still returns ?????. Please let me know what else is required to fetch Gujarati characters from a MySQL database using Java on Ubuntu and display them in a browser.

Thanks in advance for your help.

useUnicode=true&characterEncoding=utf8

-->

useUnicode=yes&characterEncoding=UTF-8
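For illustration, here is a minimal sketch of assembling such a URL in Java. The host, port and database name are placeholders I made up, and the helper class is hypothetical; only the two parameter names and values come from the lines above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that builds a MySQL JDBC URL requesting UTF-8 on the connection. */
final class Utf8JdbcUrl {

    // host, port and database are placeholders for illustration only.
    public static String build(String host, int port, String database) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("useUnicode", "yes");
        params.put("characterEncoding", "UTF-8");

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(entry.getKey() + "=" + entry.getValue());
        }
        return "jdbc:mysql://" + host + ":" + port + "/" + database + "?" + query;
    }

    public static void main(String[] args) {
        System.out.println(build("localhost", 3306, "testdb"));
    }
}
```

If the URL is written inside an XML file such as context.xml rather than in Java code, the & has to be escaped as &amp;.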

You say the column is now set to "CHARACTER SET utf8 COLLATE utf8_unicode_ci;". Was the INSERT done after the ALTER? If it was done before, then nothing can fix the question marks.
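To see why question marks cannot be repaired afterwards, here is a small Java-only sketch. It is an analogy, not MySQL itself: it pushes a Gujarati string through ISO-8859-1, a charset with no Gujarati letters, and each character is replaced by a literal '?'. The class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: a charset that lacks the characters turns them into '?'. */
final class QuestionMarkDemo {

    // The word "Gujarati" in Gujarati script, written as escapes so the source stays ASCII.
    public static final String GUJARATI = "\u0A97\u0AC1\u0A9C\u0AB0\u0ABE\u0AA4\u0AC0";

    /** Encodes with ISO-8859-1 and decodes the bytes again; unmappable characters are lost. */
    public static String throughLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1); // each unmappable char becomes '?'
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    /** Same round trip through UTF-8, which can represent every Unicode character. */
    public static String throughUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String damaged = throughLatin1(GUJARATI);
        System.out.println(damaged);                                // prints ???????
        System.out.println(damaged.equals(GUJARATI));               // prints false
        System.out.println(throughUtf8(GUJARATI).equals(GUJARATI)); // prints true
    }
}
```

Once the stored value is a run of '?' characters, the original letters are gone, so the rows would have to be inserted again after the column change.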

It was finally solved. I placed a simple test.html file containing Gujarati characters in the jsp folder of the Tomcat server. Even that was not displayed properly in the browser. The same file saved as test.jsp did not display the characters either. This hinted that the problem was not the Java-MySQL combination, as I had thought earlier.

The same Ubuntu server also runs a PHP server. When served from a site hosted on that PHP server, the same simple HTML page displayed properly in the same browser. This suggested that nothing needed to change at the Ubuntu level, but that some configuration was needed at the Tomcat level.

It was resolved as follows.

1) At the servlet level I added the following two lines:

response.setContentType("text/html; charset=UTF-8");
response.setCharacterEncoding("UTF-8");

2) For the JSP page I added:

<%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"%>

and in the program-generated HTML page I added the following tag:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

3) In Tomcat's server.xml I added URIEncoding="UTF-8" to the Connector element:

<Connector port="8082" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8444"
               URIEncoding="UTF-8"/>

4) In web.xml I added the following for JSP pages:

<jsp-config>
    <jsp-property-group>
        <url-pattern>*.*</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>

With this, whatever is placed in the jsp folder (JSP or HTML page) can display Unicode characters. After this change, the test.html and test.jsp files mentioned above displayed the characters properly. However, the servlet still could not display them, so the following step was applied.
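One possible explanation for the servlet behaviour, which is my assumption and not something verified on this setup: under the servlet specification a response writer falls back to ISO-8859-1 when no character encoding has been set before the writer is obtained, and that charset has no Gujarati or Hindi letters. The sketch below does not use the servlet API at all. It only imitates a response writer with a plain JDK Writer so the effect can be run standalone; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a servlet response writer bound to a given charset. */
final class ResponseEncodingDemo {

    // The word "Hindi" in Devanagari script, written as escapes so the source stays ASCII.
    public static final String HINDI = "\u0939\u093F\u0928\u094D\u0926\u0940";

    /** Writes a tiny HTML page through a Writer that encodes with the given charset. */
    public static byte[] render(String text, Charset charset) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(body, charset)) {
            out.write("<html><head>");
            out.write("<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">");
            out.write("</head><body>");
            out.write(text);
            out.write("</body></html>");
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return body.toByteArray();
    }

    /** Decodes the bytes as UTF-8, which is what the meta tag tells the browser to do. */
    public static String asBrowserWouldRead(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String good = asBrowserWouldRead(render(HINDI, StandardCharsets.UTF_8));
        String bad = asBrowserWouldRead(render(HINDI, StandardCharsets.ISO_8859_1));
        System.out.println(good.contains(HINDI)); // prints true
        System.out.println(bad.contains(HINDI));  // prints false
        System.out.println(bad);                  // the body text shows as question marks
    }
}
```

The meta tag alone does not help in the second case, because the characters were already replaced while the bytes were being written.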

5) As advised on a discussion page, I added a Java filter and the corresponding tags in web.xml, as shown below.

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

// Applies a character encoding to every request and response that passes through it.
public class CharsetFilter implements Filter {
    private String encoding;

    @Override
    public void init(FilterConfig config) throws ServletException {
        // Read the encoding from the "requestEncoding" init-param; default to UTF-8.
        encoding = config.getInitParameter("requestEncoding");
        if (encoding == null) {
            encoding = "UTF-8";
        }
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain next)
            throws IOException, ServletException {
        // Only set the request encoding when the client did not declare one.
        if (request.getCharacterEncoding() == null) {
            request.setCharacterEncoding(encoding);
        }
        // Set the response encoding before the servlet obtains its writer.
        response.setContentType("text/html; charset=UTF-8");
        response.setCharacterEncoding("UTF-8");
        next.doFilter(request, response);
    }

    @Override
    public void destroy() {
        // Nothing to release.
    }
}

Then I added the following tags to web.xml:

<filter>
    <filter-name>CharsetFilter</filter-name>
    <filter-class>CharsetFilter</filter-class>
    <init-param>
        <param-name>requestEncoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
</filter>

<filter-mapping>
    <filter-name>CharsetFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

After applying this, the servlet (which sends the HTML generated from MySQL by the Java code) can now display Gujarati / Hindi characters in the browser. I believe the same technique applies to any other such language.
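A note on step 3, since it works on the request side rather than the page output: URIEncoding tells the connector which charset to use when decoding percent-encoded bytes in the URL, such as query parameters. The following JDK-only sketch shows the idea with URLDecoder. It is not Tomcat's own code, and the class and method names are made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Hypothetical demo: the same percent-encoded bytes decoded with two different charsets. */
final class UriEncodingDemo {

    // The word "Gujarati" in Gujarati script, written as escapes so the source stays ASCII.
    public static final String GUJARATI = "\u0A97\u0AC1\u0A9C\u0AB0\u0ABE\u0AA4\u0AC0";

    /** Percent-encodes a value the way a browser sending UTF-8 would. */
    public static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /** Decodes a percent-encoded value with the named charset. */
    public static String decodeParam(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String raw = encodeParam(GUJARATI);
        System.out.println(raw);
        System.out.println(decodeParam(raw, "UTF-8").equals(GUJARATI));      // prints true
        System.out.println(decodeParam(raw, "ISO-8859-1").equals(GUJARATI)); // prints false
    }
}
```

Decoding with the wrong charset does not fail. It quietly yields different characters, which is why a mismatch here tends to show up as garbled parameter values instead of an error.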

The following discussion links helped me resolve the issue.

https://wiki.duraspace.org/pages/viewpage.action?pageId=34638116

How to get UTF-8 working in Java webapps?

UtF-8 format not working in servlet for tomcat server

https://dertompson.com/2007/01/29/encoding-filter-for-java-web-applications/
