Urdu (UTF-8) data using MySQL and JSP

This problem is a little odd. As far as I can tell, my code meets all the UTF-8 requirements of MySQL and JSP. I have two simple files: input.jsp (for taking input) and NewFile.jsp (for retrieving the matching rows from the database). The database QASKU.production has already been created and loaded with UTF-8 data, and it works fine. The problem is with data retrieved through a select statement, but not always. When I use this statement:

ResultSet rs = stmt.executeQuery("select * from QASKU.production");

all the data is retrieved and displayed perfectly.

But when I use either of these statements:

ResultSet rs = stmt.executeQuery("SELECT * FROM QASKU.production WHERE rhs LIKE '" + sent + "' ORDER BY prob DESC");

or

String query = "select * from QASKU.production WHERE rhs = ?";
PreparedStatement pstmt = con.prepareStatement( query );
pstmt.setString( 1, sent );
ResultSet rs = pstmt.executeQuery( );

whether the data is retrieved and displayed correctly depends on the input passed to NewFile.jsp from input.jsp.

The data in the database looks like this:

ADJ|اسسٹنٹ|0.001222
ADJ|اسلامی|0.01956
ADJP|ADJ ADJ|0.098214
ADJP|ADJ ADJ.DEG|0.044643

So when I give ADJ as the input value, the output displayed by NewFile.jsp is perfect.

But when I give, for example, "اسسٹنٹ" as the input value, the select statement fetches nothing and the result set stays empty, even though a record for "اسسٹنٹ" exists in the database.

I don't think this is a problem with MySQL or JSP. I think the problem lies within the select statement, but I'm not sure.

My code files are below:

input.jsp

 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" >
    <title>QASKU URDU PARSER</title>

    <script type="text/javascript" >
    var ids = [];
    var blurfocus = function(id){
          document.getElementById(id).onfocus = function(){
        if(!ids[id]){ ids[id] = { id : id, val : this.value, active : false }; }
        if(this.value == ids[id].val){
          this.value = "";
        }
          };
          document.getElementById(id).onblur = function(){
        if(this.value == ""){
          this.value = ids[id].val;
        }
      }
    }

    function checkSubmit(e)
    {
       if(e && e.keyCode == 13)
       {
          document.forms[0].submit();
       }
    }
    </script>

    </head>
    <body>
    <form name="myform" action="NewFile.jsp" method="post" enctype="application/x-www-form-urlencoded" >

       <div align="center" onKeyPress="return checkSubmit(event)">

    <h4>QASKU URDU PARSER</h4><br>
    <h5>Type sentence using Urdu/Arabic script only and then press the 'Parse' button below</h5><br>

    <textarea cols="100" rows="5" style="text-align: right" name="mytextarea" id="message" >Type here</textarea>
    <script type="text/javascript" >
    blurfocus("message");
    </script>

    <br><br>
    <input type="submit" value="Parse" >

    </div>

    </form>
    </body>
    </html>

and the second file, NewFile.jsp:

<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
<%@ page import="java.sql.*" %>
<%@ page import="java.io.*" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Insert title here</title>
</head>
<body>
<%
try
{
String sent=request.getParameter("mytextarea");
 out.println(sent);

Statement stmt;
Connection con;
String url = "jdbc:mysql://localhost:3306/";
Class.forName("com.mysql.jdbc.Driver");
con = DriverManager.getConnection(url, "root", ""); 
//stmt = con.createStatement();
stmt = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,ResultSet.CONCUR_UPDATABLE);
//out.println(con.getMetaData().getDatabaseProductVersion());
//stmt.executeUpdate("DROP DATABASE QASKU");
//out.println("Deleted");
//stmt.executeUpdate("CREATE DATABASE QASKU CHARACTER SET utf8 COLLATE utf8_general_ci");
//stmt.executeUpdate("CREATE TABLE QASKU.production(lhs varchar(50) NOT NULL, rhs varchar(200) NOT NULL, prob double NOT NULL) CHARACTER SET utf8 COLLATE utf8_general_ci");
//stmt.executeUpdate("LOAD DATA LOCAL INFILE '/QAS/JSP/myfirst/WebContent/PCFG.utf' INTO TABLE QASKU.production CHARACTER SET utf8  LINES TERMINATED BY '\r' ");

//ResultSet rs = stmt.executeQuery("SELECT USER(),CHARSET(USER()),COLLATION(USER())");
//ResultSet rs = stmt.executeQuery("select * from QASKU.production");
ResultSet rs = stmt.executeQuery("SELECT * FROM QASKU.production WHERE rhs LIKE '" + sent + "' ORDER BY prob DESC");

//String query = "select * from QASKU.production WHERE rhs = ?";
//PreparedStatement pstmt = con.prepareStatement( query );
//pstmt.setString( 1, sent );
//ResultSet rs = pstmt.executeQuery( );

if(rs != null)
{
%>  
    <table align=center border="1" bgcolor="green" width="75%">
    <col width="25">
    <col width="25">
    <col width="25">
    <tr>
        <th align=left>LHS</th>
        <th align=left>RHS</th>
        <th align=left>PROBABILITIES</th>
    </tr>
<%

    while(rs.next())
    {
        out.println("<tr><td align=left>"+rs.getString(1)+"</td>");
        out.println("<td align=left>"+rs.getString(2)+"</td>");
        out.println("<td align=left>"+rs.getDouble(3)+"</td></tr>");    
    }
}
else
{
    out.println("Result Set is Empty");
}

%>
    </table>
<%

con.close();
}
catch(Exception e)
{
out.println(e);
}
/*
try
    {
        BufferedReader reader = new BufferedReader(new FileReader("/QAS/JSP/myfirst/WebContent/PCFG.utf"));
        String text = "";
        while ((text = reader.readLine()) != null) 
            {
                out.println(text);
            }
    }
    catch(Exception e)
    {}
 */
%>

</body>
</html>

I don't know Urdu, but you should probably add % wildcards to your LIKE pattern.

Something like this:

ResultSet rs = stmt.executeQuery("SELECT * FROM QASKU.production WHERE rhs LIKE '%" + sent + "%' ORDER BY prob DESC");
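The wildcard suggestion above can also be combined with the PreparedStatement form from the question, so that the user's text is bound as a parameter instead of being concatenated into the SQL. The following is a minimal sketch, not the original poster's code: the class and method names (LikePattern, escapeLike, containsPattern) are hypothetical, and it assumes MySQL's default LIKE escape character, the backslash.

```java
public class LikePattern {
    // Escapes the LIKE wildcards (% and _) and the escape character itself.
    // Assumption: the server uses the default LIKE escape character '\'.
    static String escapeLike(String term) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == '\\' || c == '%' || c == '_') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Builds a "contains" pattern: any text, the literal term, any text.
    static String containsPattern(String term) {
        return "%" + escapeLike(term) + "%";
    }

    public static void main(String[] args) {
        System.out.println(containsPattern("ADJ"));
        // Intended use (needs a live connection, so shown only as a comment):
        // PreparedStatement ps = con.prepareStatement(
        //     "SELECT * FROM QASKU.production WHERE rhs LIKE ? ORDER BY prob DESC");
        // ps.setString(1, containsPattern(sent));
    }
}
```

Binding the pattern this way keeps quotes or wildcard characters in the input from changing the meaning of the query. It does not by itself address how the input was decoded before it reached the query.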

I finally solved my problem after 24 hours. The problem was related to a different statement:

String sent=request.getParameter("mytextarea");

This statement retrieves the value submitted from input.jsp via the post method. It is available in JSP by default, but it comes from the Java Servlet API. By default it does not decode the submitted value as UTF-8 (servlet containers typically fall back to ISO-8859-1), and the decoding differs between the two form methods, 'get' and 'post'. Because input.jsp used the 'post' method, the value arrived in a different encoding than expected; most JSP tutorials cover this. I solved the problem by merging the two files, input.jsp and NewFile.jsp, into one file and then simplifying the following line:

<form name="myform" action="NewFile.jsp" method="post" enctype="application/x-www-form-urlencoded" >

into this simpler form:

<form name="myform" method="get" >

Now the following statement reads the value submitted by the same page via the get method:

String sent=request.getParameter("mytextarea");

This is not a great solution, but at least it now works perfectly for Urdu, that is, for UTF-8 characters. So the final conclusion is that the bug (ASCII values were retrieved from MySQL but UTF-8 values were not) was caused by this statement and not by the others.
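The decoding mismatch described in this answer can be reproduced without a servlet container. The sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, the sample word is written with Unicode escapes so the example does not depend on source file encoding, and ISO-8859-1 stands in for the container's default parameter decoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormDecodingDemo {
    // What a browser does with a form field on a UTF-8 page:
    // percent-encode the UTF-8 bytes of the value.
    static String encodeAsBrowser(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // What the receiving side does: percent-decode using some charset.
    static String decodeWith(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String typed = "\u0627\u0633\u0644\u0627\u0645\u06CC"; // a sample Urdu word
        String wire = encodeAsBrowser(typed);
        String wrong = decodeWith(wire, "ISO-8859-1"); // assumed container default
        String right = decodeWith(wire, "UTF-8");

        System.out.println("ASCII survives ISO-8859-1 decoding: "
                + decodeWith(encodeAsBrowser("ADJ"), "ISO-8859-1").equals("ADJ"));
        System.out.println("Urdu survives ISO-8859-1 decoding: " + wrong.equals(typed));
        System.out.println("Urdu survives UTF-8 decoding: " + right.equals(typed));
    }
}
```

This is consistent with the symptom in the question: an ASCII value such as ADJ compares equal to the stored text whichever charset is used, while an Urdu value decoded with the wrong charset becomes a different string and matches no row.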

The perfect solution is to add URIEncoding="UTF-8" to the Connector element of the /conf/server.xml file, located in the Servers directory of your Tomcat project in Eclipse. Then all the encoding and decoding is handled automatically. A short but perfect solution:

<Connector URIEncoding="UTF-8" ...........>

This is the best solution I have found. Now pray for me.
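One caveat, offered as a hedged note rather than a correction: as far as Tomcat's documentation describes it, URIEncoding applies to the request URI, which covers get query strings. For a post body, the Servlet API provides request.setCharacterEncoding("UTF-8"), which has to be called before the first getParameter call. If neither option is available, a value that was already decoded as ISO-8859-1 can be repaired after the fact. The sketch below shows that repair; the class and method names are hypothetical, and it assumes the wrong decoding really was ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

public class ParamRepair {
    // Simulates the faulty step: UTF-8 bytes interpreted as ISO-8859-1.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
    }

    // Reverses the faulty step: recover the original bytes, then decode as UTF-8.
    // This works because ISO-8859-1 maps every byte value to exactly one character.
    static String recoverUtf8(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "\u0627\u0633\u0644\u0627\u0645\u06CC"; // a sample Urdu word
        String broken = misdecode(typed);
        System.out.println("Repaired value equals the original: "
                + recoverUtf8(broken).equals(typed));
        // In a JSP, the cleaner route is to avoid the faulty step entirely:
        // request.setCharacterEncoding("UTF-8"); // before any getParameter call
        // String sent = request.getParameter("mytextarea");
    }
}
```

The repair is a workaround only. Setting the request encoding up front avoids having to guess which charset the container used.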
