UTF-8 encoding a servlet form submission with Tomcat

I'm attempting to post a simple form that includes Unicode characters to a servlet action. On Jetty, everything works without a snag. On a Tomcat server, UTF-8 characters get mangled.

The simplest case I've got:

Form:

<form action="action" method="post">
  <input type="text" name="data" value="It’s fine">
</form>

Action:

class MyAction extends ActionSupport {   
  public void setData(String data) {
    // data is already mangled here in Tomcat
  } 
}

  • I've got URIEncoding="UTF-8" on <Connector> in server.xml
  • The first filter on the action calls request.setCharacterEncoding("UTF-8");
  • The content type of the page that contains the form is "text/html; charset=UTF-8"
  • Adding "accept-charset" to the form makes no difference

The only two ways I can make it work are to use Jetty or to switch the form to method="get". Either way, the characters come through without a problem.

I've got URIEncoding="UTF-8" on <Connector> in server.xml

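That setting only controls how the request URI, including the query string, is decoded, so it is only relevant for GET requests. It has no effect on the body of a POST request.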


The first filter on the action calls request.setCharacterEncoding("UTF-8");

Fine, that should apply to POST requests. You only need to make sure that you haven't called getParameter(), getReader(), getInputStream() or anything else that would trigger parsing of the request body before calling setCharacterEncoding().
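To illustrate why the order matters, here is a minimal sketch using only the JDK. It assumes the container falls back to ISO-8859-1 when the body is parsed before any encoding has been set, which is the classic servlet default. The class name FormDecodingDemo and its decode helper are hypothetical and are not part of Tomcat or the servlet API; they only imitate what happens when the percent-encoded body is decoded with one charset or the other.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only imitates form-body decoding with the JDK.
public class FormDecodingDemo {

    // Decodes one application/x-www-form-urlencoded value with the given charset.
    public static String decode(String rawValue, Charset charset) {
        try {
            return URLDecoder.decode(rawValue, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Should not happen: the name comes from an existing Charset instance.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Bytes a browser would send for "It's fine" written with a
        // right single quotation mark (U+2019) from a UTF-8 page.
        String body = "It%E2%80%99s+fine";

        // Decoded with the charset the browser actually used: the text is intact.
        System.out.println(decode(body, StandardCharsets.UTF_8));

        // Decoded with the assumed fallback charset: the three UTF-8 bytes
        // of U+2019 turn into three separate characters, i.e. mangled text.
        System.out.println(decode(body, StandardCharsets.ISO_8859_1));
    }
}
```

Once the body has been decoded with the wrong charset, a later call to setCharacterEncoding() cannot repair the parameters, which is why the call has to come before the first parameter access.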


The content type of the page that contains the form is "text/html; charset=UTF-8"

How exactly are you setting it? If it is done in a <meta> tag, you need to understand that the browser ignores it when the page is served over HTTP and the HTTP Content-Type response header is present. The average web server already sets that header by default. The <meta> content type will then only be used when the page is saved to local disk and viewed from there.

To set the response header charset properly, add the following to the top of your JSP:

<%@page pageEncoding="UTF-8" %>

By the way, this also tells the server to send the response in the given charset.
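As a rough illustration of why the response charset matters, the sketch below encodes the sample text with two charsets using only the JDK. The class name ResponseEncodingDemo and its helpers are hypothetical; the comparison with ISO-8859-1 assumes that this is the charset the page would otherwise be written in.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only shows what each charset does to the text.
public class ResponseEncodingDemo {

    // Encodes the text the way a UTF-8 response would.
    public static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Encodes the text the way an ISO-8859-1 response would.
    public static byte[] encodeLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "It's fine" with a right single quotation mark (U+2019).
        String text = "It\u2019s fine";

        // UTF-8 can represent U+2019; it takes three bytes.
        System.out.println(encodeUtf8(text).length);

        // ISO-8859-1 cannot represent U+2019; getBytes substitutes '?'.
        System.out.println((char) encodeLatin1(text)[2]);
    }
}
```

If the form page itself is sent in a charset that cannot represent the characters, they are already lost before the browser ever submits the form.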


Adding "accept-charset" to the form makes no difference

It only makes a difference in MSIE, and even there it is used wrongly. The whole attribute is worthless anyway. Forget it.
