
UTF-8 text is garbled when form is posted as multipart/form-data

I'm uploading a file to the server. The file upload HTML form has 2 fields:

  1. File name - An HTML text box where the user can give a name in any language.
  2. File upload - An HTML 'file' input where the user can specify a file from disk to upload.

When the form is submitted, the file contents are received properly. However, when the file name (point 1 above) is read, it is garbled. ASCII characters are displayed properly. When the name is given in some other language (German, French etc.), there are problems.

In the servlet method, the request's character encoding is set to UTF-8. I even tried using a filter as mentioned in "How can I make this code to submit a UTF-8 form textarea with jQuery/Ajax work?", but it doesn't seem to work. Only the filename seems to be garbled.

The MySQL table where the file name goes supports UTF-8. I entered random non-English characters & they are stored/displayed properly.

Using Fiddler, I monitored the request & all the POST data is passed correctly. I'm trying to identify how/where the data could get garbled. Any help will be greatly appreciated.

I had the same problem using Apache commons-fileupload. I did not find out what causes the problem, especially because I have the UTF-8 encoding set in the following places:

  1. HTML meta tag
  2. Form accept-charset attribute
  3. Tomcat filter on every request that sets the "UTF-8" encoding

-> My solution was to explicitly convert Strings from ISO-8859-1 (or whatever is the default encoding of your platform) to UTF-8:

new String (s.getBytes ("iso-8859-1"), "UTF-8");

Hope that helps.

Edit: starting with Java 7 you can also use the following:

new String (s.getBytes (StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
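To see why this conversion works, here is a minimal, self-contained sketch using only the JDK. It assumes the container decoded the UTF-8 bytes of the form field as ISO-8859-1; the class and method names (MojibakeRepair, garble, repair) are made up for illustration and are not part of any library.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {

    // Simulates the suspected bug: UTF-8 bytes decoded as ISO-8859-1.
    public static String garble(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses it: recover the raw bytes, then decode them as UTF-8.
    public static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "Gr\u00fc\u00dfe"; // German "Gruesse" written with u-umlaut and sharp s
        String garbled = garble(name);
        System.out.println(garbled.equals(name));         // false
        System.out.println(repair(garbled).equals(name)); // true
    }
}
```

The round trip is lossless only because ISO-8859-1 maps every byte value to exactly one character. If the wrong decoding was some other charset, the original bytes may not be recoverable this way, and applying repair to a string that was already decoded correctly will corrupt any non-ASCII characters in it.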

Just use the Apache Commons FileUpload library. Add URIEncoding="UTF-8" to Tomcat's connector, and use FileItem.getString("UTF-8") instead of FileItem.getString() without a charset specified.

Hope this helps.

I got stuck with this problem and found that it was the order of the call to

request.setCharacterEncoding("UTF-8");

that was causing the problem. It has to be called before any call to request.getParameter(), so I made a special filter to use at the top of my filter chain.

https://rogerkeays.com/servletrequest-setcharactercoding-ignored

I had the same problem and it turned out that in addition to specifying the encoding in the Filter

request.setCharacterEncoding("UTF-8");
response.setCharacterEncoding("UTF-8");

it is necessary to add "accept-charset" to the form

<form method="post" enctype="multipart/form-data" accept-charset="UTF-8" >

and run the JVM with

-Dfile.encoding=UTF-8

The HTML meta tag is not necessary if you send the charset in the HTTP header using response.setCharacterEncoding().

In case someone stumbles upon this problem when working on a Grails (or pure Spring) web application, here is the post that helped me:

http://forum.spring.io/forum/spring-projects/web/2491-solved-character-encoding-and-multipart-forms

To set the default encoding to UTF-8 (instead of ISO-8859-1) for multipart requests, I added the following code in resources.groovy (Spring DSL):

multipartResolver(ContentLengthAwareCommonsMultipartResolver) {
    defaultEncoding = 'UTF-8'
}

I'm using org.apache.commons.fileupload.servlet.ServletFileUpload.ServletFileUpload(FileItemFactory) and defining the encoding when reading out the parameter value:

List<FileItem> items = new ServletFileUpload(new DiskFileItemFactory()).parseRequest(request);

for (FileItem item : items) {
    String fieldName = item.getFieldName();

    if (item.isFormField()) {
        // Decode the text field explicitly as UTF-8 instead of the default charset
        String fieldValue = item.getString("UTF-8"); // <-- HERE
        // ... use fieldName and fieldValue
    }
}

The filter is key for IE. A few other things to check:

What is the page encoding and character set? Both should be UTF-8.

<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>

What is the character set in the meta tag?

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

Does your MySQL connection string specify UTF-8? e.g.

jdbc:mysql://127.0.0.1/dbname?requireSSL=false&useUnicode=true&characterEncoding=UTF-8

I am using PrimeFaces with GlassFish and SQL Server.

In my case I created a WebFilter in the back end to intercept every request and set its encoding to UTF-8, like this:

package br.com.teste.filter;

import java.io.IOException;

import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebFilter;

@WebFilter(servletNames={"Faces Servlet"})
public class Filter implements javax.servlet.Filter {

    @Override
    public void destroy() {
        // TODO Auto-generated method stub

    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {
        request.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);      
    }

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        // TODO Auto-generated method stub      
    }

}

In the view (.xhtml) I need to add the UTF-8 charset to the form's enctype parameter, like @Kevin Rahe:

    <h:form id="frmt" enctype="multipart/form-data;charset=UTF-8" >
         <!-- your code here -->
    </h:form>  

I had the same problem. The only solution that worked for me was adding <property name="defaultEncoding" value="UTF-8"/> to the multipartResolver bean in the Spring configuration file.

You also have to make sure that the encoding filter (org.springframework.web.filter.CharacterEncodingFilter) in web.xml is mapped before the multipart filter (org.springframework.web.multipart.support.MultipartFilter).

The filter thing and setting up Tomcat to support UTF-8 URIs is only important if you're passing the data via the URL's query string, as you would with an HTTP GET. If you're using a POST, with a query string in the HTTP message's body, what's important is the content-type of the request, and it is up to the browser to set the content-type to UTF-8 and send the content with that encoding.

The only way to really do this is by telling the browser that you can only accept UTF-8 by setting the Accept-Charset header on every response to "UTF-8;q=1,ISO-8859-1;q=0.6". This will put UTF-8 as the best quality and the default charset, ISO-8859-1, as acceptable, but a lower quality.

When you say the file name is garbled, is it garbled in the return value of HttpServletRequest.getParameter?

I think I'm late to the party, but when you use WildFly, you can add a default-encoding to the standalone.xml. Just search in the standalone.xml for

<servlet-container name="default"> 

and add the encoding like this:

<servlet-container name="default" default-encoding="UTF-8">

To avoid converting all request parameters manually to UTF-8, you can define a method annotated with @InitBinder in your controller:

@InitBinder
protected void initBinder(WebDataBinder binder) {
    binder.registerCustomEditor(String.class, new CharacterEditor(true) {
        @Override
        public void setAsText(String text) throws IllegalArgumentException {
            String properText = new String(text.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
            setValue(properText);
        }
    });
}

The above will automatically convert all request parameters to UTF-8 in the controller where it is defined.

You do not use UTF-8 to encode text data for HTML forms. The HTML standard defines two encodings, and the relevant part of that standard is here. The "old" encoding, that handles ASCII, is application/x-www-form-urlencoded. The new one, that works properly, is multipart/form-data.

Specifically, the form declaration looks like this:

 <FORM action="http://server.com/cgi/handle"
       enctype="multipart/form-data"
       method="post">
   <P>
   What is your name? <INPUT type="text" name="submit-name"><BR>
   What files are you sending? <INPUT type="file" name="files"><BR>
   <INPUT type="submit" value="Send"> <INPUT type="reset">
 </FORM>

And I think that's all you have to worry about - the webserver should handle it. If you are writing something that directly reads the InputStream from the web client, then you will need to read RFC 2045 and RFC 2046.
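As a rough illustration of what reading the body directly involves, the sketch below builds one text part the way a browser would typically send it from a UTF-8 page (raw UTF-8 bytes, no charset on the part) and decodes its value with two different charsets. It uses only the JDK and is not a real multipart parser; the class name, method names, boundary, and field name are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PartDecodingDemo {

    // Builds a single-part multipart/form-data body. The value is written as raw
    // UTF-8 bytes and the part headers carry no charset information.
    public static byte[] buildBody(String boundary, String fieldName, String value) {
        byte[] head = ("--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName + "\"\r\n"
                + "\r\n").getBytes(StandardCharsets.US_ASCII);
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        byte[] tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.US_ASCII);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(head, 0, head.length);
        out.write(data, 0, data.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    // Extracts the value of the first part and decodes it with the given charset.
    public static String readValue(byte[] body, Charset charset) {
        // ISO-8859-1 maps one byte to one char, so string offsets equal byte offsets.
        String raw = new String(body, StandardCharsets.ISO_8859_1);
        int start = raw.indexOf("\r\n\r\n") + 4; // first byte after the part headers
        int end = raw.indexOf("\r\n--", start);  // start of the closing boundary
        return new String(body, start, end - start, charset);
    }

    public static void main(String[] args) {
        String name = "Gr\u00fc\u00dfe"; // non-ASCII sample value
        byte[] body = buildBody("demoBoundary", "file-name", name);
        System.out.println(readValue(body, StandardCharsets.UTF_8).equals(name));      // true
        System.out.println(readValue(body, StandardCharsets.ISO_8859_1).equals(name)); // false
    }
}
```

The bytes on the wire are the same in both calls; only the charset chosen by the reader differs. That is consistent with the symptom in the question, where Fiddler shows correct POST data but the server-side string is garbled.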

Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you reproduce this page, please cite this site's URL or the original source.
