How to force browser to set charset in content-type http header
A simple HTML file:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<form method="post" action="test.jsp" accept-charset="utf-8" enctype="application/x-www-form-urlencoded">
<input type="text" name="P"/>
<input type="submit" value="subMit"/>
</form>
</body>
</html>
The HTML file is served by the server using the header Content-Type:text/html; charset=utf-8. Everything says: "dear browser, when you post this form, please post it utf-8 encoded". The browser actually does this. Every value entered in the input field will be UTF-8 encoded. BUT the browser won't tell this to the server! The HTTP header of the post request will contain a Content-Type:application/x-www-form-urlencoded field, but the charset will be omitted (tested with FF3.6 and IE8).
The problem is that the application server I use (Tomcat6) expects the charset in the Content-Type header (as stated in RFC2388). Like this: Content-Type:application/x-www-form-urlencoded;charset=utf-8. If the charset is omitted, it will assume ISO-8859-1, which is not the charset used for encoding. The result is broken data.
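To make the "broken data" concrete, here is a minimal sketch of the mismatch using only the Java standard library. The class and method names (`FormCharsetDemo`, `encodeAsBrowser`, `decodeAs`) are hypothetical, and the code only imitates what the browser and the server do with the bytes; it is not Tomcat's actual decoding path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: imitates a browser encoding a form value as UTF-8
// and a server decoding it with either the right or the wrong charset.
class FormCharsetDemo {

    // Percent-encode the value the way a browser does for a page served as UTF-8.
    static String encodeAsBrowser(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Percent-decode the value using the charset the server assumes.
    static String decodeAs(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "\u00e9" is e-acute; written as an escape to keep the source ASCII-only.
        String encoded = encodeAsBrowser("\u00e9");
        System.out.println("P=" + encoded); // prints P=%C3%A9

        String right = decodeAs(encoded, "UTF-8");      // one character, U+00E9
        String wrong = decodeAs(encoded, "ISO-8859-1"); // two characters, U+00C3 U+00A9
        System.out.println(right.length() + " vs " + wrong.length()); // prints 1 vs 2
    }
}
```

The request body is the same in both cases; only the charset assumed while decoding differs, which is why the missing charset parameter matters to the server.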
Does someone have a clue how to force the current browsers to append the charset to the Content-Type header?
"Does someone have a clue how to force the current browsers to append the charset to the Content-Type header?"
No, no browser has ever supplied a charset parameter with the application/x-www-form-urlencoded media type. What's more, the HTML spec which defines that type does not propose a charset parameter, so the server can't reasonably expect to get one.
(HTML4 does expect a charset for the subparts of a multipart/form-data submission, but even in that case no browser actually complies.)
accept-charset="utf-8"
accept-charset is broken in IE, and shouldn't be used. It won't make a difference either way for forms in pages served as UTF-8, but in other cases it can end up with inconsistent results.
No, with forms you just have to serve the page they're in as UTF-8, and the results should come back as UTF-8 (with no identifying marks to tell you that, except potentially for the _charset_ hack, but Tomcat doesn't support that).
So you have to tell the Servlet container what encoding to use for parameters if you don't want it to fall back to its default (which is usually wrong). In a limited set of circumstances you may be able to call ServletRequest.setCharacterEncoding() to do this, but this tends to be brittle, and doesn't work at all for parameters taken from the query string. There's not a standardised Servlet-level fix for this, sadly. For Tomcat you usually have to muck about with the server.xml instead of being able to fix it in the app.
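When the container configuration cannot be changed, a workaround sometimes used in application code is to undo the wrong decoding after the fact. This is not something the answer above proposes; it is a sketch that assumes the container decoded the bytes as ISO-8859-1 and that the browser really sent UTF-8. The class and method names (`ParamRepair`, `reinterpretAsUtf8`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: repairs a parameter value that was decoded as
// ISO-8859-1 although the submitted bytes were UTF-8.
class ParamRepair {

    static String reinterpretAsUtf8(String misdecoded) {
        if (misdecoded == null) {
            return null;
        }
        // ISO-8859-1 maps every byte to exactly one character, so this
        // recovers the bytes the browser originally sent.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        // Decode those bytes with the charset the browser actually used.
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00C3 U+00A9 is what the UTF-8 bytes of e-acute look like
        // after being decoded as ISO-8859-1.
        String repaired = reinterpretAsUtf8("\u00c3\u00a9");
        System.out.println(repaired.equals("\u00e9")); // prints true
    }
}
```

This is as brittle as the assumptions behind it: if a value was already decoded correctly, re-interpreting it will damage it, so it only makes sense when every request is known to arrive mis-decoded in the same way.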