Wrong Content-Length for response text containing umlauts

Question

I have a problem related to umlauts. I request a description from this handler:

@RequestMapping(value = "/description", method = RequestMethod.POST, consumes = "application/json", produces = "text/plain;charset=UTF-8")
@ResponseBody
private String getDescription() {
    return "ärchik";
}

On the frontend, response.responseText is missing the last letter: response.responseText = "ärchi".

I found that the problem is a wrong Content-Length: 7. If I set Content-Length: 8, it works and the full description "ärchik" is returned.

But I do not understand why it has to be 8:

"ärchik".getBytes("UTF-8").length = 7

Response Headers

Cache-Control:must-revalidate
Content-Length:7
Content-Type:text/plain;charset=utf-8
Date:Mon, 14 Apr 2014 09:08:26 GMT
Server:Apache-Coyote/1.1

Answer 1

I'm turning the core of my comment into an answer, since it seems I was on the right track.

The most likely reason for the string being one byte longer than expected is that the 'ä' was encoded as three bytes instead of two. This happens when the text uses not the precomposed code point U+00E4 (UTF-8: c3 a4) but the letter 'a' (a plain ASCII letter at U+0061) followed by the combining diaeresis U+0308, which together encode as 61 cc 88. Unicode defines several normalization forms, and the longer encoding is usually the result of conversion to NFD.
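The byte counts are easy to check with the JDK alone. The following is a minimal sketch, not taken from the original thread; the class name UmlautByteLengths and the helper utf8Length are made up for illustration, and the string uses Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

class UmlautByteLengths {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Precomposed form: U+00E4 followed by "rchik".
        String nfc = Normalizer.normalize("\u00e4rchik", Normalizer.Form.NFC);
        // Decomposed form: 'a' + U+0308 followed by "rchik".
        String nfd = Normalizer.normalize(nfc, Normalizer.Form.NFD);

        System.out.println("NFC bytes: " + utf8Length(nfc)); // c3 a4 + 5 ASCII bytes
        System.out.println("NFD bytes: " + utf8Length(nfd)); // 61 cc 88 + 5 ASCII bytes
        System.out.println("Same text after normalizing back: "
                + nfc.equals(Normalizer.normalize(nfd, Normalizer.Form.NFC)));
    }
}
```

Both strings render identically, which is why the difference only shows up when the bytes are counted.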

Judging by your own answer, you did exactly that normalization, at a point where the content length had already been determined from the un-normalized (or perhaps NFC-normalized) string.
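This also explains the missing last letter. As a rough simulation (the class TruncatedBody and the method readDeclared are hypothetical, and a real HTTP client is only approximated by cutting the byte array), a client that trusts a declared length of 7 while 8 bytes are sent decodes everything except the final 'k':

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.Arrays;

class TruncatedBody {
    // What a client decodes if it reads only declaredLength bytes of the body.
    static String readDeclared(byte[] body, int declaredLength) {
        byte[] seen = Arrays.copyOf(body, Math.min(declaredLength, body.length));
        return new String(seen, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e4rchik";
        // Length computed from the un-normalized string.
        int declared = original.getBytes(StandardCharsets.UTF_8).length;
        // Body actually sent, after NFD normalization.
        byte[] sent = Normalizer.normalize(original, Normalizer.Form.NFD)
                .getBytes(StandardCharsets.UTF_8);

        String received = readDeclared(sent, declared);
        System.out.println("declared=" + declared + ", sent=" + sent.length);
        System.out.println("received ends with \"rchi\": " + received.endsWith("rchi"));
    }
}
```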

Answer 2

It was my own fault. I had written a filter that does this:

// Content-Length is set to 7
chain.doFilter(request, wrappedResponse);
byte[] bytes = wrappedResponse.getByteArray();
String out = new String(bytes, utf8Charset); // 7 bytes in UTF-8
out = Normalizer.normalize(out, Normalizer.Form.NFD); // 8 bytes in UTF-8 after decomposition
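One way to avoid the mismatch, sketched here under the assumption that the filter buffers the whole body before writing it, is to derive the length from the bytes that are actually sent. The class NormalizedBody and its normalize helper are hypothetical and use only the JDK; the servlet wiring is left out.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

class NormalizedBody {
    // Hypothetical helper: decode the buffered body, normalize it, re-encode it.
    static byte[] normalize(byte[] original, Normalizer.Form form) {
        String text = new String(original, StandardCharsets.UTF_8);
        return Normalizer.normalize(text, form).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] buffered = "\u00e4rchik".getBytes(StandardCharsets.UTF_8);
        byte[] body = normalize(buffered, Normalizer.Form.NFD);
        // The header value must come from the final bytes, not from the buffered ones.
        System.out.println("buffered=" + buffered.length + ", Content-Length=" + body.length);
    }
}
```

In the filter itself this would mean calling response.setContentLength(body.length) on the real response before writing body, or normalizing to NFC instead if the decomposed form is not actually needed.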

Answer 3

The Spring/Tomcat response is correct.

Is response.responseText the Ajax response object?

My guess: the JS file is not encoded as UTF-8, or some JavaScript function does not handle UTF-8 correctly.


Answer 1: score 4, accepted, 2014-04-18 16:05:17

Answer 2: score 1, 2014-04-18 15:48:33

Answer 3: score 0, 2014-04-16 03:12:45
