String character encoding

Question

We developed a specific exporter for them which allows the position based product to provide a type of portfolio snapshot - both for equities and fixed income portfolios.

We developed a specific exporter for them which allows the position based product to provide a type of portfolio snapshot â€“ both for equities and fixed income portfolios.

The first text is what I copy from Jira; the second is what gets printed in Cognity. I get the text from Jira in JSON format via the REST API, format it with a string builder, and finally return a normal String as the output. Symbols such as " ' - are not printed correctly, and I get a lot of â€“ in the output text. How can I fix that? I was wondering whether changing the encoding of the output String might work.

EDIT: This is what I use to get the information from Jira, after which I extract what I want from the returned JSON.

   String usercreds = "?os_username=user&os_password=password";
   try {
        url = new URL("http://jira/rest/api/2/issue/" + issuekey + usercreds);

        URLConnection urlConnection = url.openConnection();

        if (url.getUserInfo() != null) {
            String basicAuth = "Basic " + new String(new Base64().encode(url.getUserInfo().getBytes()));
            urlConnection.setRequestProperty("Authorization", basicAuth);
        }

        InputStream inputStream = urlConnection.getInputStream();
        BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
        while ((s = reader.readLine()) != null) {
            temp.append(s);
            s = "";
        }
        issue = new JSONObject(temp.toString());
        temp.setLength(0);
    } catch (IOException e) {
        e.printStackTrace();
    } catch (JSONException e) {
        e.printStackTrace();
    }

If I understood correctly, there should be a way for me to specify somewhere in this code that I want the output to be "application/json;charset=utf-8", and that might solve my problem?

Answer 1

The dash in the JSON response is U+2013 (EN DASH). When encoded as UTF-8 it forms the byte sequence e2 80 93. This data is being decoded using the wrong encoding (most likely windows-1252). Java's default I/O encoding is system-dependent.
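To see where the garbled characters come from, the effect can be reproduced in isolation. The following is a minimal sketch, not code from the question: the names `MojibakeDemo`, `misdecode` and `hex` are made up for illustration, and windows-1252 is only the likely platform default, not a confirmed one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from the question.
final class MojibakeDemo {

    // Encodes text as UTF-8, then decodes the bytes as windows-1252,
    // which is the mismatch suspected here.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    // Formats bytes as lowercase hex separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dash = "\u2013"; // EN DASH
        byte[] utf8 = dash.getBytes(StandardCharsets.UTF_8);
        System.out.println(hex(utf8)); // prints "e2 80 93"

        // Wrong charset: one character becomes three (U+00E2, U+20AC, U+201C).
        String garbled = misdecode(dash);
        System.out.println(garbled.length()); // prints 3

        // Right charset: the original character comes back.
        String restored = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(restored.equals(dash)); // prints true
    }
}
```

The String itself is not "in" an encoding once it has been decoded, so changing the encoding of the output String cannot undo the damage; the bytes have to be decoded correctly at the point where they are read.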

BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));

The above line is at fault. You must specify an encoding when transcoding using InputStreamReader.

For example:

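  import java.io.IOException;
  import java.io.InputStream;
  import java.io.InputStreamReader;
  import java.io.Reader;
  import java.net.URLConnection;
  import java.nio.CharBuffer;
  import java.nio.charset.StandardCharsets;

  // Reads the whole response body as UTF-8 and appends it to out.
  public static void readUtf8(URLConnection connection, Appendable out)
      throws IOException {
    CharBuffer buffer = CharBuffer.allocate(1024);
    try (InputStream in = connection.getInputStream();
         Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
      while (reader.read(buffer) != -1) {
        buffer.flip();   // switch the buffer from filling to draining
        out.append(buffer);
        buffer.clear();  // make it ready for the next read
      }
    }
  }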
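The example above hard-codes UTF-8. Since the question mentions "application/json;charset=utf-8", another option is to take the charset from the response's Content-Type header when the server declares one. This is a rough sketch under that assumption; `ResponseCharset`, `charsetOf` and `open` are hypothetical names, and the header parsing is deliberately simplistic.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the original answer.
final class ResponseCharset {

    // Extracts the charset parameter from a Content-Type value such as
    // "application/json;charset=utf-8". Falls back to UTF-8 when the
    // header is missing, has no charset, or names an unknown charset.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // illegal or unsupported name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Opens a Reader that decodes the body with the declared charset.
    static Reader open(URLConnection connection) throws IOException {
        Charset cs = charsetOf(connection.getContentType());
        return new InputStreamReader(connection.getInputStream(), cs);
    }
}
```

In the question's code this would replace `new InputStreamReader(inputStream)`, so the decoding no longer depends on the platform default.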

Note: technically, JSON can be in any Unicode encoding (not just UTF-8). If you need to handle that, read this.
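One way to handle the other encodings is the heuristic described in RFC 4627, section 3: because the first two characters of a JSON text are always ASCII there, the pattern of zero bytes in the first four octets reveals the encoding. A rough sketch, assuming no byte order mark is present; `JsonEncodingSniffer` and `detect` are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the RFC 4627 null-byte heuristic.
final class JsonEncodingSniffer {

    // Guesses the encoding from the first four bytes of a JSON text.
    // Inputs shorter than four bytes are treated as UTF-8.
    static Charset detect(byte[] head) {
        if (head.length >= 4) {
            boolean z0 = head[0] == 0;
            boolean z1 = head[1] == 0;
            boolean z2 = head[2] == 0;
            boolean z3 = head[3] == 0;
            if (z0 && z1 && z2 && !z3) {
                return Charset.forName("UTF-32BE"); // 00 00 00 xx
            }
            if (!z0 && z1 && z2 && z3) {
                return Charset.forName("UTF-32LE"); // xx 00 00 00
            }
            if (z0 && !z1 && z2 && !z3) {
                return StandardCharsets.UTF_16BE;   // 00 xx 00 xx
            }
            if (!z0 && z1 && !z2 && z3) {
                return StandardCharsets.UTF_16LE;   // xx 00 xx 00
            }
        }
        return StandardCharsets.UTF_8;              // xx xx xx xx
    }
}
```

For a Jira REST response this is probably unnecessary, since UTF-8 is by far the common case; the newer RFC 8259 requires UTF-8 for JSON exchanged between systems that are not part of a closed ecosystem.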

Note 2: HttpURLConnection seems to have improved since Java 5, but I would make sure it does automatic length handling (reading the Content-Length header, handling chunked encoding, etc.).

Answer 1: 3, accepted, 2013-12-11 16:28:48
