
Parsing a Content-Type header in Java without validating the charset

Given an HTTP header like:

Content-Type: text/plain; charset=something

I'd like to extract the MIME type and charset using fully RFC-compliant parsing, but without "validating" the charset. By validating, I mean that I don't want to go through Java's internal Charset mechanism, because the charset may be unknown to Java while still having meaning for other applications. The following code does not work because it performs this validation:

import java.nio.charset.Charset;

import org.apache.http.entity.ContentType;

String header = "text/plain; charset=something";

// The charset parameter is resolved to a java.nio.charset.Charset here,
// which fails for names the JVM does not know.
ContentType contentType = ContentType.parse(header);
Charset contentTypeCharset = contentType.getCharset();

System.out.println(contentType.getMimeType());
System.out.println(contentTypeCharset == null ? null : contentTypeCharset.toString());

This throws java.nio.charset.UnsupportedCharsetException: something.
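The exception type belongs to the JDK's charset lookup, not to header parsing as such. Below is a minimal, dependency-free sketch of that lookup. It assumes (the post does not confirm this) that ContentType ends up calling Charset.forName on the parameter value. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical demo class, not part of any library.
final class CharsetLookupDemo {

    /** Returns the name the JVM rejected, or null if the charset is supported. */
    static String rejectedName(String name) {
        try {
            Charset.forName(name);   // the "validation" step the question wants to avoid
            return null;
        } catch (UnsupportedCharsetException e) {
            return e.getCharsetName();   // the raw name is still available here
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectedName("UTF-8"));      // supported on every JVM
        System.out.println(rejectedName("something"));  // legal name, but unknown to the JVM
    }
}
```

Note that a syntactically illegal name would raise IllegalCharsetNameException instead, which this sketch does not catch.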

To do the parsing, one can use the lower-level parsing classes:

import org.apache.http.HeaderElement;
import org.apache.http.NameValuePair;
import org.apache.http.message.BasicHeaderValueParser;

String header = "text/plain; charset=something";

// Passing null as the parser makes the library use its default instance.
HeaderElement headerElement = BasicHeaderValueParser.parseHeaderElement(header, null);
String mimeType = headerElement.getName();
String charset = null;
for (NameValuePair param : headerElement.getParameters()) {
    if (param.getName().equalsIgnoreCase("charset")) {
        String s = param.getValue();
        // Plain-Java blank check, so no extra StringUtils dependency is needed.
        if (s != null && !s.trim().isEmpty()) {
            charset = s;
        }
        break;
    }
}

System.out.println(mimeType);
System.out.println(charset);
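For comparison, the same extraction can be sketched without any library. This is a simplified illustration, not a replacement for the RFC-compliant Apache parser above: it handles quoted values and backslash escapes, but it does not validate token characters. The class name RawContentType and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: keeps the charset as a plain String and never touches java.nio.charset.
final class RawContentType {
    private final String mimeType;
    private final Map<String, String> parameters;

    private RawContentType(String mimeType, Map<String, String> parameters) {
        this.mimeType = mimeType;
        this.parameters = parameters;
    }

    String getMimeType() { return mimeType; }

    /** Raw charset parameter value, or null if absent or empty. */
    String getCharsetName() { return parameters.get("charset"); }

    static RawContentType parse(String header) {
        int n = header.length();
        int semi = header.indexOf(';');
        // The media type is everything before the first ';'.
        String mime = (semi < 0 ? header : header.substring(0, semi)).trim();
        Map<String, String> params = new LinkedHashMap<>();
        int pos = semi < 0 ? n : semi + 1;
        while (pos < n) {
            // Parameter name runs up to '=' (or ';' if there is no value).
            int start = pos;
            while (pos < n && header.charAt(pos) != '=' && header.charAt(pos) != ';') pos++;
            String name = header.substring(start, pos).trim().toLowerCase(Locale.ROOT);
            String value = null;
            if (pos < n && header.charAt(pos) == '=') {
                pos++;
                while (pos < n && Character.isWhitespace(header.charAt(pos))) pos++;
                if (pos < n && header.charAt(pos) == '"') {
                    // Quoted string: a backslash escapes the next character.
                    StringBuilder sb = new StringBuilder();
                    pos++;
                    while (pos < n && header.charAt(pos) != '"') {
                        if (header.charAt(pos) == '\\' && pos + 1 < n) pos++;
                        sb.append(header.charAt(pos++));
                    }
                    pos++; // step over the closing quote, if present
                    while (pos < n && header.charAt(pos) != ';') pos++;
                    value = sb.toString();
                } else {
                    // Unquoted value runs up to the next ';'.
                    int valueStart = pos;
                    while (pos < n && header.charAt(pos) != ';') pos++;
                    value = header.substring(valueStart, pos).trim();
                }
            }
            // The first occurrence of a parameter wins; empty values are ignored.
            if (!name.isEmpty() && value != null && !value.isEmpty()) {
                params.putIfAbsent(name, value);
            }
            pos++; // step over ';'
        }
        return new RawContentType(mime, params);
    }
}
```

Calling RawContentType.parse("text/plain; charset=something") yields the MIME type text/plain and the charset name something, with no exception, because the value is never resolved to a Charset.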

Alternatively, one can still use Apache's ContentType.parse and catch the UnsupportedCharsetException, extracting the charset name with getCharsetName():

import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;

import org.apache.http.entity.ContentType;

String header = "text/plain; charset=something";

String charsetName;
String mimeType;

try {
    ContentType contentType = ContentType.parse(header); // may throw UnsupportedCharsetException
    mimeType = contentType.getMimeType();
    Charset charset = contentType.getCharset();
    charsetName = charset != null ? charset.name() : null;
} catch (UnsupportedCharsetException e) {
    charsetName = e.getCharsetName(); // the unsupported charset name
    // On failure the MIME type has to be parsed separately; a ';' is present
    // here because the exception implies a charset parameter was given.
    mimeType = header.substring(0, header.indexOf(';')).trim();
}

The drawback is that the MIME type also has to be extracted differently when an UnsupportedCharsetException is thrown.

