UTF-8 text is garbled when form is posted as multipart/form-data
How to get text/xml as UTF-8 from a multipart/form-data request with RESTeasy?
Thanks for your answer, but using an InputStream instead of getBody(...) does not work either. The code below returns the same result as in the original post.
final InputStream inStream = fileUploadInput.getFormDataPart(searchedInput, InputStream.class, null);
// Read the raw bytes of the part
final byte[] inBytes = new byte[1024];
final ByteArrayOutputStream outBytes = new ByteArrayOutputStream(inBytes.length);
int length = 0;
while ((length = inStream.read(inBytes)) >= 0) {
    outBytes.write(inBytes, 0, length);
}
final byte[] rawInput = outBytes.toByteArray();
// Decode the same bytes with each candidate encoding
final String ascii = new String(rawInput, ASCII);
final String utf8 = new String(rawInput, UTF8);
final String isoLatin1 = new String(rawInput, ISO8859_1);
log.info("ASCII: " + ascii);
log.info("UTF8: " + utf8);
log.info("ISOLATIN1: " + isoLatin1);
return utf8;
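I want to upload a UTF-8 encoded XML file with the HTML form below and read it on the server using RESTEasy's MultipartFormDataInput and the Java code shown below. On the server side I seem to get the file content as ASCII, regardless of the actual encoding of the uploaded file (UTF-8), when I access it in the way described below. Every character outside the ASCII character set is replaced with "?". How can I get 'text/xml' as UTF-8 from a 'multipart/form-data' request with RESTEasy? (I know it is possible to write a PreProcessor interceptor and get the raw bytes there, but I cannot use that approach in my application.)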
Upload form:
<html>
<body>
<h1>JAX-RS Upload Form</h1>
<form action="http://.../upload" method="POST" enctype="multipart/form-data">
<p>Select a file : <input type="file" name="upload"/></p>
<input type="submit" value="Upload It" />
</form>
</body>
</html>
Resource class:
@Path("/upload")
@POST
@Consumes("multipart/form-data")
public Response createUploadTemplate(
        @Context HttpServletRequest req,
        MultipartFormDataInput formInput) {
    try {
        final String templateXml = getInput("upload", formInput);
        //...
    } catch (Exception e) {
        //...
    }
}
private static String getInput(final String searchedInput, final MultipartFormDataInput fileUploadInput) throws BadRequestException, IOException {
    try {
        final Map<String, List<InputPart>> inputToInputPart = fileUploadInput.getFormDataMap();
        if (inputToInputPart.containsKey(searchedInput)) {
            final StringBuilder builder = new StringBuilder();
            final List<InputPart> inputParts = inputToInputPart.get(searchedInput);
            for (InputPart inputPart : inputParts) {
                // The part is converted to a String by the framework here
                builder.append(inputPart.getBody(String.class, null));
            }
            return builder.toString();
        } else {
            throw new BadRequestException("The form sent with the request does not contain an input element " + searchedInput + ".");
        }
    } catch (Exception e) {
        throw new BadRequestException("The file upload failed.", e);
    }
}
Custom MessageBodyReader:
@Provider
@Consumes("text/xml")
public class XmlStringReader implements MessageBodyReader<String> {
    private static Logger log = LoggerFactory.getLogger(XmlStringReader.class);
    private static final String ASCII = "ASCII";
    private static final String ISO8859_1 = "ISO8859_1";
    private static final String UTF8 = "UTF8";

    @Override
    public boolean isReadable(final Class<?> type,
                              final Type genericType,
                              final Annotation[] annotations,
                              final MediaType mediaType) {
        boolean result = type.equals(String.class) && MediaType.TEXT_XML_TYPE.equals(mediaType);
        log.info(MessageFormat.format("{0} == String.class && MediaType.TEXT_XML_TYPE == {1}: {2}", type, mediaType, result));
        return result;
    }

    @Override
    public String readFrom(final Class<String> type,
                           final Type genericType,
                           final Annotation[] annotations,
                           final MediaType mediaType,
                           final MultivaluedMap<String, String> httpHeaders,
                           final InputStream entityStream) throws IOException, WebApplicationException {
        // Read the raw bytes of the entity
        final byte[] inBytes = new byte[1024];
        final ByteArrayOutputStream outBytes = new ByteArrayOutputStream(inBytes.length);
        int length = 0;
        while ((length = entityStream.read(inBytes)) >= 0) {
            outBytes.write(inBytes, 0, length);
        }
        final byte[] rawInput = outBytes.toByteArray();
        // Decode the same bytes with each candidate encoding for comparison
        final String ascii = new String(rawInput, ASCII);
        final String utf8 = new String(rawInput, UTF8);
        final String isoLatin1 = new String(rawInput, ISO8859_1);
        log.info("ASCII: " + ascii);
        log.info("UTF8: " + utf8);
        log.info("ISOLATIN1: " + isoLatin1);
        return utf8;
    }
}
If no charset is defined in the Content-Type header of the HTTP request, RESTEasy assumes "charset=US-ASCII". See org.jboss.resteasy.plugins.providers.multipart.InputPart:
/**
* If there is a content-type header without a charset parameter, charset=US-ASCII
* is assumed.
* <p>
* This can be overwritten by setting a different String value in
* {@link org.jboss.resteasy.spi.HttpRequest#setAttribute(String, Object)}
* with this ("resteasy.provider.multipart.inputpart.defaultCharset")
* String as key. It should be done in a
* {@link org.jboss.resteasy.spi.interception.PreProcessInterceptor}.
* </p>
*/
So, as a workaround, you can do the following:
@Provider
@ServerInterceptor
public class CharsetPreProcessInterceptor implements PreProcessInterceptor {
    @Override
    public ServerResponse preProcess(HttpRequest request, ResourceMethod method) throws Failure, WebApplicationException {
        request.setAttribute(InputPart.DEFAULT_CHARSET_PROPERTY, "charset=UTF-8");
        return null;
    }
}
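To see why the fallback charset matters, here is a minimal, framework-free sketch that decodes the same UTF-8 bytes once as US-ASCII and once as UTF-8. It uses only the JDK; the class name CharsetDecodeDemo and the sample XML are made up for illustration and are not part of RESTEasy.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of RESTEasy or the original post.
final class CharsetDecodeDemo {

    /** Decodes the bytes as US-ASCII, the fallback described in the Javadoc above. */
    static String decodeAsAscii(byte[] raw) {
        return new String(raw, StandardCharsets.US_ASCII);
    }

    /** Decodes the bytes as UTF-8, the encoding the file was actually saved in. */
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00FC" is the letter u with diaeresis; it takes two bytes in UTF-8.
        String original = "<name>M\u00FCller</name>";
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        System.out.println(decodeAsUtf8(raw).equals(original));  // prints true
        System.out.println(decodeAsAscii(raw).equals(original)); // prints false
    }
}
```

Once the bytes have been decoded as US-ASCII, every byte above 0x7F has already been replaced, so the original characters cannot be recovered from the resulting String. The charset has to be right at the point where bytes become text.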
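In general, I would not rely on the getBody method of InputPart. You can get each part as a raw input stream and read the data yourself, rather than relying on the framework to convert the content to a String.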
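A minimal sketch of the decoding step, assuming the stream you obtain for the part really delivers the raw uploaded bytes. It uses only the JDK; RawPartReader and readUtf8 are hypothetical names, and a ByteArrayInputStream stands in for the part's stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of RESTEasy or the original post.
final class RawPartReader {

    /** Reads the stream to the end and decodes all collected bytes as UTF-8. */
    static String readUtf8(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Decode once, after all bytes are collected, so a multi-byte
        // character is never split across two buffer reads.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = "<name>M\u00FCller</name>";
        InputStream part = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readUtf8(part).equals(xml)); // prints true
    }
}
```

The charset is named explicitly instead of being left to a framework or platform default, which is the point of reading the stream yourself.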