Non-ASCII Characters Not Displayed When Reading From GridFS

I am uploading files (of different content types) using the Apache Commons FileUpload API as follows:

FileItemFactory factory = getFileItemFactory(request.getContentLength());
ServletFileUpload uploader = new ServletFileUpload(factory);
uploader.setSizeMax(maxSize);
uploader.setProgressListener(listener);

List<FileItem> uploadedItems = uploader.parseRequest(request);

... saving files to GridFS using the following method:

public String saveFile(InputStream is, String contentType) throws UnknownHostException, MongoException {
    GridFSInputFile in = getFileService().createFile(is);
    in.setContentType(contentType);
    in.save();
    ObjectId key = (ObjectId) in.getId();
    return key.toStringMongod();
}

... calling saveFile() as follows:

saveFile(fileItem.getInputStream(), fileItem.getContentType())

and reading from GridFS using the following method:

public void writeFileTo(String key, HttpServletResponse resp) throws IOException {
    GridFSDBFile out = getFileService().findOne(new ObjectId(key));
    if (out == null) {
        throw new FileNotFoundException(key);
    }
    resp.setContentType(out.getContentType());
    out.writeTo(resp.getOutputStream());
}

My servlet code to download the file:

protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
    String uri = req.getRequestURI();

    String[] uriParts = uri.split("/");  // expecting "/content/[key]"

    // third part should be the key
    if (uriParts.length == 3) {
        try {
            resp.setDateHeader("Expires", System.currentTimeMillis() + (CACHE_AGE * 1000L));
            resp.setHeader("Cache-Control", "max-age=" + CACHE_AGE);
            resp.setCharacterEncoding("UTF-8");

            fileStorageService.writeFileTo(uriParts[2], resp);
        }
        catch (FileNotFoundException fnfe) {
            resp.sendError(HttpServletResponse.SC_NOT_FOUND);
        }
        catch (IOException ioe) {
            resp.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
        }
    }
    else {
        resp.sendError(HttpServletResponse.SC_BAD_REQUEST);
    }
}

However, all non-ASCII characters are displayed as '?' on a web page whose encoding is set to UTF-8 using:

<meta http-equiv="content-type" content="text/html; charset=UTF-8">

Any help would be greatly appreciated!

Apologies for taking your time! This was my mistake. There is nothing wrong with the code or GridFS. My test file's encoding was wrong.
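
The mistake described above can be reproduced without GridFS at all, since GridFS stores and returns the uploaded bytes unchanged. The following is only a minimal sketch, assuming the test file was saved in some non-UTF-8 encoding; ISO-8859-1 and US-ASCII are used purely as illustrations, and the class and method names are hypothetical:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original code.
class EncodingMismatchDemo {

    // Encodes the text the way an editor would save it to disk, then decodes
    // the same bytes the way a client told "charset=UTF-8" would read them.
    static String saveThenReadAsUtf8(String text, Charset fileEncoding) {
        byte[] stored = text.getBytes(fileEncoding); // bytes that would be uploaded
        return new String(stored, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café"
        System.out.println(saveThenReadAsUtf8(original, StandardCharsets.UTF_8));
        System.out.println(saveThenReadAsUtf8(original, StandardCharsets.ISO_8859_1));
        System.out.println(saveThenReadAsUtf8(original, StandardCharsets.US_ASCII));
    }
}
```

Only the first call returns the original text. Saving as US-ASCII turns the non-ASCII character into a literal '?' before the file is ever uploaded, so no server-side setting can recover it.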

resp.setContentType("text/html; charset=UTF-8");

Reason: only the content type and a binary InputStream are passed to saveFile(), so the character encoding is not recorded separately.
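
One way to apply this without hard-coding "text/html" for every file is to append the charset only when the stored content type is textual and does not already name one. This is a sketch under the assumption that all stored text files are UTF-8 encoded; ContentTypes and withCharset are hypothetical names, not part of the servlet or GridFS APIs:

```java
import java.util.Locale;

// Hypothetical helper; not part of the original code.
class ContentTypes {

    // Returns the content type with "; charset=..." appended when it is a
    // text/* type that does not already declare a charset. Other types
    // (images, PDFs, ...) and null are returned unchanged.
    static String withCharset(String contentType, String charset) {
        if (contentType == null) {
            return null;
        }
        String lower = contentType.toLowerCase(Locale.ROOT);
        boolean isText = lower.startsWith("text/");
        boolean hasCharset = lower.contains("charset=");
        if (isText && !hasCharset) {
            return contentType + "; charset=" + charset;
        }
        return contentType;
    }
}
```

In writeFileTo(), the call would then read resp.setContentType(ContentTypes.withCharset(out.getContentType(), "UTF-8")). Note that this only changes the header; the bytes are still written exactly as stored.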

public void writeFileTo(String key, HttpServletResponse resp) throws IOException {
    GridFSDBFile out = getFileService().findOne(new ObjectId(key));
    if (out == null) {
        throw new FileNotFoundException(key);
    }
    resp.setContentType(out.getContentType()); // This might be a conflict
    out.writeTo(resp.getOutputStream());
}
