
Change encoding of uploaded MultipartFile in Spring Boot

I have an endpoint which receives a MultipartFile.

Resource upload(@PathVariable Integer id, @RequestParam MultipartFile file) throws IOException {

The file is usually a .csv; I need to process every line and save the data.

But recently a user sent a file encoded as UTF-16 LE, which added a lot of strange characters to the data.

I'd like to accept a file in any encoding and always convert it to an encoding I can handle, for example UTF-8, before processing it.

How can I do this?

After a few tests and some searching, I found a solution.

To change a file's charset, I need to read it with its source charset and write it back with the target charset. To make this generic enough to accept any charset, I first need to identify the source charset.
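
As a minimal in-memory sketch of that idea (the class and method names here are hypothetical, and a real upload would be streamed rather than held in a byte array): decoding with the source charset and re-encoding with the target charset is all the conversion amounts to. It also shows where the strange characters come from: UTF-16 LE stores each ASCII character as two bytes, the second being 0x00, so decoding those bytes as UTF-8 leaves a NUL character after every letter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original solution.
final class RecodeSketch {

    private RecodeSketch() {
    }

    // Decodes the bytes with the source charset, then encodes the text with the target charset.
    static byte[] recode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);
        return text.getBytes(to);
    }

    // Decodes UTF-16 LE bytes as if they were UTF-8, which is the mistake behind the strange characters.
    static String misreadAsUtf8(byte[] utf16leBytes) {
        return new String(utf16leBytes, StandardCharsets.UTF_8);
    }
}
```

For example, `RecodeSketch.recode(bytes, StandardCharsets.UTF_16LE, StandardCharsets.UTF_8)` gives the UTF-8 bytes of the same text, whereas `misreadAsUtf8` on the UTF-16 LE bytes of "a,b" returns six characters instead of three.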

To do that, I added the juniversalchardet dependency, which provides the UniversalDetector class:

    <dependency>
        <groupId>com.github.albfernandez</groupId>
        <artifactId>juniversalchardet</artifactId>
        <version>2.3.1</version>
    </dependency>

Using it I could do this:

    // Returns the detected charset name (e.g. "UTF-16LE"), or null if detection fails.
    String encoding = UniversalDetector.detectCharset(file.getInputStream());
    if (encoding == null) {
        // throw exception
    }
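
One way to fill in that null branch, sketched with only the JDK: turn the detected name into a `java.nio.charset.Charset` up front, so that a missing or unsupported name fails before any conversion starts. `DetectedCharsets` and `resolve` are hypothetical names, and throwing `IllegalArgumentException` is my assumption; in a Spring controller you may prefer an exception that maps to a 400 response.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper for illustration; not part of juniversalchardet.
final class DetectedCharsets {

    private DetectedCharsets() {
    }

    // Converts a detected charset name into a Charset, rejecting null or unsupported names.
    static Charset resolve(String detectedName) {
        if (detectedName == null) {
            throw new IllegalArgumentException("Could not detect the charset of the uploaded file");
        }
        try {
            return Charset.forName(detectedName);
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            throw new IllegalArgumentException("Unsupported charset: " + detectedName, e);
        }
    }
}
```

The resulting `Charset` can be passed to `new InputStreamReader(source, charset)` instead of the charset name, which avoids a second name lookup when the reader is created.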

And the method to convert the file:

    import java.io.*;
    import java.nio.charset.StandardCharsets;

    // Reads the source using fromEncoding and writes it to target as ISO-8859-1.
    // Characters that ISO-8859-1 cannot represent are written as '?'.
    private static void encodeFileInLatinAlphabet(InputStream source, String fromEncoding, File target) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(source, fromEncoding));
             BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(target),
                     StandardCharsets.ISO_8859_1))) {
            char[] buffer = new char[16384];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        }
    }

This way I can receive a file in any charset and re-encode it in the desired one.

Note: In my case I always need the file in ISO_8859_1, which is why it is hard-coded in the method, but you could pass the target charset as a parameter.
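
A sketch of that parameterized variant, assuming you want to go stream to stream and choose the target (for example UTF-8, as in the question) at the call site. `CharsetTranscoder` and `transcode` are hypothetical names; the copy loop is the same as in the method above.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of the original solution.
final class CharsetTranscoder {

    private CharsetTranscoder() {
    }

    // Copies text from source to target, decoding with fromCharset and encoding with toCharset.
    // Both streams are closed when the copy finishes or fails.
    static void transcode(InputStream source, Charset fromCharset,
                          OutputStream target, Charset toCharset) throws IOException {
        try (Reader reader = new BufferedReader(new InputStreamReader(source, fromCharset));
             Writer writer = new BufferedWriter(new OutputStreamWriter(target, toCharset))) {
            char[] buffer = new char[16384];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        }
    }
}
```

A call could look like `CharsetTranscoder.transcode(file.getInputStream(), Charset.forName(encoding), new FileOutputStream(target), StandardCharsets.UTF_8)`. One thing to watch for: Java's UTF-16LE decoder keeps a leading byte-order mark as the character U+FEFF, so the first field of the first CSV line may need that character stripped.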
