FileOutputStream and UTF-8 File Download

I want to download files by using InputStream and FileOutputStream. My code looks like this:

URL obj = new URL(url);
HttpURLConnection con = (HttpURLConnection) obj.openConnection();

// optional default is GET
con.setRequestMethod("GET");

con.setRequestProperty("Cache-Control", "no-cache");


int responseCode = con.getResponseCode();
System.out.println("\nSending 'GET' request to URL : " + url);
System.out.println("Response Code : " + responseCode);
// try-with-resources closes both streams, even if an exception is thrown
try (InputStream inputStream = con.getInputStream();
     FileOutputStream outputStream = new FileOutputStream("C:\\programs\\TRYFILE.csv")) {

    int bytesRead;
    byte[] buffer = new byte[4096];
    // copy raw bytes; no character decoding happens here
    while ((bytesRead = inputStream.read(buffer)) != -1) {
        outputStream.write(buffer, 0, bytesRead);
    }

} catch (Exception e) {
    // exception is silently ignored
}

The code works well and downloads files. But if a file contains Turkish characters (ş, Ğ, Ç, İ, Ö, etc.), will this code download it with those characters intact? I want those characters, if present, to appear unchanged in the saved file.

So, does this code work well with UTF-8?

None of your code attempts to convert to characters; you're passing bytes through unchanged, so there is no need to worry about encoding. Your code will work fine.

It's only when you use Readers and Writers that you have to worry about encoding.
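To illustrate the point, here is a minimal sketch (not the asker's actual download, just an in-memory stand-in) showing that the byte-copy loop leaves UTF-8 data untouched, and that corruption only appears once the bytes are decoded with the wrong charset. The class name `ByteCopyDemo` and the helper `copy` are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ByteCopyDemo {

    // Same loop as in the question: bytes in, bytes out, no decoding.
    static byte[] copy(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "ş, Ğ, Ç, İ, Ö" written with Unicode escapes so the source file
        // encoding does not matter.
        String text = "\u015f, \u011e, \u00c7, \u0130, \u00d6";
        byte[] sent = text.getBytes(StandardCharsets.UTF_8);

        byte[] received = copy(new ByteArrayInputStream(sent));
        System.out.println(Arrays.equals(sent, received)); // true

        // Encoding only matters at the moment bytes are turned into chars.
        String good = new String(received, StandardCharsets.UTF_8);
        String bad = new String(received, StandardCharsets.ISO_8859_1);
        System.out.println(good.equals(text)); // true
        System.out.println(bad.equals(text));  // false
    }
}
```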

Assuming that con is an instance of URLConnection, its getInputStream() will provide you a direct network stream reading the bytes as sent by the server. No conversion will be made. Since you are transferring the bytes directly to the file, they are stored in the file without any modification.

Assuming that the server sent the files using the UTF-8 encoding and that the tool you use to open the file afterwards uses the UTF-8 encoding as well, you will see all characters correctly. The same applies to any other encoding, as long as the server and the tool use the same encoding. Your program does not add anything to it as it simply transfers bytes, not characters.
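If you want to know which encoding the server claims to use, one option is to look at the charset parameter of the Content-Type header. The sketch below is only an illustration: `DeclaredCharset` and `fromContentType` are hypothetical names, and the parsing is deliberately simple rather than a full header parser.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DeclaredCharset {

    // Hypothetical helper: pulls the charset parameter out of a Content-Type
    // value such as "text/csv; charset=UTF-8". Returns the fallback when the
    // parameter is missing or names a charset this JVM does not know.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1;
        System.out.println(fromContentType("text/csv; charset=UTF-8", fallback));
        System.out.println(fromContentType("text/csv", fallback));
    }
}
```

With the connection from the question, this could be called as `fromContentType(con.getContentType(), StandardCharsets.UTF_8)`. The result only tells you what the server declares; the download itself does not need it, since the bytes are copied as they are.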

By the way, such a transfer can be made much simpler using recent APIs:

try(ReadableByteChannel in=Channels.newChannel(con.getInputStream());
    FileChannel out=FileChannel.open(Paths.get("C:\\programs\\TRYFILE.csv"),
        StandardOpenOption.CREATE, StandardOpenOption.WRITE,
        StandardOpenOption.TRUNCATE_EXISTING)) {
    out.transferFrom(in, 0, Long.MAX_VALUE);
}

It gets even more readable when you use import static java.nio.file.StandardOpenOption.*;:

try(ReadableByteChannel in=Channels.newChannel(con.getInputStream());
    FileChannel out=FileChannel.open(Paths.get("C:\\programs\\TRYFILE.csv"),
                                     CREATE, WRITE, TRUNCATE_EXISTING)) {
    out.transferFrom(in, 0, Long.MAX_VALUE);
}
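As a further alternative (a different technique from the channel transfer above, not part of the original answer), `java.nio.file.Files.copy` accepts an `InputStream` directly. The sketch below uses an in-memory stream and a temporary file in place of the network connection and the real target path; `FilesCopyDemo` and `save` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

class FilesCopyDemo {

    // Copies every byte from the stream into target, replacing an existing
    // file, and returns the number of bytes written. No decoding takes place.
    static long save(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes a server would send.
        byte[] payload = "\u015f,\u011e,\u00c7".getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempFile("download", ".csv");

        long written = save(new ByteArrayInputStream(payload), target);

        System.out.println(written == payload.length);                         // true
        System.out.println(Arrays.equals(payload, Files.readAllBytes(target))); // true
        Files.delete(target);
    }
}
```

Note that `Files.copy` does not close the input stream, so with a real connection the stream should still be opened in a try-with-resources statement.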

If the file you are reading is encoded in UTF-8, your code will work fine. If it is not, you can convert it to UTF-8 with GNU iconv and then run your code. That should work.

Edit: If you want to write the data as UTF-8, you have to wrap the FileOutputStream in an OutputStreamWriter and pass the encoding when creating it.
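A rough sketch of that idea in Java: decode the incoming bytes with the source encoding through an InputStreamReader and re-encode them as UTF-8 through an OutputStreamWriter. The names `TranscodeDemo` and `transcode` are invented for this example, and ISO-8859-9 is only an assumed source encoding (a common legacy choice for Turkish text); in practice you would pass whatever encoding the data really uses, and a FileOutputStream instead of the in-memory stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TranscodeDemo {

    // Decodes the input using the given source charset and writes it out as
    // UTF-8. Closing the reader and writer also closes the wrapped streams.
    static void transcode(InputStream in, Charset source, OutputStream out) throws IOException {
        try (Reader reader = new InputStreamReader(in, source);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int charsRead;
            while ((charsRead = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, charsRead);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the data arrives as ISO-8859-9.
        Charset legacy = Charset.forName("ISO-8859-9");
        String text = "\u015f, \u011e, \u00c7, \u0130, \u00d6"; // ş, Ğ, Ç, İ, Ö

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(text.getBytes(legacy)), legacy, out);

        System.out.println(Arrays.equals(out.toByteArray(),
                                         text.getBytes(StandardCharsets.UTF_8))); // true
    }
}
```

This is only needed when the stored encoding should differ from what the server sent; for a plain download the byte copy is enough.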

