
Java fast stream copy with ISO-8859-1

I have the following code, which reads files in ISO-8859-1, as that's what this application requires:

private static String readFile(String filename) throws IOException {
    String lineSep = System.getProperty("line.separator");
    File f = new File(filename);
    StringBuffer sb = new StringBuffer();
    if (f.exists()) {
        BufferedReader br =
            new BufferedReader(
                new InputStreamReader(
                    new FileInputStream(filename), "ISO-8859-1"));

        String nextLine = "";
        while ((nextLine = br.readLine()) != null) {
            sb.append(nextLine + " ");
            // note:  BufferedReader strips the EOL character.
            // sb.append(lineSep);
        }
        br.close();
    }

    return sb.toString();
}

The problem is that it is pretty slow. I have this function, which is MUCH faster, but I cannot figure out where to specify the character encoding:

private static String fastStreamCopy(String filename)
{
    String s = "";
    FileChannel fc = null;
    try
    {
        fc = new FileInputStream(filename).getChannel();

        MappedByteBuffer byteBuffer = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());

        int size = byteBuffer.capacity();
        if (size > 0)
        {
            byteBuffer.clear();
            byte[] bytes = new byte[size];
            byteBuffer.get(bytes, 0, bytes.length);
            s = new String(bytes);
        }

        fc.close();
    }
    catch (FileNotFoundException fnfx)
    {
        System.out.println("File not found: " + fnfx);
    }
    catch (IOException iox)
    {
        System.out.println("I/O problems: " + iox);
    }
    finally
    {
        if (fc != null)
        {
            try
            {
                fc.close();
            }
            catch (IOException ignore)
            {
            }
        }
    }
    return s;
}

Does anyone have an idea of where I should be putting the ISO encoding?

From the code you posted, you're not trying to "copy" the stream, but read it into a string.

You can simply provide the encoding in the String constructor:

s = new String(bytes, "ISO-8859-1");
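To show where that constructor call fits, here is a minimal sketch of the memory-mapped read with the charset applied. It assumes Java 7 or later (for try-with-resources and StandardCharsets) and a file smaller than 2 GB; the class and method names are made up for illustration and are not from the question.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, named here only for illustration.
class Latin1MappedReader {

    // Maps the file read-only, copies its bytes, and decodes them as ISO-8859-1.
    static String readLatin1(String filename) throws IOException {
        // try-with-resources closes the channel exactly once, even on failure.
        try (FileChannel fc = FileChannel.open(Paths.get(filename), StandardOpenOption.READ)) {
            long size = fc.size();
            if (size == 0) {
                return ""; // nothing to map or decode
            }
            // Assumption: the file fits in one mapping (at most Integer.MAX_VALUE bytes).
            MappedByteBuffer buffer = fc.map(FileChannel.MapMode.READ_ONLY, 0, size);
            byte[] bytes = new byte[buffer.remaining()];
            buffer.get(bytes);
            // The encoding goes here: decode explicitly instead of using the platform default.
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }
}
```

Passing the Charset constant rather than the string name also avoids the checked UnsupportedEncodingException and any chance of mistyping the charset name.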

Personally I'd just replace the whole method with a call to the Guava method Files.toString():

String content = Files.toString(new File(filename), StandardCharsets.ISO_8859_1);

If you're using Java 6 or earlier, you'll need to use the Guava field Charsets.ISO_8859_1 instead of StandardCharsets.ISO_8859_1 (which was only introduced in Java 7).
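If you would rather not add Guava, a JDK-only equivalent can be sketched as follows. This is a different technique from the Guava call above: it assumes Java 7 or later and uses java.nio.file.Files.readAllBytes, and the class and method names are hypothetical. As far as I know, recent Guava releases also deprecate Files.toString in favor of Files.asCharSource(file, charset).read(), so check the version you are on.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, named here only for illustration.
class Latin1FileReader {

    // Reads the whole file into memory and decodes it as ISO-8859-1.
    // Suitable for files that comfortably fit in the heap.
    static String readLatin1(Path path) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

A call would look like Latin1FileReader.readLatin1(Paths.get(filename)).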

However, your use of the term "copy" suggests that you want to write the result to some other file (or stream). If that is true, then you don't need to care about the encoding at all, since you can just handle the byte[] directly and avoid the (unnecessary) conversion to and from String.
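For that copy case, a rough sketch of moving the raw bytes without ever decoding them might look like this. The class name, method name, and the 8192-byte buffer size are arbitrary choices for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, named here only for illustration.
class RawByteCopy {

    // Copies every byte from in to out unchanged and returns the number of bytes copied.
    // No charset is involved because the bytes are never turned into characters.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // arbitrary buffer size
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

Because the bytes pass through untouched, the output is identical to the input whatever encoding the file uses. For plain file-to-file copies on Java 7 or later, java.nio.file.Files.copy does the same job in one call.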

Specify the encoding wherever you convert bytes to a String, e.g. s = new String(bytes, encoding);, or vice versa.
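Both directions of that conversion can be sketched as below, assuming Java 7 or later for StandardCharsets. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named here only for illustration.
class Latin1RoundTrip {

    // String to bytes: each character up to U+00FF becomes exactly one byte.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Bytes to String: each byte becomes the character with the same code point.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

For example, encode("caf\u00e9") yields four bytes ending in 0xE9, whereas the same text encoded as UTF-8 takes five bytes. That mismatch is why relying on the platform default charset can silently corrupt accented characters.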

