Java Wget Bz2 file

Question

I'm trying to fetch some bz2 files from Wikipedia. I don't care whether they are saved as bz2 or decompressed, since I can decompress them locally.

When I call:

public static void getZip(String theUrl, String filename) throws IOException {
    URL gotoUrl = new URL(theUrl);
    try (InputStreamReader isr = new InputStreamReader(new BZip2CompressorInputStream(gotoUrl.openStream())); BufferedReader in = new BufferedReader(isr)) {
        StringBuffer sb = new StringBuffer();
        String inputLine;

        // grab the contents at the URL
        while ((inputLine = in.readLine()) != null) {
            sb.append(inputLine + "\r\n");
        }
        // write it locally
        Wget.createAFile(filename, sb.toString());
    } catch (MalformedURLException mue) {
        mue.printStackTrace();
    } catch (IOException ioe) {
        throw ioe;
    }
}

I get only part of the decompressed file, never more than roughly 883K.
When I don't use BZip2CompressorInputStream, for example:

public static void get(String theUrl, String filename) throws IOException {
    try {
        URL gotoUrl = new URL(theUrl);
        InputStreamReader isr = new InputStreamReader(gotoUrl.openStream());
        BufferedReader in = new BufferedReader(isr);

        StringBuffer sb = new StringBuffer();
        String inputLine;

        // grab the contents at the URL
        while ((inputLine = in.readLine()) != null) {
            sb.append(inputLine);// + "\r\n");
        }
        // write it locally
        Statics.writeOut(filename, false, sb.toString());
    } catch (MalformedURLException mue) {
        mue.printStackTrace();
    } catch (IOException ioe) {
        throw ioe;
    }
}

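I get a file whose size matches the expected size (compared in KB, not in bytes). But I also get a message saying the compressed file is corrupted. The same happens when I use byte[] instead of readLine():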

public static void getBytes(String theUrl, String filename) throws IOException {
    try {
        char [] cc = new char[1024];
        URL gotoUrl = new URL(theUrl);
        InputStreamReader isr = new InputStreamReader(gotoUrl.openStream());
        BufferedReader in = new BufferedReader(isr);

        StringBuffer sb = new StringBuffer();
        // grab the contents at the URL
        int n = 0;
        while (-1 != (n = in.read(cc))) {
            sb.append(cc);// + "\r\n");
        }
        // write it locally
        Statics.writeOut(filename, false, sb.toString());
    } catch (MalformedURLException mue) {
        mue.printStackTrace();
    } catch (IOException ioe) {
        throw ioe;
    }
}

Finally, when I bzip2 both the input stream and the output stream, I get a valid bzip2 file, but its size is similar to the first attempt, using:

public static void getWriteForBZ2File(String urlIn, final String filename) throws CompressorException, IOException {
    URL gotoUrl = new URL(urlIn);
    try (final FileOutputStream out = new FileOutputStream(filename);
            final BZip2CompressorOutputStream dataOutputStream = new BZip2CompressorOutputStream(out);
            final BufferedInputStream bis = new BufferedInputStream(gotoUrl.openStream());
            final CompressorInputStream input = new CompressorStreamFactory().createCompressorInputStream(bis);
            final BufferedReader br2 = new BufferedReader(new InputStreamReader(input))) {
        String line = null;
        while ((line = br2.readLine()) != null) {
            dataOutputStream.write(line.getBytes());
        }
    }
}

So, how do I get the whole bz2 file, either in bz2 format or decompressed?

Answer 1

A bz2 file contains bytes, not characters. You can't read it with a Reader as if it contained characters.
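To see why the Reader-based attempts produce a corrupted file, here is a minimal sketch that mimics what an InputStreamReader plus String pipeline does to binary data: it decodes the bytes into characters and encodes them back. The class name `ReaderCorruptionDemo`, the sample bytes, and the choice of UTF-8 are assumptions made for illustration (the question's code uses the platform default charset). On top of this, `readLine()` also strips line terminators, which loses further bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original question or answer.
final class ReaderCorruptionDemo {

    // Decode bytes to a String and encode them back, as a Reader/String pipeline would.
    // UTF-8 is an assumption here; the question's code relies on the default charset.
    static byte[] roundTripThroughString(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A few ASCII bytes followed by two bytes that are not valid UTF-8,
        // standing in for arbitrary compressed data.
        byte[] raw = {'B', 'Z', 'h', '9', (byte) 0xFF, (byte) 0x80};
        byte[] back = roundTripThroughString(raw);

        System.out.println("original length:   " + raw.length);
        System.out.println("round-trip length: " + back.length);
        System.out.println("identical:         " + Arrays.equals(raw, back));
    }
}
```

Bytes that do not form valid characters are replaced during decoding, so the data written back out no longer matches what was downloaded.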

Since all you want to do is download the file and save it locally, all you need is

Files.copy(gotoUrl.openStream(), Paths.get(fileName));
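The one-liner above can be wrapped into a small self-contained helper. This is only a sketch under a few assumptions: the class name `Bz2Download`, the method name `download`, and the use of `REPLACE_EXISTING` are illustrative choices rather than anything from the original answer. It closes the stream with try-with-resources and never goes through a Reader, so the bytes are copied unchanged.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not part of the original answer.
final class Bz2Download {

    // Copies the raw bytes served at url into target and returns the number of bytes written.
    // REPLACE_EXISTING is an assumption: without it, Files.copy fails if target already exists.
    static long download(URL url, Path target) throws IOException {
        try (InputStream in = url.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Bz2Download <url> <output-file>");
            return;
        }
        long written = download(URI.create(args[0]).toURL(), Paths.get(args[1]));
        System.out.println("wrote " + written + " bytes to " + args[1]);
    }
}
```

The saved file is still bz2-compressed; it can then be decompressed locally, as the question says is acceptable.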

Solution 1
2 Accepted 2015-08-15 12:11:08
