Java Wget Bz2 file

I'm trying to download some bz2 files from Wikipedia. I don't care whether they are saved as bz2 or unpacked, since I can unzip them locally.

When I call:

public static void getZip(String theUrl, String filename) throws IOException {
    URL gotoUrl = new URL(theUrl);
    try (InputStreamReader isr = new InputStreamReader(new BZip2CompressorInputStream(gotoUrl.openStream())); BufferedReader in = new BufferedReader(isr)) {
        StringBuffer sb = new StringBuffer();
        String inputLine;

        // grab the contents at the URL
        while ((inputLine = in.readLine()) != null) {
            sb.append(inputLine + "\r\n");
        }
        // write it locally
        Wget.createAFile(filename, sb.toString());
    } catch (MalformedURLException mue) {
        mue.printStackTrace();
    } catch (IOException ioe) {
        throw ioe;
    }
}

I get only part of the unzipped file, never more than about 883K.
When I don't use the BZip2CompressorInputStream, like:

public static void get(String theUrl, String filename) throws IOException {
    try {
        URL gotoUrl = new URL(theUrl);
        InputStreamReader isr = new InputStreamReader(gotoUrl.openStream());
        BufferedReader in = new BufferedReader(isr);

        StringBuffer sb = new StringBuffer();
        String inputLine;

        // grab the contents at the URL
        while ((inputLine = in.readLine()) != null) {
            sb.append(inputLine);// + "\r\n");
        }
        // write it locally
        Statics.writeOut(filename, false, sb.toString());
    } catch (MalformedURLException mue) {
        mue.printStackTrace();
    } catch (IOException ioe) {
        throw ioe;
    }
}

I get a file whose size matches what it is supposed to be (compared in KB, not in bytes), but I also get a message that the zipped file is damaged. The same happens when using byte[] instead of readLine(), like:

public static void getBytes(String theUrl, String filename) throws IOException {
    try {
        char [] cc = new char[1024];
        URL gotoUrl = new URL(theUrl);
        InputStreamReader isr = new InputStreamReader(gotoUrl.openStream());
        BufferedReader in = new BufferedReader(isr);

        StringBuffer sb = new StringBuffer();
        // grab the contents at the URL
        int n = 0;
        while (-1 != (n = in.read(cc))) {
            sb.append(cc);// + "\r\n");
        }
        // write it locally
        Statics.writeOut(filename, false, sb.toString());
    } catch (MalformedURLException mue) {
        mue.printStackTrace();
    } catch (IOException ioe) {
        throw ioe;
    }
}

Finally, when I bzip2 both the inputstream and the outputstream, I get a valid bzip2 file, but its size is like that of the first attempt, using:

public static void getWriteForBZ2File(String urlIn, final String filename) throws CompressorException, IOException {
    URL gotoUrl = new URL(urlIn);
    try (final FileOutputStream out = new FileOutputStream(filename);
            final BZip2CompressorOutputStream dataOutputStream = new BZip2CompressorOutputStream(out);
            final BufferedInputStream bis = new BufferedInputStream(gotoUrl.openStream());
            final CompressorInputStream input = new CompressorStreamFactory().createCompressorInputStream(bis);
            final BufferedReader br2 = new BufferedReader(new InputStreamReader(input))) {
        String line = null;
        while ((line = br2.readLine()) != null) {
            dataOutputStream.write(line.getBytes());
        }
    }
}

So, how do I get the entire bz2 file, in either bz2 format or unzipped?

A bz2 file contains bytes, not characters. You can't read it with a Reader as if it contained characters.
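To illustrate the point, here is a minimal sketch of what happens when binary data passes through a character decoder, which is roughly what an InputStreamReader followed by a text writer does. The class name, the method name, and the sample bytes are made up for this illustration, and UTF-8 is assumed as the charset; the platform default may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ReaderCorruptionDemo {
    // Decodes raw bytes as UTF-8 text and encodes the text back to bytes,
    // mimicking a Reader on the way in and a text writer on the way out.
    static byte[] roundTripThroughText(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Illustrative bytes: a bz2-style header followed by two bytes
        // (0xFF, 0x80) that are not valid UTF-8 on their own.
        byte[] raw = {'B', 'Z', 'h', '9', 0x31, 0x41, 0x59, 0x26, 0x53, 0x59,
                      (byte) 0xFF, (byte) 0x80};
        byte[] back = roundTripThroughText(raw);
        // Invalid sequences are replaced during decoding, so the result differs.
        System.out.println("identical after round trip: " + Arrays.equals(raw, back));
        System.out.println("lengths: " + raw.length + " -> " + back.length);
    }
}
```

Bytes that do not form valid characters are silently replaced during decoding, so the saved file can end up about the right size and still be damaged.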

Since all you want to do is download the file and save it locally, all you need is

Files.copy(gotoUrl.openStream(), Paths.get(fileName));
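Put into a complete method, that one-liner might look like the sketch below. The class name Bz2Download and the method name download are hypothetical; closing the stream with try-with-resources and overwriting an existing file via REPLACE_EXISTING are assumptions added here, not part of the original answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

final class Bz2Download {
    // Copies the raw bytes at theUrl into filename without ever decoding
    // them as text. Returns the number of bytes written.
    static long download(String theUrl, String filename) throws IOException {
        URL gotoUrl = new URL(theUrl);
        Path target = Paths.get(filename);
        // try-with-resources closes the network stream even if the copy fails.
        try (InputStream in = gotoUrl.openStream()) {
            // Assumption: an existing target file should be overwritten.
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: Bz2Download <url> <output-file>");
            return;
        }
        long bytes = download(args[0], args[1]);
        System.out.println("wrote " + bytes + " bytes to " + args[1]);
    }
}
```

The file is saved exactly as served, still bz2-compressed, and can then be unpacked locally as described in the question.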
