
Java: get the filename of a download from a redirected 'friendly' URL

I'm trying to download a file from a given URL, which may or may not be a direct link to the file. Does anyone know how I can detect the filename to write to if the URL is an indirect link (e.g. http://www.example.com/download.php?getFile=1)? If the URL is a direct link, it is no problem to extract the filename from the URL and write to it, but with a redirect link the only method I have found so far is to write to an arbitrary filename (foo.txt) and then try to work with that. The problem is that I really need the filename (and extension) to be correct. A sample of the code I am using is below (the section in the 'else' clause is neither finished nor working):

import java.io.FileOutputStream;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static boolean dlFile(String URL, String dest){
    try{
        URL grab = new URL(URL);
        ReadableByteChannel rbc = Channels.newChannel(grab.openStream());
        // Last path segment of the URL, if it looks like a plain filename
        String fnRE = ".*/([a-zA-Z0-9\\-\\._]+)$";
        Pattern pattern = Pattern.compile(fnRE);
        Matcher matcher = pattern.matcher(URL);
        String fName = "";
        if(matcher.find()) fName = matcher.group(1);
        else { //filename cannot be extracted - do something here - below doesn't work, raises MalformedURLException
            URL foo = new URL(URL);
            HttpURLConnection fooConnection = (HttpURLConnection) foo.openConnection();
            URL secondFoo = new URL(fooConnection.getHeaderField("Location"));
            System.out.println("Redirect URL: "+secondFoo);
            fooConnection.setInstanceFollowRedirects(false);
            URLConnection fooURL = secondFoo.openConnection();
        }
        System.out.println("Connection to "+URL+" established!");
        if(!dest.endsWith("/")) dest+="/";
        System.out.println("Writing "+fName+" to "+dest);
        FileOutputStream fos = new FileOutputStream(dest+fName);
        fos.getChannel().transferFrom(rbc, 0, 1 << 24); // copies at most 16 MiB
        fos.close();
        rbc.close();
        return true;
    } catch (IOException e) {
        e.printStackTrace();
        return false;
    }
}

I am sure there must be a simple way to get the filename from the headers or something like that, but I cannot work out how to get it. Thanks in advance.

Assuming the response has a "Location" header field, I was able to obtain the direct link from a URL that goes through multiple redirects like this:

String location = "http://www.example.com/download.php?getFile=1";
HttpURLConnection connection = null;
for (;;) {
    URL url = new URL(location);
    connection = (HttpURLConnection) url.openConnection();
    // Must be set before the request is sent, otherwise redirects are followed automatically
    connection.setInstanceFollowRedirects(false);
    String redirectLocation = connection.getHeaderField("Location");
    if (redirectLocation == null) break;
    // Resolve against the current URL in case the Location header is relative
    location = new URL(url, redirectLocation).toString();
}
//and finally:
String fileName = location.substring(location.lastIndexOf('/') + 1);
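
One caveat with that last line: if the final URL still carries a query string or fragment, everything after the last '/' includes it. A minimal sketch of a helper that looks only at the path part is below. The class and method names (UrlFileName, fileNameFromUrl) are made up for illustration, and it assumes the URL is well-formed enough for java.net.URI to parse.

```java
import java.net.URI;

class UrlFileName {
    /**
     * Returns the last path segment of the URL, ignoring any query string or
     * fragment. Returns an empty string if the URL has no path or the path
     * ends with '/'.
     */
    static String fileNameFromUrl(String url) {
        // URI.create throws IllegalArgumentException for malformed input
        String path = URI.create(url).getPath(); // path only, no "?query" or "#fragment"
        if (path == null) return "";
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // prints "report.pdf"
        System.out.println(fileNameFromUrl("http://www.example.com/files/report.pdf?token=abc"));
        // prints "download.php"
        System.out.println(fileNameFromUrl("http://www.example.com/download.php?getFile=1"));
    }
}
```

Note that the second call still yields download.php, which is why a URL-based name only helps when the redirect chain ends at a direct file link.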

I think it's better to use the Java Jsoup library and then use the method below:

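import java.io.FileOutputStream;
import java.io.IOException;
import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;

public static void downloadFileJsoup(String URL, String PATH) throws IOException {
    Response res = Jsoup.connect(URL)
            .userAgent("Mozilla")
            .timeout(30000)
            .followRedirects(true)
            .ignoreContentType(true)
            .maxBodySize(20000000) // increase if the download is larger than 20 MB
            .execute();
    String disposition = res.header("Content-Disposition");
    if (disposition == null) { // not every server sends this header
        throw new IOException("No Content-Disposition header for " + URL);
    }
    // Pull the value of filename=... (with or without quotes) out of the header
    String remoteFilename = disposition.replaceFirst("(?i)^.*filename=\"?([^\"]+)\"?.*$", "$1");
    String filename = PATH + remoteFilename;
    try (FileOutputStream out = new FileOutputStream(filename)) {
        out.write(res.bodyAsBytes());
    }
}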
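
If you would rather not pull in Jsoup, the same header can be parsed with the JDK alone. The following is only a sketch: the class and method names (ContentDispositionParser, fileNameFrom) are hypothetical, and it handles just the plain filename= parameter, quoted or unquoted. The extended filename*= form from RFC 6266 is deliberately ignored here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContentDispositionParser {
    // Matches filename="quoted name" or filename=token, case-insensitively.
    private static final Pattern FILENAME = Pattern.compile(
            "filename\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))", Pattern.CASE_INSENSITIVE);

    /** Returns the filename parameter, or null if the header is missing or has none. */
    static String fileNameFrom(String contentDisposition) {
        if (contentDisposition == null) return null;
        Matcher m = FILENAME.matcher(contentDisposition);
        if (!m.find()) return null;
        // Group 1 is the quoted form, group 2 the unquoted token
        String name = m.group(1) != null ? m.group(1) : m.group(2);
        // Keep only the last path segment so the header cannot point outside the target directory
        name = name.substring(Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\')) + 1);
        return name.isEmpty() ? null : name;
    }

    public static void main(String[] args) {
        // prints "report.pdf"
        System.out.println(fileNameFrom("attachment; filename=\"report.pdf\""));
        // prints "data.csv"
        System.out.println(fileNameFrom("attachment; filename=data.csv; size=10"));
    }
}
```

With HttpURLConnection you would pass in connection.getHeaderField("Content-Disposition") and fall back to a name derived from the URL when the result is null.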

No, not in general. The response does not normally contain that information, unless you control the server and add it to the response yourself.

In any case, you also asked about the file extension. The Content-Type header of the response may be enough to determine that.
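
To illustrate the Content-Type idea, here is a rough sketch that maps a header value to an extension. The lookup table is a small hand-picked sample rather than a complete registry, the names (ContentTypeExtensions, extensionFor) are invented for this example, and Map.of requires Java 9 or later.

```java
import java.util.Locale;
import java.util.Map;

class ContentTypeExtensions {
    // Small illustrative table; extend it with the types you expect to download.
    private static final Map<String, String> EXTENSIONS = Map.of(
            "application/pdf", ".pdf",
            "application/zip", ".zip",
            "application/json", ".json",
            "text/plain", ".txt",
            "text/html", ".html",
            "image/png", ".png",
            "image/jpeg", ".jpg");

    /** Returns the extension for a Content-Type value, or the fallback if it is unknown. */
    static String extensionFor(String contentType, String fallback) {
        if (contentType == null) return fallback;
        // Drop parameters such as "; charset=UTF-8" and normalise case
        String mime = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return EXTENSIONS.getOrDefault(mime, fallback);
    }

    public static void main(String[] args) {
        // prints ".html"
        System.out.println(extensionFor("text/html; charset=UTF-8", ".bin"));
        // prints ".bin"
        System.out.println(extensionFor("application/x-unknown", ".bin"));
    }
}
```

The value to pass in would come from connection.getContentType() on the final, non-redirect response. Servers that answer with a generic type such as application/octet-stream will end up with the fallback.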
