簡體   English   中英

Java下載目錄中的所有文件和文件夾

[英]Java download all files and folders in a directory

我正在嘗試從該目錄下載所有文件。 但是,我只能將其下載為一個文件。 我能做什么? 我嘗試搜索此問題,結果令人困惑,人們開始建議改用httpclients。 感謝您的幫助,到目前為止,這是我的代碼。 已經建議我使用輸入流來獲取目錄中的所有文件。 那會進入數組嗎? 我在http://docs.oracle.com/javase/tutorial/networking/urls/上嘗試了該教程,但並沒有幫助我理解。

//ProgressBar/Install
            String URL_LOCATION = "http://www.futureretrogaming.tk/gamefiles/ProfessorPhys/";
            String LOCAL_FILE = filelocation.getText() + "\\ProfessorPhys\\";
            try {
                java.net.URL url = new URL(URL_LOCATION);
                HttpURLConnection connection = (HttpURLConnection) url.openConnection(); 
                connection.addRequestProperty("User-Agent", "Mozilla/4.76"); 
                //URLConnection connection = url.openConnection();
                BufferedInputStream stream = new BufferedInputStream(connection.getInputStream());
                int available = stream.available();
                byte b[]= new byte[available];
                stream.read(b);
                File file = new File(LOCAL_FILE);
                OutputStream out  = new FileOutputStream(file);
                out.write(b);
            } catch (Exception e) {
                System.err.println(e);
            }

我還發現此代碼將返回要下載的文件列表。 有人可以幫助我將這兩個代碼結合嗎?

public class GetAllFilesInDirectory {

public static void main(String[] args) throws IOException {

    File dir = new File("dir");

    System.out.println("Getting all files in " + dir.getCanonicalPath() + " including those in subdirectories");
    List<File> files = (List<File>) FileUtils.listFiles(dir, TrueFileFilter.INSTANCE, TrueFileFilter.INSTANCE);
    for (File file : files) {
        System.out.println("file: " + file.getCanonicalPath());
    }

}

}

您需要下載頁面,這是目錄列表,對其進行解析,然后下載該頁面中鏈接的單個文件...

你可以做類似...

URL url = new URL("http:www.futureretrogaming.tk/gamefiles/ProfessorPhys");
InputStream is = null;
try {
    is = url.openStream();
    byte[] buffer = new byte[1024];
    int bytesRead = -1;
    StringBuilder page = new StringBuilder(1024);
    while ((bytesRead = is.read(buffer)) != -1) {
        page.append(new String(buffer, 0, bytesRead));
    }
    // Spend the rest of your life using String methods
    // to parse the result...
} catch (IOException ex) {
    ex.printStackTrace();
} finally {
    try {
        is.close();
    } catch (Exception e) {
    }
}

或者,您可以下載Jsoup並使用它來完成所有艱苦的工作...

try {
    Document doc = Jsoup.connect("http:www.futureretrogaming.tk/gamefiles/ProfessorPhys").get();
    Elements links = doc.getElementsByTag("a");
    for (Element link : links) {
        System.out.println(link.attr("href") + " - " + link.text());
    }
} catch (IOException ex) {
    ex.printStackTrace();
}

哪個輸出...

?C=N;O=D - Name
?C=M;O=A - Last modified
?C=S;O=A - Size
?C=D;O=A - Description
/gamefiles/ - Parent Directory
Assembly-CSharp-Editor-firstpass-vs.csproj - Assembly-CSharp-Edit..>
Assembly-CSharp-Editor-firstpass.csproj - Assembly-CSharp-Edit..>
Assembly-CSharp-Editor-firstpass.pidb - Assembly-CSharp-Edit..>
Assembly-CSharp-firstpass-vs.csproj - Assembly-CSharp-firs..>
Assembly-CSharp-firstpass.csproj - Assembly-CSharp-firs..>
Assembly-CSharp-firstpass.pidb - Assembly-CSharp-firs..>
Assembly-CSharp-vs.csproj - Assembly-CSharp-vs.c..>
Assembly-CSharp.csproj - Assembly-CSharp.csproj
Assembly-CSharp.pidb - Assembly-CSharp.pidb
Assembly-UnityScript-Editor-firstpass-vs.unityproj - Assembly-UnityScript..>
Assembly-UnityScript-Editor-firstpass.pidb - Assembly-UnityScript..>
Assembly-UnityScript-Editor-firstpass.unityproj - Assembly-UnityScript..>
Assembly-UnityScript-firstpass-vs.unityproj - Assembly-UnityScript..>
Assembly-UnityScript-firstpass.pidb - Assembly-UnityScript..>
Assembly-UnityScript-firstpass.unityproj - Assembly-UnityScript..>
Assembly-UnityScript-vs.unityproj - Assembly-UnityScript..>
Assembly-UnityScript.pidb - Assembly-UnityScript..>
Assembly-UnityScript.unityproj - Assembly-UnityScript..>
Assets/ - Assets/
Library/ - Library/
Professor%20Phys-csharp.sln - Professor Phys-cshar..>
Professor%20Phys.exe - Professor Phys.exe
Professor%20Phys.sln - Professor Phys.sln
Professor%20Phys.userprefs - Professor Phys.userp..>
Professor%20Phys_Data/ - Professor Phys_Data/
Script.doc - Script.doc
~$Script.doc - ~$Script.doc
~WRL0392.tmp - ~WRL0392.tmp
~WRL1966.tmp - ~WRL1966.tmp

然后,您需要為每個文件建立一個新的URL並按已完成的方式閱讀...

例如, Assembly-CSharp-Edit..>hrefAssembly-CSharp-Editor-firstpass-vs.csproj ,它出現在相對鏈接中,因此您需要在其Assembly-CSharp-Editor-firstpass-vs.csproj加上http://www.futureretrogaming.tk/gamefiles/ProfessorPhys創建新的URL ,即http://www.futureretrogaming.tk/gamefiles/ProfessorPhys/Assembly-CSharp-Editor-firstpass-vs.csproj

您需要對要抓取的每個元素執行此操作

您是否考慮過HTTrack之類的工具,它可以檢測HTML上錨標記的存在並下載整個網站(受樹級別限制)。 您還可以指定過濾器應下載哪些文件等

如果這不符合您的要求,則您仍然可以使用手寫Java程序,除非問題是獲取URL中的文件列表(及其中的所有子文件夾)。 您需要解析HTML,收集所有的定位標記,然后遍歷它(這是HTTrack所做的)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM