
Java - Get all files in directories and subdirectories recursively

There is a twist here. What I am working with is not a local directory. I am trying to make a batch downloader for these links:

https://files.secureserver.net/0fHCh0CLd6Az63
https://files.secureserver.net/0fdAWETp4sONW5

Here each file and folder is assigned a unique id.

- My goal is to get all file ids from all directories and subdirectories.

I have managed to download files and view the contents of a folder if I have its id.

So the problem is that when I get the contents of a directory, they contain subdirectories.

So how can I get all file ids in these directories and subdirectories recursively?
Can this be solved with a tree data structure, or is there a simpler method?

Here is my code:



    package javaapplication1;

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.TreeSet;
    import org.jsoup.Connection;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class GoDaddyDownloader2 {

        Document document;
        String openFolderUrl;
        String downloadFileUrl;
        String frameSrc;
        TreeSet folderTreeSet;
        TreeSet fileTreeSet;
        StringBuilder fileId;
        StringBuilder folderId;

        public GoDaddyDownloader2() {
            openFolderUrl = "";
            downloadFileUrl = "";
            frameSrc = "";
            folderTreeSet = new TreeSet();
            fileTreeSet = new TreeSet();
            fileId = new StringBuilder();
            folderId = new StringBuilder();
        }

        public void getUrl(String url) throws IOException {
            document = Jsoup.connect(url)
                    .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0")
                    .get();
            frameSrc = document.getElementsByTag("iframe").attr("src");
            openFolderUrl = frameSrc.replace("display_folder", "get_listing");
            openFolderUrl = openFolderUrl.replace("public_folder", "public_folder_ajax");
            downloadFileUrl = frameSrc.replace("display_folder", "get_download_url");

            System.out.println(frameSrc);
            System.out.println(openFolderUrl);
            System.out.println(downloadFileUrl);
            getRootFolder();

        }

        public void getRootFolder() throws IOException {

            document = Jsoup.connect(frameSrc)
                    .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0")
                    .get();

            getFileAndFilders();

            //getFolderById("686499839");
        }

        public void getFileAndFilders() {
            Elements mapElements = document.getElementsByTag("map");

            for (Element temp : mapElements) {
                //System.out.println(StringEscapeUtils.unescapeHtml4(temp.toString()));
                if (!temp.attr("folder_id").toString().contentEquals("")) {
    //                System.out.println("====>"  + temp.attr("folder_id").toString());
                    if (temp.attr("folder_id").toString().contains("\"")) {
                        folderId = new StringBuilder(temp.
                                attr("folder_id").toString().
                                substring(temp.attr("folder_id").toString().
                                        indexOf("\"") + 1,
                                        temp.attr("folder_id").toString().
                                        lastIndexOf("\"") - 1));
                        // Record the subfolder id as well, mirroring the file_id branch below.
                        folderTreeSet.add(folderId.toString());
                    } else {
                        folderTreeSet.add(temp.attr("folder_id").toString());
                    }
                } else if (!temp.attr("file_id").toString().contentEquals("")) {
    //                System.out.println("++++>"  + temp.attr("file_id").toString());
                    if (temp.attr("file_id").toString().contains("\"")) {
                        fileId = new StringBuilder(temp.
                                attr("file_id").toString().
                                substring(temp.attr("file_id").toString().
                                        indexOf("\"") + 1,
                                        temp.attr("file_id").toString().
                                        lastIndexOf("\"") - 1));
                        fileTreeSet.add(fileId.toString());
    //                    System.out.println(fileId);
                    } else {
                        fileTreeSet.add(temp.attr("file_id").toString());
                    }

                }
            }
        }

        public void getFolderById(String fid) throws IOException {
            document = Jsoup.connect(openFolderUrl)
                    .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0")
                    .data("folder_id", fid)
                    .data("open_folder_id", "")
                    .data("view", "list")
                    .data("column_number", "0")
                    .data("sort_term", "name")
                    .data("sort_direction", "asc")
                    .data("offset", "0")
                    .method(Connection.Method.POST)
                    .execute().parse();

            getFileAndFilders();
        }

        public String downloadFileById(String fileId) throws IOException {

            String link = Jsoup.connect(downloadFileUrl)
                    .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0")
                    .data("file_id", fileId)
                    .method(Connection.Method.POST)
                    .execute().parse().text();
            System.out.println(link);
            return link;
        }

        public static void main(String[] args) throws IOException {
            GoDaddyDownloader2 obj = new GoDaddyDownloader2();
            obj.getUrl("https://files.secureserver.net/0fHCh0CLd6Az63");

            //Contents of root directory
            Iterator i = obj.folderTreeSet.iterator();
            System.out.println("Folders");
            while (i.hasNext()) {
                String s = (String) i.next();
                System.out.println(s);
            }
            System.out.println("---------------");
            System.out.println("Files");
            i = obj.fileTreeSet.iterator();
            while (i.hasNext()) {
                String s = (String) i.next();
                System.out.println(s);
            }
            System.out.println("===============");

            //Adding Contents of first directory to TreeSet
            System.out.println("After adding contents of first directory");
            obj.getFolderById(obj.folderTreeSet.first().toString());
            System.out.println("Folders");
            i = obj.folderTreeSet.iterator();
            while (i.hasNext()) {
                String s = (String) i.next();
                System.out.println(s);
            }
            System.out.println("---------------");
            System.out.println("Files");
            i = obj.fileTreeSet.iterator();
            while (i.hasNext()) {
                String s = (String) i.next();
                System.out.println(s);
            }

            System.out.println("Generate file link");
            obj.downloadFileById(obj.fileTreeSet.first().toString());

        }
    }

I am using TreeSet to avoid duplicates.

The answer is in the question: since you want to do something recursively, the obvious way to do it is to use recursion. Something like the following pseudo-code:

    public Set<Thing> downloadEverything(Directory directory) {
        Set<Thing> result = new HashSet<>();
        downloadEverything(directory, result);
        return result;
    }

    private void downloadEverything(Directory directory, Set<Thing> result) {
        // First handle the files that sit directly in this directory.
        for (File file : getFilesOfDirectory(directory)) {
            result.add(downloadThingFromFile(file));
        }
        // Then recurse into each subdirectory, accumulating into the same set.
        for (Directory subDirectory : getSubdirectoriesOfDirectory(directory)) {
            downloadEverything(subDirectory, result);
        }
    }

    private Thing downloadThingFromFile(File file) {
        // ...
    }
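
To make the pseudo-code concrete, here is a minimal runnable sketch of the same recursion applied to ids instead of `Directory` objects. Everything in it is an assumption made for illustration: `RecursiveIdCollector`, `Listing`, `FolderSource` and the sample ids are hypothetical names, and the in-memory map only stands in for whatever request returns a folder's contents (in the question's code, something along the lines of `getFolderById`). The `visited` set is an extra guard so that a folder id that shows up twice is only listed once.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RecursiveIdCollector {

    /** Contents of one folder: ids of its files and ids of its direct subfolders. */
    static final class Listing {
        final List<String> fileIds;
        final List<String> folderIds;

        Listing(List<String> fileIds, List<String> folderIds) {
            this.fileIds = fileIds;
            this.folderIds = folderIds;
        }
    }

    /** Hypothetical abstraction over "fetch the contents of the folder with this id". */
    interface FolderSource {
        Listing list(String folderId);
    }

    /** Returns the ids of all files reachable from the given root folder. */
    static Set<String> collectFileIds(FolderSource source, String rootFolderId) {
        Set<String> fileIds = new TreeSet<>();      // sorted, no duplicates
        Set<String> visited = new HashSet<>();      // folder ids already listed
        collect(source, rootFolderId, fileIds, visited);
        return fileIds;
    }

    private static void collect(FolderSource source, String folderId,
                                Set<String> fileIds, Set<String> visited) {
        if (!visited.add(folderId)) {
            return; // this folder was already processed
        }
        Listing listing = source.list(folderId);
        fileIds.addAll(listing.fileIds);
        for (String subFolderId : listing.folderIds) {
            collect(source, subFolderId, fileIds, visited); // recurse into subfolder
        }
    }

    public static void main(String[] args) {
        // Made-up folder tree that replaces the real HTTP calls for this demo.
        Map<String, Listing> fake = new HashMap<>();
        fake.put("root", new Listing(Arrays.asList("f1"), Arrays.asList("a", "b")));
        fake.put("a", new Listing(Arrays.asList("f2"), Arrays.asList("c")));
        fake.put("b", new Listing(Arrays.asList("f3"), Collections.<String>emptyList()));
        fake.put("c", new Listing(Arrays.asList("f4"), Collections.<String>emptyList()));

        System.out.println(collectFileIds(fake::get, "root")); // prints [f1, f2, f3, f4]
    }
}
```

So no explicit tree data structure is needed: the call stack follows the folder hierarchy, and a single set collects the file ids. To use this against the real site, `FolderSource.list` would have to be implemented with the listing request and parsing from the question, returning the file ids and folder ids found in one folder.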
