Download an online folder with Windows 10
I wish to download an online folder using Windows 10 on my Dell laptop. In this example the folder I wish to download is named Targetfolder. I am trying to use the Command Window but am also wondering whether there is a simple solution in R. I have included an image at the bottom of this post showing the target folder. I should add that Targetfolder includes a file and multiple subfolders containing files. Not all files have the same extension. Also, please note this is a hypothetical site; I did not want to include the real site for privacy reasons.
EDIT
Here is a real site that can serve as a functional, reproducible example. The folder rel2020 can take the place of the hypothetical Targetfolder:

https://www2.census.gov/geo/docs/maps-data/data/rel2020/
None of the answers here seem to work with Targetfolder:

How to download HTTP directory with all files and sub-directories as they appear on the online files/folders list?
Below are my attempts, based on the answers posted at the link above, and the results I obtained:
Attempt One
lftp -c 'mirror --parallel=300 https://www.examplengo.org/datadisk/examplefolder/userdirs/user3/Targetfolder/ ;exit'
Returned:
lftp is not recognized as an internal or external command, operable program or batch file.
Attempt Two
wget -r -np -nH --cut-dirs=3 -R index.html https://www.examplengo.org/datadisk/examplefolder/userdirs/user3/Targetfolder/
Returned:
wget is not recognized as an internal or external command, operable program or batch file.
Attempt Three
https://sourceforge.net/projects/visualwget/files/latest/download
VisualWget returned Unsupported scheme next to the URL.
Here is a way with the packages httr and rvest.
First, get the folders where the files are from the link.
Then loop through the folders with Map, getting the filenames and downloading them in a lapply loop.
If errors such as time-out conditions occur, they will be trapped by tryCatch. The last code lines will tell whether and where there were errors.
Note: I only downloaded from folders[1:2]; in the Map below, change this to folders.
suppressPackageStartupMessages({
  library(httr)
  library(rvest)
  library(dplyr)
})

# Landing page that lists the sub-folders of rel2020
link <- "https://www2.census.gov/geo/docs/maps-data/data/rel2020/"
page <- read_html(link)

# Extract the sub-folder links and turn them into full URLs
folders <- page %>%
  html_elements("a") %>%
  html_attr("href") %>%
  .[8:14] %>%
  paste0(link, .)

# For each folder, list its *.txt files and download them,
# trapping any download errors with tryCatch
files_txt <- Map(\(x) {
  x %>%
    read_html() %>%
    html_elements("a") %>%
    html_attr("href") %>%
    grep("\\.txt$", ., value = TRUE) %>%
    paste0(x, .) %>%
    lapply(\(y) {
      tryCatch(
        download.file(y, destfile = file.path("~/Temp", basename(y))),
        error = function(e) e
      )
    })
}, folders[1:2])

# Check which downloads, if any, failed
err <- sapply(unlist(files_txt, recursive = FALSE), inherits, "error")
lapply(unlist(files_txt, recursive = FALSE)[err], simpleError)
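As a hedged follow-up sketch (it reuses the ~/Temp destination and the files_txt object from the code above; dest_dir and ok_per_folder are names introduced here only for illustration), it can help to make sure the destination directory exists before running the loop and then to count how many files were fetched per folder:

# Sketch only: create the destination directory used above if it is missing
dest_dir <- path.expand("~/Temp")
if (!dir.exists(dest_dir)) dir.create(dest_dir, recursive = TRUE)

# After the Map() loop has run, count successful downloads per folder;
# download.file() returns 0 on success, errors were stored as condition objects
ok_per_folder <- sapply(files_txt, \(res) sum(!vapply(res, inherits, logical(1), "error")))
ok_per_folder

On Windows, ~ normally expands to the user's Documents folder in R, so ~/Temp may not be where you expect; adjust destfile in the main code if you want the files somewhere else.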