
Download & decompress JSON file in R using curl or RCurl

I have the following bash script to download & decompress a JSON file:

#!/bin/sh -ex

# Ensure data directory (or a link) exists.
test -e results || mkdir results

# Download and decompress data.
curl -u $GISAID_USERNAME:$GISAID_PASSWORD --retry 4 \
  https://www.epicov.org/epi3/3p/$GISAID_FEED/export/provision.json.xz \
  | xz -d -T8 > results/gisaid.json

Ideally I would like to have an R function that downloads & decompresses this file into a given directory, with the environment variables above ($GISAID_USERNAME, $GISAID_PASSWORD & $GISAID_FEED) passed as arguments. Would anyone know how to accomplish this, e.g. using the package curl or RCurl? It would also be OK not to decompress it and to leave it as .json.xz, as I would be reading the file later using:

library(jsonlite)
GISAID_json <- jsonlite::stream_in(gzfile(".//data//GISAID_json//provision.json.xz"))

I don't really know if this solves your problem, but is there anything that speaks against simply executing your terminal commands from R using the system function?

So just put your terminal call into system() and it should execute and create your file. You would have to replace $GISAID_USERNAME and $GISAID_PASSWORD with your actual information, unless (on Unix-alikes) those variables are already set in the environment of the R session, in which case the shell expands them. If the login information or the URL should be flexible, you can put together a string beforehand, since system() expects a single string containing the command to execute.

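# Build the command as one line: a line break inside the R string would
# otherwise reach the shell as a newline and split the pipeline.
system(paste(
  "curl -u $GISAID_USERNAME:$GISAID_PASSWORD --retry 4",
  "https://www.epicov.org/epi3/3p/$GISAID_FEED/export/provision.json.xz",
  "| xz -d -T8 > results/gisaid.json"
))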

Afterwards just read in the (hopefully) created file.

I couldn't test with your setup, but for me this small example, for instance, successfully creates a file:

system("curl https://raw.githubusercontent.com/SteffenMoritz/imputeTS/master/pkgdown/favicon/favicon.ico > /Users/Steve/Downloads/x.ico")
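To make the login information and feed flexible, as suggested above, the command string can be assembled from function arguments. The following is only a sketch and untested against the real endpoint: the helper names build_gisaid_command and download_gisaid are made up for illustration, and it assumes curl and xz are on the PATH and that system() runs the command through a Bourne-style shell.

```r
# Hypothetical helper: assemble the shell pipeline from R arguments.
# shQuote() protects each piece from being reinterpreted by the shell.
build_gisaid_command <- function(user, pass, feed,
                                 dest = "results/gisaid.json") {
  url <- sprintf(
    "https://www.epicov.org/epi3/3p/%s/export/provision.json.xz", feed
  )
  paste(
    "curl -u", shQuote(paste0(user, ":", pass), type = "sh"),
    "--retry 4", shQuote(url, type = "sh"),
    "| xz -d -T8 >", shQuote(dest, type = "sh")
  )
}

# Hypothetical wrapper: create the target directory, then run the pipeline.
download_gisaid <- function(user, pass, feed,
                            dest = "results/gisaid.json") {
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  status <- system(build_gisaid_command(user, pass, feed, dest))
  # The status is that of the last command in the pipeline (xz), so a
  # failing curl is only noticed indirectly.
  if (status != 0) stop("pipeline failed with exit status ", status)
  invisible(dest)
}

# Example call, taking the values from the existing environment variables:
# download_gisaid(Sys.getenv("GISAID_USERNAME"),
#                 Sys.getenv("GISAID_PASSWORD"),
#                 Sys.getenv("GISAID_FEED"))
```

Keeping the string construction in its own function makes it easy to print the command and inspect it before anything is actually run.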

Something like this should work:

library(curl)
library(glue)

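custom_curl <- function(user, pass, feed, dest) {
  # Handle carrying the login credentials (the equivalent of curl -u user:pass)
  custom_handle <- curl::new_handle()
  curl::handle_setopt(
    custom_handle,
    username = user,
    password = pass
  )

  # Insert the feed name into the download URL
  url <- glue::glue(
    "https://www.epicov.org/epi3/3p/{feed}/export/provision.json.xz"
  )

  # Make sure the destination directory exists, then download the file,
  # which is left compressed as .json.xz
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  curl::curl_download(url, dest, handle = custom_handle)
}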

custom_curl('my_user', 'xxxxxx', 'feed1', 'dest/filename.json.xz')

As I can't test against the real files and URL, I'm not sure whether the function needs a little tinkering, but it is at least a starting point for you.
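The function above leaves the file compressed. If the decompressed JSON is wanted on disk as well, base R's xzfile() connection can do the job of xz -d. This is a minimal sketch, assuming an R build with xz support; decompress_xz and chunk_size are names made up for this example.

```r
# Hypothetical helper: stream-decompress an .xz file to dest in chunks,
# so the whole file never has to fit in memory.
decompress_xz <- function(src, dest, chunk_size = 1024 * 1024) {
  input <- xzfile(src, open = "rb")
  on.exit(close(input), add = TRUE)
  output <- file(dest, open = "wb")
  on.exit(close(output), add = TRUE)

  repeat {
    chunk <- readBin(input, what = "raw", n = chunk_size)
    if (length(chunk) == 0) break  # end of the compressed stream
    writeBin(chunk, output)
  }
  invisible(dest)
}

# Example call after the download:
# decompress_xz("dest/filename.json.xz", "dest/filename.json")
```

Unlike xz -d -T8 this runs single-threaded, so for a very large file the shell pipeline may well be faster.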


 