
Unzipping a file in R on a Mac

I've reviewed multiple StackOverflow questions and answers and still can't get a .zip file downloaded, unzipped, and loaded using only R.

When I download the .zip folder manually, I see that it contains multiple files. One of them, loan.csv, is the file I need to analyze in R.

#set wd
wd <- "/Users/myname/Documents/zip_folder"

zip_url <- "https://www.kaggle.com/wendykan/lending-club-loan-data/downloads/lending-club-loan-data.zip"

I'm getting an error with the first answer I found here:

temp <- tempfile()
download.file(zip_url, temp)
data <- read.table(unz(temp, "loan.csv"))
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") :
  cannot open zip file '/var/folders/b1/d481ykzd3j14kr8nkx8kn83m0000gn/T//RtmpcjmrIa/file932f730721c5'

Replacing read.table with fread gives a different error:

Error in fread(unz(temp, "loan.csv")) : 
  'input' must be a single character string containing a file name, a command, full path to a file, a URL starting 'http[s]://', 'ftp[s]://' or 'file://', or the input data itself

I'm also getting an error using the 5th answer (Mac specific) to the SO question hyperlinked above:

loans <- fread("curl https://www.kaggle.com/wendykan/lending-club-loan-data/downloads/lending-club-loan-data.zip | tar -xf- --to-stdout *loan.csv")

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   149  100   149    0     0    334      0 --:--:-- --:--:-- --:--:--   334
tar: Unrecognized archive format
tar: *loans.csv: Not found in archive
tar: Error exit delayed from previous errors.

Error in fread("curl https://www.kaggle.com/wendykan/lending-club-loan-data/downloads/lending-club-loan-data.zip | tar -xf- --to-stdout *loans.csv") : 
  File is empty: /var/folders/b1/d481ykzd3j14kr8nkx8kn83m0000gn/T//RtmpcjmrIa/file932f299c7cc4

The attempts fail for different reasons:

  1. fread doesn't work with unz: as the error message says, fread expects a single character string (a file name, command, URL, or the data itself), not a connection. A unz connection does work with read.table.
  2. fread does work with more extensive shell commands, but you cannot untar a ZIP file because it's not a TAR archive. You can use funzip, as suggested in the same answer (but only if your ZIP archive contains just a single file).

… you could also simply use the unzip R function.
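
As a minimal sketch of the unzip route, the helper below lists the archive contents, extracts one named file, and reads it with read.csv. The function name read_csv_from_zip and its arguments are hypothetical, chosen only for illustration. It assumes a valid ZIP archive is already on disk. The curl output in the question reports only 149 bytes transferred, which suggests the URL may not have returned the archive itself, so it is worth confirming the download before unzipping.

```r
# Hypothetical helper (not part of any package): extract one named file
# from a local ZIP archive and read it as a CSV.
read_csv_from_zip <- function(zip_path, member, exdir = tempfile("unzipped_")) {
  # List the archive contents first so a missing member fails with a clear message.
  contents <- utils::unzip(zip_path, list = TRUE)
  if (!member %in% contents$Name) {
    stop("'", member, "' not found in archive; it contains: ",
         paste(contents$Name, collapse = ", "))
  }
  # Extract only the requested file; unzip() returns the extracted path(s).
  extracted <- utils::unzip(zip_path, files = member, exdir = exdir)
  utils::read.csv(extracted[1], stringsAsFactors = FALSE)
}

# Usage, assuming zip_url really returns a ZIP archive:
# temp <- tempfile(fileext = ".zip")
# download.file(zip_url, temp, mode = "wb")  # "wb" keeps the binary file intact
# loans <- read_csv_from_zip(temp, "loan.csv")
```

Because the file is extracted to disk, its path could equally be passed to fread instead of read.csv, which avoids the connection problem described in point 1.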
