
Write R data as csv directly to s3

I would like to write a data.frame / data.table object to a bucket in AWS S3 as a CSV file directly, without first writing it to disk and then uploading it with the AWS CLI.

obj.to.write.s3 <- data.frame(cbind(x1=rnorm(1e6),x2=rnorm(1e6,5,10),x3=rnorm(1e6,20,1)))

At the moment I write the data to a CSV file first, upload it to an existing bucket, and then remove the local file:

fn <- 'new-file-name.csv'
write.csv(obj.to.write.s3,file=fn)
system(paste0('aws s3 cp ',fn,' s3://my-bucket-name/',fn))
system(paste0('rm ',fn))

I would like a function that writes directly to S3. Is that possible?

In aws.s3 0.2.2, the s3write_using() and s3read_using() functions were added.

They make things much simpler:

s3write_using(iris, FUN = write.csv,
                    bucket = "bucketname",
                    object = "objectname")
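
As a sketch of a full round trip with these two functions: extra arguments are forwarded to FUN, so row.names = FALSE can be handed to write.csv() to avoid the row-name column it writes by default. The wrapper name roundtrip_via_s3 and the bucket and object names are placeholders, and actually running it assumes aws.s3 is installed and AWS credentials are already configured.

```r
# Hypothetical helper: write a data frame to S3 as CSV and read it back.
# Assumes the aws.s3 package is installed and credentials are configured.
roundtrip_via_s3 <- function(df, object, bucket) {
  # Arguments other than FUN, object and bucket are forwarded to FUN,
  # so row.names = FALSE reaches write.csv().
  aws.s3::s3write_using(df, FUN = write.csv, row.names = FALSE,
                        object = object, bucket = bucket)
  # s3read_using() fetches the object and calls FUN on the downloaded copy.
  aws.s3::s3read_using(FUN = read.csv, object = object, bucket = bucket)
}

# Usage (placeholder names, needs network access):
# back <- roundtrip_via_s3(iris, object = "iris.csv", bucket = "bucketname")
```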

The easiest solution is to save the .csv to a tempfile(), which is purged automatically when you close your R session.

If you need to work only in memory, you can point write.csv() at a rawConnection:

# write to an in-memory raw connection
zz <- rawConnection(raw(0), "r+")
write.csv(iris, zz)

# upload the object to S3
aws.s3::put_object(file = rawConnectionValue(zz),
    bucket = "bucketname", object = "iris.csv")

# close the connection
close(zz)
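
The same steps can be wrapped in a small function so that the connection is always closed, and the result can be checked locally before anything is uploaded. This is only a sketch: the name df_to_csv_raw and the toy data frame are made up for illustration.

```r
# Hypothetical helper: serialise a data frame to CSV bytes without touching disk.
df_to_csv_raw <- function(df, ...) {
  con <- rawConnection(raw(0), "r+")
  on.exit(close(con))          # release the connection even if write.csv() fails
  write.csv(df, con, ...)
  rawConnectionValue(con)      # raw vector holding the CSV text
}

df <- data.frame(x = c(1.5, 2, 3), y = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
csv_bytes <- df_to_csv_raw(df, row.names = FALSE)

# Local check: parse the bytes back into a data frame, no upload involved.
back <- read.csv(text = rawToChar(csv_bytes), stringsAsFactors = FALSE)
```

The resulting raw vector can then be passed to put_object() in place of rawConnectionValue(zz).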

In case you're unsure, you can then check that this worked correctly by downloading the object from S3 and reading it back into R:

# check that it worked
## (option 1: save locally)
save_object(object = "iris.csv", bucket = "bucketname", file = "iris.csv")
read.csv("iris.csv")
## (option 2: keep in memory)
read.csv(text = rawToChar(get_object(object = "iris.csv", bucket = "bucketname")))

Sure -- but 'saving to file' requires that your OS sees the desired target directory as an accessible filesystem. So in essence you "just" need to mount S3; a quick web search on that topic turns up the available options.

An alternative is writing to a temporary file, and then using whatever you use to transfer files. You could code up both operations as a simple helper function.
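
Such a helper might look like the sketch below. The function name write_csv_to_s3 and its upload argument are invented for illustration; upload defaults to aws.s3::put_object, but any function that takes the local path as its first argument plus object and bucket would work, which also makes the helper easy to try out without network access.

```r
# Hypothetical helper: write a CSV to a temporary file, hand it to an uploader,
# and remove the file afterwards.
# `upload` defaults to aws.s3::put_object (an assumption that aws.s3 is the
# transfer tool); the default is only evaluated if no uploader is supplied.
write_csv_to_s3 <- function(df, object, bucket,
                            upload = aws.s3::put_object, ...) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))              # clean up even if the upload fails
  write.csv(df, tmp, row.names = FALSE)
  upload(tmp, object = object, bucket = bucket, ...)
}

# Usage (placeholder names, needs credentials and network access):
# write_csv_to_s3(obj.to.write.s3, "new-file-name.csv", "my-bucket-name")
```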
