简体   繁体   中英

Import excel from Azure blob using R

I have the basic setup done following the link below:

http://htmlpreview.github.io/?https://github.com/Microsoft/AzureSMR/blob/master/inst/doc/tutorial.html

There is a method 'azureGetBlob' which allows you to retrieve objects from the containers. however, it seems to only allow "raw" and "text" format which is not very useful for excel. I've tested the connections and etc, I can retrieve .txt / .csv files but not .xlsx files.

Does anyone know any workaround for this?

Thanks

Does anyone know any workaround for this?

There is no file type on the azure blob storage, it is just a blob name. The extension type is known for OS. If we want to open the excel file in the r, we could use the 3rd library to do that such as readXl .

Work around:

You could use the get blob api to download the blob file to local path then use readXl to read the file. We also get could more demo code from this link .

# install
install.packages("readxl")
# Loading
library("readxl")
# xls files
my_data <- read_excel("my_file.xls")
# xlsx files
my_data <- read_excel("my_file.xlsx")

Solved with the following code. Basically, read the file in byte then wrote the file to disk then read it into R

excel_bytes <- azureGetBlob(sc, storageAccount = "accountname", container = "containername", blob=blob_name, type="raw") 
q <- tempfile()
f <- file(q, 'wb')
writeBin(excel_bytes, f)
close(f)
result <- read.xlsx(q, sheetIndex = sheetIndex)
unlink(q)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM