What is an easy way to access (read and write) files in blob storage from R scripts in Azure Machine Learning?
I can access files in blob storage from Python scripts using the azure modules, but there seems to be no easy way to do so from R scripts.
I tried importing AzureSMR as a zip file in the R script, but importing all of its dependencies is very tedious:
https://github.com/Microsoft/AzureSMR
Any suggestions or help would be appreciated.
It sounds like you already know how to install and use R packages on Azure ML. If not, please see the document Installing R package in Azure Machine Learning and try again with the R package.
In my experience, the R package AzureSMR is designed for resource management in general, not only for Azure Storage. So it is not a good fit for Azure ML: you would need to do extra work, such as registering an app in Azure AD, before code that uses its APIs will work.
My suggestion is to call the REST APIs of Azure Blob Storage using the R package httr in the Execute R Script module of Azure ML. You can refer to the SO thread Azure PUT Blob authentication fails in R to see how to do this. Meanwhile, the source code of AzureSMR is a valuable reference for reusing or rewriting the common functions for authentication and blob CRUD operations.
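The authentication step mentioned above comes down to signing a description of the request with the storage account key. A minimal sketch, assuming the digest and RCurl packages are available in the Execute R Script environment; `sign_shared_key` and the dummy key are hypothetical names used for illustration and are not part of httr or AzureSMR:

```r
# Requires the digest and RCurl packages (called with :: below).

# Hypothetical helper: build the value of the Authorization header
# for Shared Key authentication.
#   string_to_sign: the newline-separated request description
#   account:        storage account name
#   key_b64:        base64-encoded account key, as shown in the portal
sign_shared_key <- function(string_to_sign, account, key_b64) {
  # The account key is stored base64-encoded; HMAC needs the raw bytes.
  raw_key <- RCurl::base64Decode(key_b64, mode = "raw")
  # HMAC-SHA256 over the UTF-8 string-to-sign, returned as raw bytes.
  mac <- digest::hmac(key = raw_key,
                      object = enc2utf8(string_to_sign),
                      algo = "sha256", raw = TRUE)
  # The header value is "SharedKey <account>:<base64 signature>".
  paste0("SharedKey ", account, ":", RCurl::base64(mac))
}

# Example with a dummy key ("c2VjcmV0" is base64 for "secret").
auth <- sign_shared_key("PUT\n\n", "myaccount", "c2VjcmV0")
```

The resulting string would be passed as the `Authorization` header through `httr::add_headers()`, together with the same `x-ms-date` and `x-ms-version` values that went into the string-to-sign.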
Hope it helps. If you have any concerns, please feel free to let me know.
Thank you for your suggestion, Peter Pan.
I followed Azure PUT Blob authentication fails in R. However, the script failed to run. The error message was:
error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list
I thought the problem might be related to HTTPS access. (I say this because when I used a Python script to access blob storage in Azure ML, I had also followed Access Azure blog storage from within an Azure ML experiment.)
The same problem for R is described in Error:1411809D:SSL routines - When trying to make https call from inside R module in AzureML.
So I changed https to http. But then the script tries to access the blob storage many times and never finishes running. I can see the request count increase sharply in the storage account's metrics in the Azure portal.
My code is essentially the same as in Azure PUT Blob authentication fails in R, except that the request URL was changed to http.
The script is below.
library(httr)

account <- "accountname"
container <- "containrname"
filename <- "test.txt"
key <- "8FS+3i9eXx....r54Gl97F0nVwyDcV7lXbcWhmQ=="
object <- "Hello World"

url <- paste0("http://", account, ".blob.core.windows.net/", container,
              "/", filename)
requestdate <- format(Sys.time(), "%a, %d %b %Y %H:%M:%S %Z", tz = "GMT")
content_length <- nchar(object, type = "bytes")

# String-to-sign for Shared Key authentication; empty slots stay empty.
signature_string <- paste0("PUT", "\n",            # HTTP Verb
                           "\n",                   # Content-Encoding
                           "\n",                   # Content-Language
                           content_length, "\n",   # Content-Length
                           "\n",                   # Content-MD5
                           "text/plain", "\n",     # Content-Type
                           "\n",                   # Date
                           "\n",                   # If-Modified-Since
                           "\n",                   # If-Match
                           "\n",                   # If-None-Match
                           "\n",                   # If-Unmodified-Since
                           "\n",                   # Range
                           # Here comes the Canonicalized Headers
                           "x-ms-blob-type:BlockBlob", "\n",
                           "x-ms-date:", requestdate, "\n",
                           "x-ms-version:2015-02-21", "\n",
                           # Here comes the Canonicalized Resource
                           "/", account, "/", container, "/", filename)

# Sign with HMAC-SHA256 using the base64-decoded account key.
headerstuff <- add_headers(Authorization = paste0("SharedKey ", account, ":",
                             RCurl::base64(digest::hmac(
                               key = RCurl::base64Decode(key, mode = "raw"),
                               object = enc2utf8(signature_string),
                               algo = "sha256", raw = TRUE))),
                           `Content-Length` = content_length,
                           `x-ms-date` = requestdate,
                           `x-ms-version` = "2015-02-21",
                           `x-ms-blob-type` = "BlockBlob",
                           `Content-Type` = "text/plain")

content(PUT(url, config = headerstuff, body = object, verbose()), as = "text")
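Since the question asks about reading as well as writing, the same Shared Key scheme can be sketched for a Get Blob request. This is only a sketch under assumptions and has not been verified inside Azure ML: `build_get_string_to_sign` and `read_blob_text` are hypothetical helper names, the `scheme` argument and the 30-second timeout are illustrative choices, and for a GET without a body every standard header slot in the string-to-sign is left empty.

```r
# Requires the httr, digest and RCurl packages (called with :: below).

# Hypothetical helper: string-to-sign for a Get Blob request.
# After the verb come 11 empty header slots (Content-Encoding ... Range),
# then the canonicalized headers and the canonicalized resource.
build_get_string_to_sign <- function(account, container, filename,
                                     requestdate, version = "2015-02-21") {
  paste0("GET\n",
         paste(rep("\n", 11), collapse = ""),
         "x-ms-date:", requestdate, "\n",
         "x-ms-version:", version, "\n",
         "/", account, "/", container, "/", filename)
}

# Hypothetical helper: download a blob and return its body as text.
read_blob_text <- function(account, container, filename, key,
                           scheme = "https") {
  url <- paste0(scheme, "://", account, ".blob.core.windows.net/",
                container, "/", filename)
  requestdate <- format(Sys.time(), "%a, %d %b %Y %H:%M:%S GMT", tz = "GMT")
  string_to_sign <- build_get_string_to_sign(account, container,
                                             filename, requestdate)
  # Same signing step as in the PUT script above.
  signature <- RCurl::base64(digest::hmac(
    key = RCurl::base64Decode(key, mode = "raw"),
    object = enc2utf8(string_to_sign),
    algo = "sha256", raw = TRUE))
  response <- httr::GET(
    url,
    httr::add_headers(Authorization = paste0("SharedKey ", account, ":",
                                             signature),
                      `x-ms-date` = requestdate,
                      `x-ms-version` = "2015-02-21"),
    httr::timeout(30))               # give up instead of waiting forever
  httr::stop_for_status(response)    # turn HTTP errors into R errors
  httr::content(response, as = "text", encoding = "UTF-8")
}
```

A call such as `read_blob_text(account, container, filename, key)` should then return the blob body as a single string. The timeout and the status check are there so that a failing request surfaces as an error rather than leaving the script running.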