Access Azure Blob Storage through R

I'm trying to use R to connect to Azure Blob storage, where I have some CSV files stored. I need to load them into a data frame and apply some transformations before writing them back to another Blob container. I'm doing this through Databricks so that I can ultimately call the notebook from Azure Data Factory and include it in a pipeline.

Databricks gives me a sample notebook in Python, where a connection can be made with the following code:

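# Storage account name, access key, and the CSV file's location in the "example" container
storage_account_name = "testname"
storage_account_access_key = "..."
file_location = "wasb://example@testname.blob.core.windows.net/testfile.csv"

# Register the account key with Spark so it can authenticate against Blob storage
spark.conf.set(
  "fs.azure.account.key." + storage_account_name + ".blob.core.windows.net",
  storage_account_access_key)

# Read the CSV into a Spark DataFrame, treating the first row as a header
df = spark.read.format("csv").load(file_location, header=True, inferSchema=True)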

Is there something similar in R? I can use the SparkR or sparklyr package if either one can load the file and place it in a Spark DataFrame.

I have been told that R cannot perform the actual mount. The workaround is to mount the container from another language, such as Python, and then read the file with the SparkR library, as shown below.

The two most commonly used libraries that provide an R interface to Spark are SparkR and sparklyr. Databricks notebooks and jobs support both packages, although you cannot use functions from both SparkR and sparklyr on the same object.

Mount using Python:

[Image: screenshot of the Python mount command]
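
For readers who want the mount step as text, here is a minimal sketch of what such a Python cell might look like. It is not taken from the original answer. It reuses the account name "testname" and container "example" from the question, leaves the access key elided, and assumes a hypothetical mount point /mnt/example. The helper build_mount_args is an illustrative name, not part of any Databricks API. The sketch uses the wasbs:// (TLS) scheme, whereas the question used wasb://.

```python
def build_mount_args(storage_account, container, access_key, mount_point):
    """Assemble keyword arguments for dbutils.fs.mount using account-key auth.

    build_mount_args is a hypothetical helper written for this sketch.
    """
    host = f"{storage_account}.blob.core.windows.net"
    return {
        # Container root; individual files appear under the mount point
        "source": f"wasbs://{container}@{host}",
        "mount_point": mount_point,
        # Same configuration key the question sets through spark.conf.set
        "extra_configs": {f"fs.azure.account.key.{host}": access_key},
    }


# Account and container names come from the question; the key stays elided
# and /mnt/example is an assumed mount point.
args = build_mount_args("testname", "example", "...", "/mnt/example")

# dbutils only exists inside a Databricks notebook, so skip the mount elsewhere.
if "dbutils" in globals():
    already_mounted = any(
        m.mountPoint == args["mount_point"] for m in dbutils.fs.mounts()
    )
    if not already_mounted:
        dbutils.fs.mount(**args)
```

Once the container is mounted, an R cell should be able to read the file by its mounted path, for example with SparkR's read.df("/mnt/example/testfile.csv", source = "csv", header = "true", inferSchema = "true"), assuming the same hypothetical mount point.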

Run the R notebook using the SparkR library:

[Image: screenshot of the R notebook reading the file with SparkR]
