Azure Databricks Jupyter Notebook: Python and R in one cell

Question

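I have some code (mostly not my original code) that I run locally on my PC in an Anaconda Jupyter Notebook environment. I need to scale up the processing, so I am looking into Azure Databricks. One section of the code runs a Python loop but uses an R library (stats), then passes the data through an R model (tbats). So a single Jupyter Notebook cell runs both Python and R code. Can this also be done in an Azure Databricks notebook? I have only found documentation that lets you change languages from one cell to the next.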

In a previous cell, I have:

%r library(stats)

So the stats library is imported (along with other R libraries). However, when I run the code below, I get

NameError: name 'stats' is not defined

I am wondering whether this is because Databricks wants you to tell each cell which language you are using (e.g. %r, %python, etc.).

My Python code:

for customerid, dataForCustomer in original.groupby(by=['customer_id']):
    startYear = dataForCustomer.head(1).iloc[0].yr
    startMonth = dataForCustomer.head(1).iloc[0].mnth
    endYear = dataForCustomer.tail(1).iloc[0].yr
    endMonth = dataForCustomer.tail(1).iloc[0].mnth

    #Creating a time series object
    customerTS = stats.ts(dataForCustomer.usage.astype(int),
                      start=base.c(startYear,startMonth),
                      end=base.c(endYear, endMonth), 
                      frequency=12)
    r.assign('customerTS', customerTS)

    ##Here comes the R code piece
    try:
        seasonal = r('''
                    fit<-tbats(customerTS, seasonal.periods = 12, 
                                    use.parallel = TRUE)
                    fit$seasonal
                 ''')
    except: 
        seasonal = 1

    # APPEND DICTIONARY TO LIST (NOT DATA FRAME)
    df_list.append({'customer_id': customerid, 'seasonal': seasonal})
    print(f' {customerid} | {seasonal} ')

seasonal_output = pa.DataFrame(df_list)

Answer 1

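If you switch languages in Databricks, the variables from the previous language are not available. Each language in a notebook runs in its own interpreter, so calling library(stats) in a %r cell does not define a Python name called stats, which is why the Python cell raises the NameError.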
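
The names `stats`, `base` and `r` in the question look like rpy2 objects, so one possible way to keep everything in a single Python cell is to create them from Python rather than from a %r cell. The following is only a sketch under assumptions the question does not confirm: that rpy2 and R are installed on the cluster, and that the loop was originally written against rpy2. The sample numbers are made up for illustration. As far as I know, rpy2 runs its own embedded R session inside the Python process, separate from the notebook's %r interpreter.

```python
# Sketch only: assumes rpy2 and R are installed on the cluster
# (not confirmed by the question).
try:
    from rpy2.robjects import r
    from rpy2.robjects.packages import importr
    HAVE_RPY2 = True
except Exception:
    # rpy2 or R is missing; the rest of the sketch is skipped.
    HAVE_RPY2 = False

n_obs = None
if HAVE_RPY2:
    # These Python names replace what library(...) did in the %r cell.
    base = importr("base")
    stats = importr("stats")
    # tbats() comes from the R 'forecast' package; if it is installed,
    # it could be loaded into the embedded session with r('library(forecast)').

    # Made-up monthly usage values starting in January 2020.
    usage = base.c(5, 7, 9, 6)
    customerTS = stats.ts(usage, start=base.c(2020, 1), frequency=12)

    # Make the series visible to R code run through r('...').
    r.assign("customerTS", customerTS)
    n_obs = int(r("length(customerTS)")[0])

print("rpy2 available:", HAVE_RPY2, "| observations seen by R:", n_obs)
```

If this runs on your cluster, the loop from the question should be able to use `stats`, `base` and `r` in the same way, without any %r cell.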

0 2022-03-11 19:31:24
