
Azure Databricks Jupyter Notebook: Python and R in one cell

I have some code (mostly not originally mine) that runs in my local Anaconda Jupyter Notebook environment. I need to scale up the processing, so I am looking into Azure Databricks. One section of code runs a Python loop but uses an R library (stats), then passes the data through an R model (tbats). So a single Jupyter Notebook cell runs both Python and R code. Can this be done in Azure Databricks notebooks as well? I have only found documentation for changing languages from cell to cell.

In a previous cell I have:

%r
library(stats)

So the stats library is imported (along with other R libraries). However, when I run the code below, I get:

NameError: name 'stats' is not defined

I am wondering if this is related to the way Databricks has you declare a cell's language (e.g. %r, %python, etc.).

My Python code:

# Assumed setup: this code calls R from Python via rpy2, so these
# imports are needed (tbats() comes from R's 'forecast' package)
import pandas as pa
from rpy2.robjects import r
from rpy2.robjects.packages import importr

stats = importr('stats')        # R 'stats' package (for ts())
base = importr('base')          # R 'base' package (for c())
forecast = importr('forecast')  # R 'forecast' package (for tbats())

df_list = []
for customerid, dataForCustomer in original.groupby(by=['customer_id']):
    startYear = dataForCustomer.head(1).iloc[0].yr
    startMonth = dataForCustomer.head(1).iloc[0].mnth
    endYear = dataForCustomer.tail(1).iloc[0].yr
    endMonth = dataForCustomer.tail(1).iloc[0].mnth

    # Creating a time series object
    customerTS = stats.ts(dataForCustomer.usage.astype(int),
                          start=base.c(startYear, startMonth),
                          end=base.c(endYear, endMonth),
                          frequency=12)
    r.assign('customerTS', customerTS)

    ## Here comes the R code piece
    try:
        seasonal = r('''
                    fit <- tbats(customerTS, seasonal.periods = 12,
                                 use.parallel = TRUE)
                    fit$seasonal
                 ''')
    except Exception:
        seasonal = 1

    # APPEND DICTIONARY TO LIST (NOT DATA FRAME)
    df_list.append({'customer_id': customerid, 'seasonal': seasonal})
    print(f' {customerid} | {seasonal} ')

seasonal_output = pa.DataFrame(df_list)

If you change languages between cells in Databricks, you cannot access the variables of the previous language: each language cell runs in its own interpreter, so state does not carry over from a %python cell to an %r cell or vice versa.
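Because the interpreters are separate, the usual pattern for moving data between a Python cell and an R cell in Databricks is to go through a Spark temp view, which every notebook language can query. A minimal sketch (assumes a Databricks notebook with an active `spark` session; `usage_pdf` is a hypothetical pandas DataFrame):

```
# Cell 1 (%python): publish the data as a temp view
spark_df = spark.createDataFrame(usage_pdf)
spark_df.createOrReplaceTempView("customer_usage")
```

```
# Cell 2 (%r): read the same data back as an ordinary R data.frame
library(SparkR)
sdf <- sql("SELECT * FROM customer_usage")
rdf <- collect(sdf)   # rdf can now be fed to ts()/tbats() in R
```

This shares data across cells, but it does not interleave the two languages inside one loop iteration; for that, calling R from a Python cell via rpy2 (as in the code above) is the closer fit, provided rpy2 and R are installed on the cluster.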

