Azure Databricks Jupyter Notebook Python and R in one cell
I have some code (mostly not my original code) that runs in my local Anaconda Jupyter Notebook environment. I need to scale up the processing, so I am looking into Azure Databricks. One section of the code runs a Python loop but uses an R library (stats), then passes the data through an R model (tbats). So a single Jupyter Notebook cell runs both Python and R code. Can this be done in Azure Databricks notebooks as well? I have only found documentation on switching languages from cell to cell.
In a previous cell I have:
%r
library(stats)
So the stats library is loaded (along with other R libraries). However, when I run the code below, I get:
NameError: name 'stats' is not defined
I am wondering whether this is caused by the way Databricks wants you to tell each cell which language you are using (e.g. %r, %python, etc.).
My Python code:
for customerid, dataForCustomer in original.groupby(by=['customer_id']):
    startYear = dataForCustomer.head(1).iloc[0].yr
    startMonth = dataForCustomer.head(1).iloc[0].mnth
    endYear = dataForCustomer.tail(1).iloc[0].yr
    endMonth = dataForCustomer.tail(1).iloc[0].mnth

    # Creating a time series object
    customerTS = stats.ts(dataForCustomer.usage.astype(int),
                          start=base.c(startYear, startMonth),
                          end=base.c(endYear, endMonth),
                          frequency=12)
    r.assign('customerTS', customerTS)

    ## Here comes the R code piece
    try:
        seasonal = r('''
        fit<-tbats(customerTS, seasonal.periods = 12,
        use.parallel = TRUE)
        fit$seasonal
        ''')
    except:
        seasonal = 1

    # APPEND DICTIONARY TO LIST (NOT DATA FRAME)
    df_list.append({'customer_id': customerid, 'seasonal': seasonal})
    print(f' {customerid} | {seasonal} ')

seasonal_output = pa.DataFrame(df_list)
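If you switch languages between cells in Databricks, variables defined in one language are not available in the other. A library(stats) call in a %r cell therefore does not define a Python variable named stats, which is why the Python cell raises the NameError.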
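The names r, base and stats used in the loop look like rpy2 objects, although the question does not show its imports, so treat that as an assumption. If so, they have to be created in the Python cell itself. Below is a minimal sketch, assuming rpy2 is installed on the cluster and R's forecast package (which provides tbats) is available to the R runtime. The helper names rpy2_available and load_r_handles are hypothetical and exist only for this illustration.

```python
import importlib.util


def rpy2_available() -> bool:
    """Return True if the rpy2 package can be found in this Python environment."""
    return importlib.util.find_spec("rpy2") is not None


def load_r_handles():
    """Create the Python-side names the loop expects: r, base and stats.

    Assumption: rpy2 is installed and R's 'forecast' package is available.
    The imports are done lazily so this module loads even without rpy2.
    """
    import rpy2.robjects as robjects
    from rpy2.robjects.packages import importr

    base = importr("base")    # exposes R's c() as base.c(...)
    stats = importr("stats")  # exposes R's ts() as stats.ts(...)
    # Attach forecast inside the embedded R session so that tbats()
    # is visible to R code run through r('...').
    robjects.r("library(forecast)")
    return robjects.r, base, stats
```

In the Python cell, before the loop, you would then call r, base, stats = load_r_handles(). A %r cell is not needed for this approach, because the R code runs inside the R session that rpy2 embeds in the Python process.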