Perform linear regression in R with data from SAP HANA database
I am trying to import a dataset into R to fit a linear regression model, but I am unsure about my code because I am new to R. The dataset has 5000+ rows, with column names and integer values such as:
power consumption  cputi  dbsu
132                25     654
The SQL code I wrote to call the R function is:
CREATE COLUMN TABLE "PREDICTIVE ANALYSIS" LIKE "ANAGAPPAN.POWER_CONSUMPTION" WITH NO DATA;
SELECT POWER_APP, POWER_DB,CPUTI,DBTI,DBSU
FROM "ANAGAPPAN.POWER_CONSUMPTION";
DROP PROCEDURE USE_LM;
CREATE PROCEDURE USE_LM( IN train "ANAGAPPAN.POWER_CONSUMPTION", OUT result "PREDICTIVE ANALYSIS")
LANGUAGE
RLANG AS
BEGIN
library(lm)
model_app <- lm( POWER_APP ~ CPUTI + DBTI + DBSU + KBYTES_TRANSFERRED, data = train )
colnames(datOut) <- c("POWER_APP", "CPUTI", "DBTI", "DBSU", "DBSU")
PREDICTIVE ANALYSIS <- as.data.frame( lm(model_App))
END;
The procedure is reported as created, but I am unable to run the linear model on the data. How do I invoke the linear model?
Although I'm not familiar with SAP products, I will have a stab at the R code, which I assume is everything between BEGIN and END;.
library(lm) is incorrect, as mentioned by @Olli. To access R's linear model capabilities, you do not have to load anything: lm() is provided by the stats package, which is attached by default (unless the default package list has been changed, for example through the R_DEFAULT_PACKAGES environment variable).
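A quick way to check this in a plain R session is sketched below. Nothing here is specific to SAP HANA; it only inspects where lm() comes from.

```r
# stats is normally on the search path as soon as R starts
"package:stats" %in% search()

# lm() is defined in the stats namespace, so no library() call is required
environmentName(environment(lm))
exists("lm")
```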
model_app <- lm( POWER_APP ~ CPUTI + DBTI + DBSU + KBYTES_TRANSFERRED, data = train )
appears to be OK, at least from a syntax point of view.
For
colnames(datOut) <- c("POWER_APP", "CPUTI", "DBTI", "DBSU", "DBSU")
I can't see where you define datOut. If this variable is not created by the database, it does not exist and R should complain along the lines of
Error in colnames(notExist) <- "x" : object 'notExist' not found
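The same failure can be reproduced in any R session. In this sketch, datOut is simply assumed never to have been assigned, and the error is caught so that the script keeps running:

```r
# Check whether the object is visible at all (FALSE if it was never created)
exists("datOut")

# Assigning column names to a missing object raises an error;
# tryCatch() captures its message instead of stopping the script
msg <- tryCatch({
  colnames(datOut) <- c("POWER_APP", "CPUTI", "DBTI", "DBSU")
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```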
I will assume you want to predict (means) based on a model. The line
PREDICTIVE ANALYSIS <- as.data.frame( lm(model_App))
will not work because R variable names should not contain spaces, as.data.frame will not work on an lm object, and model_App doesn't exist (notice the case: the model was stored as model_app). I think you should do something along the lines of
# based on http://help.sap.com/hana/sap_hana_r_integration_guide_en.pdf
# you have to specify variable result which will be exported to the database
result <- as.data.frame(predict(model_app))
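Put together, the body of the procedure might look roughly like the sketch below. It runs in a plain R session, so train is replaced by synthetic data that I made up; the column POWER_APP_FITTED is a hypothetical name, KBYTES_TRANSFERRED is left out for brevity, and I am assuming that the columns of result have to match the table type declared for the OUT parameter.

```r
set.seed(1)  # synthetic stand-in for the table passed in as `train`
train <- data.frame(
  CPUTI = runif(50, 0, 100),
  DBTI  = runif(50, 0, 100),
  DBSU  = runif(50, 0, 1000)
)
train$POWER_APP <- 100 + 0.5 * train$CPUTI + 0.2 * train$DBTI +
  0.01 * train$DBSU + rnorm(50)

# No library() call: lm() comes from the stats package
model_app <- lm(POWER_APP ~ CPUTI + DBTI + DBSU, data = train)

# `result` is a data frame holding the inputs next to the fitted values
result <- data.frame(train, POWER_APP_FITTED = as.numeric(predict(model_app)))
head(result)
```

Inside the database, the block that builds train would be dropped, since the table is supplied by the procedure's IN parameter.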
You can try it out:
x <- 1:10
y <- rnorm(10)
mdl <- lm(y ~ x)
as.data.frame(predict(mdl))
predict(mdl)
1 0.47866685
2 0.34418219
3 0.20969753
4 0.07521287
5 -0.05927180
6 -0.19375646
7 -0.32824112
8 -0.46272579
9 -0.59721045
10 -0.73169511
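The difference between coercing the model itself and coercing its predictions can be checked with the same toy model. This is only a sketch: the values are random, so the numbers will differ from run to run, and the newdata values 11 and 12 are arbitrary.

```r
x <- 1:10
y <- rnorm(10)
mdl <- lm(y ~ x)

# as.data.frame() has no method for "lm" objects, so the coercion fails;
# the error message is captured rather than stopping the script
coerced <- tryCatch(as.data.frame(mdl), error = function(e) conditionMessage(e))
print(coerced)

# predict() returns a named numeric vector, which converts cleanly
fitted_df <- as.data.frame(predict(mdl))
str(fitted_df)

# Predictions for new x values need a data frame with the same column name
new_pred <- predict(mdl, newdata = data.frame(x = 11:12))
print(new_pred)
```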