
Bayesian Modelling in R

I am trying to implement a Bayesian model in R using the BAS package, with these settings for my model:

databas <- bas.lm(at_areabuilding ~ ., data = dataCOMMA, method = "MCMC", prior = "ZS-null", modelprior = uniform())

I am trying to predict the area for a given state from the areas recorded for that state, but for different zip codes. My model finds the zip codes present in the data for a given state (using a state index) and then gives the output.

Whenever I try to predict the area for a state, I give this input:

> UT <- data.frame(zip = 84321, loc_st_prov_cd = "UT" ,state_idx = 7)
> predict_1 <- predict(databas,UT, estimator="BMA", interval = "predict", se.fit=TRUE)
> data.frame('state' = 'UT','estimated area' = predict_1$Ybma)

This gives me the output for that one state. Suppose I have a list of states with given zip codes and I want to run my model (databas) on that list and get the predictions. Repeating the approach above for each one would take too long. Is there another way to do this? With someone's help I came up with this code:

 pred <- sapply(1:nrow(first), function(row) { predict(basdata,first[row, ],estimator="BMA", interval = "predict", se.fit=TRUE)$Ybma })

Here basdata is my model and first is the new dataset for which I am predicting area. The problem is that this code takes a long time to predict the values: it iterates over every row and calculates the area for each one. There are 150000 rows in my dataset, and I would appreciate any help optimizing the performance of this code.

Something like this will iterate over each row of your data frame of states, zips and indices (let's call it states_and_zips) and return a list of predictions. Each element of this list (which I've called pred) goes with the corresponding row of states_and_zips:

# One predict() call per row; newdata must be a data frame, not a formula
pred <- lapply(seq_len(nrow(states_and_zips)), function(row) {
  predict(databas, states_and_zips[row, ],
          estimator = "BMA", interval = "predict", se.fit = TRUE)$Ybma
})

If Ybma is a single value, use sapply instead of lapply and it will return a vector of predictions, one for each row of states_and_zips, which you can add as a new column to states_and_zips.
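Since the question is about speed, it may help to see the difference between calling predict() once per row and calling it once on the whole data frame. The sketch below is only an illustration under stated assumptions: it uses lm from base R as a stand-in for the bas.lm fit so that it runs without extra packages, and the data, the column values and the names train, new_rows, pred_loop and pred_all are all invented for the example.

```r
# Toy stand-in for the real data: every value here is made up.
set.seed(1)
train <- data.frame(zip = runif(200, 10000, 99999),
                    state_idx = sample(1:10, 200, replace = TRUE))
train$at_areabuilding <- 500 + 0.01 * train$zip +
  20 * train$state_idx + rnorm(200)

# lm() is used only as a stand-in for the bas.lm() model in the question.
fit <- lm(at_areabuilding ~ ., data = train)

# New rows to predict for (1000 here; the question has 150000).
new_rows <- data.frame(zip = runif(1000, 10000, 99999),
                       state_idx = sample(1:10, 1000, replace = TRUE))

# Row-by-row: one predict() call per row, as in the sapply approach.
pred_loop <- sapply(seq_len(nrow(new_rows)), function(row) {
  predict(fit, new_rows[row, ])
})

# All at once: a single predict() call on the full data frame.
pred_all <- predict(fit, newdata = new_rows)

# Both approaches give the same numbers; attach them as a new column.
stopifnot(isTRUE(all.equal(unname(pred_loop), unname(pred_all))))
new_rows$estimated_area <- as.numeric(pred_all)
```

For the model in the question, the analogous call would presumably be predict(basdata, newdata = first, estimator = "BMA")$Ybma, assuming the predict method for bas objects handles a multi-row newdata the same way. It is worth checking that on a small subset against the row-by-row results before running all 150000 rows.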
