繁体   English   中英

在闪亮的应用程序上向DataFrame添加新列

[英]Add new columns to a DataFrame on shiny apps

我正在使用有光泽的应用程序在R上构建一个简单的ML模型,该应用程序的结构为:

1)从本地文件加载数据2)使用加载的数据训练模型3)绘制结果

我的问题是在第2阶段 ,我可以使用以下代码绘制输入数据:

  output$plot1 <- renderPlot({
    ggplot(mydata(), aes(x=LotArea, y=SalePrice)) + geom_point()
  })

但是由于预测值不在原始DF中,因此我需要先添加它们。

我为此使用的代码是:

  obsB <- reactive({


    set.seed(0) 
    xgb_model = train(
      mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),  
      trControl = xgb_trcontrol,
      tuneGrid = xgbGrid,
      method = "xgbTree"
    )

    predicted = predict(xgb_model, mydata()["LotArea"])
    mydata()["predicted"] = predicted


  })

这是我得到的错误:

Warning: Error in FUN: object 'predicted' not found

当我将“ LotArea”更改为“ predicted”时,就会发生这种情况

  output$plot1 <- renderPlot({
    ggplot(mydata(), aes(x=predicted, y=SalePrice)) + geom_point()
  })

这是我的完整代码:

library(shiny)
library(readxl)
library(tidyverse)
library(xgboost)
library(caret)
library(iml)


#### UI


ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      fileInput("file1", "Choose CSV File",
                accept = c(
                  "text/csv",
                  "text/comma-separated-values,text/plain",
                  ".csv")
      ),
      tags$hr(),
      checkboxInput("header", "Header", TRUE)
    ),
    mainPanel(
      #tableOutput("contents"),
      plotOutput("plot1", click = "plot_brush")
    )
  )
)

server <- function(input, output) {
  mydata <- reactive({
    req(input$file1, input$header, file.exists(input$file1$datapath))
    read.csv(input$file1$datapath, header = input$header)
  })


  output$contents <- renderTable({
    req(mydata())
    #mydata()
  })


  ### test
  xgb_trcontrol = trainControl(
    method = "cv",
    number = 5,  
    allowParallel = TRUE,
    verboseIter = FALSE,
    returnData = FALSE
  )


  #I am specifing the same parameters with the same values as I did for Python above. The hyperparameters to optimize are found in the website.
  xgbGrid <- expand.grid(nrounds = c(10,14),  # this is n_estimators in the python code above
                         max_depth = c(10, 15, 20, 25),
                         colsample_bytree = seq(0.5, 0.9, length.out = 5),
                         ## The values below are default values in the sklearn-api. 
                         eta = 0.1,
                         gamma=0,
                         min_child_weight = 1,
                         subsample = 1
  )




  obsB <- reactive({



    set.seed(0) 
    xgb_model = train(
      mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),  
      trControl = xgb_trcontrol,
      tuneGrid = xgbGrid,
      method = "xgbTree"
    )

    predicted = predict(xgb_model, mydata()["LotArea"])
    mydata()["predicted"] = predicted


  })

  output$plot1 <- renderPlot({
    ggplot(mydata(), aes(x=predicted, y=SalePrice)) + geom_point()
  })




  }

shinyApp(ui, server)

编辑:

我变了:

mydata()["predicted"] = predicted

对于:

data =  mydata()
data["predicted"] = predicted

但是知道我得到了另一个错误:

Warning: Error in : You're passing a function as global data.
Have you misspelled the `data` argument in `ggplot()

编辑2:这是我正在使用的数据的示例:

https://drive.google.com/file/d/1R8GA0fW0pOgG8Cpykc8mAThvKOCRCVl0/view?usp=sharing

您不能使用该语法更新反应式值。

您的问题是:

  • 如果使用value = reactive({...})创建无功值,则无法在该代码块之外更改其值
  • 如果要在代码中的多个位置更改反应性元素的值,则需要使用reactiveValreactiveValues函数来创建变量。 可以使用variableName(newValue)语法更改以此方式创建的variableName(newValue)

例如

# we are somewhere inside the server code
mydata = reactiveVal()

observe({
    req(input$file1, input$header, file.exists(input$file1$datapath))
    data = read.csv(input$file1$datapath, header = input$header)
    mydata(data)
})

obsB <- reactive({


    set.seed(0) 
    xgb_model = train(
      mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),  
      trControl = xgb_trcontrol,
      tuneGrid = xgbGrid,
      method = "xgbTree"
    )

    predicted = predict(xgb_model, mydata()["LotArea"])
    newData = mydata()
    newData['predicted'] = predicted
    mydata(newData)
  })
  • 否则,您需要在单个代码块中组合所有更改mydata

请注意,我怀疑上面的代码中可能存在循环,这是由于您在依赖于mydata的代码块中更新了mydata所致。 因为我没有样本数据,所以无法测试它,但是您可能不得不进行isolate或者使用另一个不是mydata触发器来使其正常工作(例如, mydata触发器)

附带说明一下,如果您的应用程序需要数据才能运行,那么提供示例数据对我们更好。 我无法对此进行测试,因为无法轻易猜测输入的外观。 另外,最好使用专门为该问题编写的新代码来隔离问题,而不是将实际项目粘贴到此处,因为这样可以消除数据和程序包的依赖性,并且不会产生与问题无关的干扰

由于您的代码存在其他与问题无关的问题,因此这里有一个带注释的固定版本

library(shiny)
library(readxl)
library(tidyverse)
library(xgboost)
library(caret)
library(iml)


#### UI


ui <- fluidPage(
    sidebarLayout(
        sidebarPanel(
            fileInput("file1", "Choose CSV File",
                      accept = c(
                          "text/csv",
                          "text/comma-separated-values,text/plain",
                          ".csv")
            ),
            tags$hr(),
            checkboxInput("header", "Header", TRUE)
        ),
        mainPanel(
            #tableOutput("contents"),
            plotOutput("plot1", click = "plot_brush")
        )
    )
)

server <- function(input, output) {
    # create mydata as a reactiveVal so that it can be edited everywhere
    mydata = reactiveVal()

    # reactive block is changed with an observe that allows mydata to be updated
    # on change of data
    observe({
        req(input$file1, input$header, file.exists(input$file1$datapath))
        data = read.csv(input$file1$datapath, header = input$header)
        mydata(data)
    })


    output$contents <- renderTable({
        req(mydata())
        #mydata()
    })


    ### test
    xgb_trcontrol = trainControl(
        method = "cv",
        number = 5,
        allowParallel = TRUE,
        verboseIter = FALSE,
        returnData = FALSE
    )


    #I am specifing the same parameters with the same values as I did for Python above. The hyperparameters to optimize are found in the website.
    xgbGrid <- expand.grid(nrounds = c(10,14),  # this is n_estimators in the python code above
                           max_depth = c(10, 15, 20, 25),
                           colsample_bytree = seq(0.5, 0.9, length.out = 5),
                           ## The values below are default values in the sklearn-api.
                           eta = 0.1,
                           gamma=0,
                           min_child_weight = 1,
                           subsample = 1
    )



    # note that obsB reactive variable is gone. if you don't use a 
    # reactive variable, the code block will not be executed.
    # unlike observe blocks, reactive blocks are lazy and should
    # not be relied on for their side effects
    observe({
        # this if ensures you don't run this block before mydata isn't a data frame
        # also prevents it running after it updates mydata. otherwise this will
        # be executed twice. its an invisible problem that'll make it run half
        # as fast unless you debug.
        if ('data.frame' %in% class(mydata()) & !'predicted' %in% names(mydata())){
            set.seed(0)
            xgb_model = train(
                mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),
                trControl = xgb_trcontrol,
                tuneGrid = xgbGrid,
                method = "xgbTree"
            )

            predicted = predict(xgb_model, mydata()["LotArea"])
            data = mydata()
            data["predicted"] = predicted
            mydata(data)
        }





    })

    output$plot1 <- renderPlot({
        data = mydata()
        # this is here to prevent premature triggering of this ggplot.
        # otherwise you'll get the "object not found" error
        if('predicted' %in% names(data)){
            ggplot(mydata(), aes(x=predicted, y=SalePrice)) + geom_point()
        }
    })




}

shinyApp(ui, server)

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM