简体   繁体   中英

Add new columns to a DataFrame on shiny apps

I'm working on a simple ML model on R using shiny apps, the structure of the app will be:

1) Load the data from a local file 2) Train a model with the loaded data 3) Plot the results

The my problem is at stage 2 , I'm able to plot the input data with this code:

  output$plot1 <- renderPlot({
    ggplot(mydata(), aes(x=LotArea, y=SalePrice)) + geom_point()
  })

But since the predicted values are not in the original DF, I need to add them first.

The code I'm using for that is:

  obsB <- reactive({


    set.seed(0) 
    xgb_model = train(
      mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),  
      trControl = xgb_trcontrol,
      tuneGrid = xgbGrid,
      method = "xgbTree"
    )

    predicted = predict(xgb_model, mydata()["LotArea"])
    mydata()["predicted"] = predicted


  })

This is the error I'm getting:

Warning: Error in FUN: object 'predicted' not found

This happends when I change "LotArea" for "predicted"

  output$plot1 <- renderPlot({
    ggplot(mydata(), aes(x=predicted, y=SalePrice)) + geom_point()
  })

This is the complete code I have:

library(shiny)
library(readxl)
library(tidyverse)
library(xgboost)
library(caret)
library(iml)


#### UI


ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      fileInput("file1", "Choose CSV File",
                accept = c(
                  "text/csv",
                  "text/comma-separated-values,text/plain",
                  ".csv")
      ),
      tags$hr(),
      checkboxInput("header", "Header", TRUE)
    ),
    mainPanel(
      #tableOutput("contents"),
      plotOutput("plot1", click = "plot_brush")
    )
  )
)

server <- function(input, output) {
  mydata <- reactive({
    req(input$file1, input$header, file.exists(input$file1$datapath))
    read.csv(input$file1$datapath, header = input$header)
  })


  output$contents <- renderTable({
    req(mydata())
    #mydata()
  })


  ### test
  xgb_trcontrol = trainControl(
    method = "cv",
    number = 5,  
    allowParallel = TRUE,
    verboseIter = FALSE,
    returnData = FALSE
  )


  #I am specifing the same parameters with the same values as I did for Python above. The hyperparameters to optimize are found in the website.
  xgbGrid <- expand.grid(nrounds = c(10,14),  # this is n_estimators in the python code above
                         max_depth = c(10, 15, 20, 25),
                         colsample_bytree = seq(0.5, 0.9, length.out = 5),
                         ## The values below are default values in the sklearn-api. 
                         eta = 0.1,
                         gamma=0,
                         min_child_weight = 1,
                         subsample = 1
  )




  obsB <- reactive({



    set.seed(0) 
    xgb_model = train(
      mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),  
      trControl = xgb_trcontrol,
      tuneGrid = xgbGrid,
      method = "xgbTree"
    )

    predicted = predict(xgb_model, mydata()["LotArea"])
    mydata()["predicted"] = predicted


  })

  output$plot1 <- renderPlot({
    ggplot(mydata(), aes(x=predicted, y=SalePrice)) + geom_point()
  })




  }

shinyApp(ui, server)

EDIT:

I changed:

mydata()["predicted"] = predicted

for:

data =  mydata()
data["predicted"] = predicted

But Know I get a different error:

Warning: Error in : You're passing a function as global data.
Have you misspelled the `data` argument in `ggplot()

EDIT 2: This is a sample of the data I'm using:

https://drive.google.com/file/d/1R8GA0fW0pOgG8Cpykc8mAThvKOCRCVl0/view?usp=sharing

You cannot update a reactive value using that syntax.

Your problems are:

  • If you create a reactive value using the value = reactive({...}) you cannot change its value outside that code block
  • If you want to be able to change the value of a reactive element in more than one place within the code you need to use reactiveVal or reactiveValues functions to create the variables. Variables created in this manner can be changed using the variableName(newValue) syntax.

For instance

# we are somewhere inside the server code
mydata = reactiveVal()

observe({
    req(input$file1, input$header, file.exists(input$file1$datapath))
    data = read.csv(input$file1$datapath, header = input$header)
    mydata(data)
})

obsB <- reactive({


    set.seed(0) 
    xgb_model = train(
      mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),  
      trControl = xgb_trcontrol,
      tuneGrid = xgbGrid,
      method = "xgbTree"
    )

    predicted = predict(xgb_model, mydata()["LotArea"])
    newData = mydata()
    newData['predicted'] = predicted
    mydata(newData)
  })
  • Otherwise you need to combine everything that changes mydata in that single code block.

Note that I suspect in the code above there may be a loop due to you updating mydata in a code block that depends on mydata . Couldn't test it since I don't have sample data but you may have to play around with isolate or have another trigger that is not mydata to make it work (eg. triggers of the mydata )

As a side note, if your app requires data to function, it is better for us if you provide a sample data. I can't test this as I can't easily guess what the input should look like. Also it is often better to use new code specifically written for the question that isolates the issue rather than paste your actual project here since you can get rid of the data and package dependencies and there won't be distractions that are not related to the problem

Since your code other issues that is not related to the question, here's an annotated and fixed version

library(shiny)
library(readxl)
library(tidyverse)
library(xgboost)
library(caret)
library(iml)


#### UI


ui <- fluidPage(
    sidebarLayout(
        sidebarPanel(
            fileInput("file1", "Choose CSV File",
                      accept = c(
                          "text/csv",
                          "text/comma-separated-values,text/plain",
                          ".csv")
            ),
            tags$hr(),
            checkboxInput("header", "Header", TRUE)
        ),
        mainPanel(
            #tableOutput("contents"),
            plotOutput("plot1", click = "plot_brush")
        )
    )
)

server <- function(input, output) {
    # create mydata as a reactiveVal so that it can be edited everywhere
    mydata = reactiveVal()

    # reactive block is changed with an observe that allows mydata to be updated
    # on change of data
    observe({
        req(input$file1, input$header, file.exists(input$file1$datapath))
        data = read.csv(input$file1$datapath, header = input$header)
        mydata(data)
    })


    output$contents <- renderTable({
        req(mydata())
        #mydata()
    })


    ### test
    xgb_trcontrol = trainControl(
        method = "cv",
        number = 5,
        allowParallel = TRUE,
        verboseIter = FALSE,
        returnData = FALSE
    )


    #I am specifing the same parameters with the same values as I did for Python above. The hyperparameters to optimize are found in the website.
    xgbGrid <- expand.grid(nrounds = c(10,14),  # this is n_estimators in the python code above
                           max_depth = c(10, 15, 20, 25),
                           colsample_bytree = seq(0.5, 0.9, length.out = 5),
                           ## The values below are default values in the sklearn-api.
                           eta = 0.1,
                           gamma=0,
                           min_child_weight = 1,
                           subsample = 1
    )



    # note that obsB reactive variable is gone. if you don't use a 
    # reactive variable, the code block will not be executed.
    # unlike observe blocks, reactive blocks are lazy and should
    # not be relied on for their side effects
    observe({
        # this if ensures you don't run this block before mydata isn't a data frame
        # also prevents it running after it updates mydata. otherwise this will
        # be executed twice. its an invisible problem that'll make it run half
        # as fast unless you debug.
        if ('data.frame' %in% class(mydata()) & !'predicted' %in% names(mydata())){
            set.seed(0)
            xgb_model = train(
                mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),
                trControl = xgb_trcontrol,
                tuneGrid = xgbGrid,
                method = "xgbTree"
            )

            predicted = predict(xgb_model, mydata()["LotArea"])
            data = mydata()
            data["predicted"] = predicted
            mydata(data)
        }





    })

    output$plot1 <- renderPlot({
        data = mydata()
        # this is here to prevent premature triggering of this ggplot.
        # otherwise you'll get the "object not found" error
        if('predicted' %in% names(data)){
            ggplot(mydata(), aes(x=predicted, y=SalePrice)) + geom_point()
        }
    })




}

shinyApp(ui, server)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM