簡體   English   中英

在閃亮的應用程序上向DataFrame添加新列

[英]Add new columns to a DataFrame on shiny apps

我正在使用有光澤的應用程序在R上構建一個簡單的ML模型,該應用程序的結構為:

1)從本地文件加載數據2)使用加載的數據訓練模型3)繪制結果

我的問題是在第2階段 ,我可以使用以下代碼繪制輸入數據:

  output$plot1 <- renderPlot({
    ggplot(mydata(), aes(x=LotArea, y=SalePrice)) + geom_point()
  })

但是由於預測值不在原始DF中,因此我需要先添加它們。

我為此使用的代碼是:

  obsB <- reactive({


    set.seed(0) 
    xgb_model = train(
      mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),  
      trControl = xgb_trcontrol,
      tuneGrid = xgbGrid,
      method = "xgbTree"
    )

    predicted = predict(xgb_model, mydata()["LotArea"])
    mydata()["predicted"] = predicted


  })

這是我得到的錯誤:

Warning: Error in FUN: object 'predicted' not found

當我將“ LotArea”更改為“ predicted”時,就會發生這種情況

  output$plot1 <- renderPlot({
    ggplot(mydata(), aes(x=predicted, y=SalePrice)) + geom_point()
  })

這是我的完整代碼:

library(shiny)
library(readxl)
library(tidyverse)
library(xgboost)
library(caret)
library(iml)


#### UI


ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      fileInput("file1", "Choose CSV File",
                accept = c(
                  "text/csv",
                  "text/comma-separated-values,text/plain",
                  ".csv")
      ),
      tags$hr(),
      checkboxInput("header", "Header", TRUE)
    ),
    mainPanel(
      #tableOutput("contents"),
      plotOutput("plot1", click = "plot_brush")
    )
  )
)

server <- function(input, output) {
  mydata <- reactive({
    req(input$file1, input$header, file.exists(input$file1$datapath))
    read.csv(input$file1$datapath, header = input$header)
  })


  output$contents <- renderTable({
    req(mydata())
    #mydata()
  })


  ### test
  xgb_trcontrol = trainControl(
    method = "cv",
    number = 5,  
    allowParallel = TRUE,
    verboseIter = FALSE,
    returnData = FALSE
  )


  #I am specifing the same parameters with the same values as I did for Python above. The hyperparameters to optimize are found in the website.
  xgbGrid <- expand.grid(nrounds = c(10,14),  # this is n_estimators in the python code above
                         max_depth = c(10, 15, 20, 25),
                         colsample_bytree = seq(0.5, 0.9, length.out = 5),
                         ## The values below are default values in the sklearn-api. 
                         eta = 0.1,
                         gamma=0,
                         min_child_weight = 1,
                         subsample = 1
  )




  obsB <- reactive({



    set.seed(0) 
    xgb_model = train(
      mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),  
      trControl = xgb_trcontrol,
      tuneGrid = xgbGrid,
      method = "xgbTree"
    )

    predicted = predict(xgb_model, mydata()["LotArea"])
    mydata()["predicted"] = predicted


  })

  output$plot1 <- renderPlot({
    ggplot(mydata(), aes(x=predicted, y=SalePrice)) + geom_point()
  })




  }

shinyApp(ui, server)

編輯:

我變了:

mydata()["predicted"] = predicted

對於:

data =  mydata()
data["predicted"] = predicted

但是知道我得到了另一個錯誤:

Warning: Error in : You're passing a function as global data.
Have you misspelled the `data` argument in `ggplot()

編輯2:這是我正在使用的數據的示例:

https://drive.google.com/file/d/1R8GA0fW0pOgG8Cpykc8mAThvKOCRCVl0/view?usp=sharing

您不能使用該語法更新反應式值。

您的問題是:

  • 如果使用value = reactive({...})創建無功值,則無法在該代碼塊之外更改其值
  • 如果要在代碼中的多個位置更改反應性元素的值,則需要使用reactiveValreactiveValues函數來創建變量。 可以使用variableName(newValue)語法更改以此方式創建的variableName(newValue)

例如

# we are somewhere inside the server code
mydata = reactiveVal()

observe({
    req(input$file1, input$header, file.exists(input$file1$datapath))
    data = read.csv(input$file1$datapath, header = input$header)
    mydata(data)
})

obsB <- reactive({


    set.seed(0) 
    xgb_model = train(
      mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),  
      trControl = xgb_trcontrol,
      tuneGrid = xgbGrid,
      method = "xgbTree"
    )

    predicted = predict(xgb_model, mydata()["LotArea"])
    newData = mydata()
    newData['predicted'] = predicted
    mydata(newData)
  })
  • 否則,您需要在單個代碼塊中組合所有更改mydata

請注意,我懷疑上面的代碼中可能存在循環,這是由於您在依賴於mydata的代碼塊中更新了mydata所致。 因為我沒有樣本數據,所以無法測試它,但是您可能不得不進行isolate或者使用另一個不是mydata觸發器來使其正常工作(例如, mydata觸發器)

附帶說明一下,如果您的應用程序需要數據才能運行,那么提供示例數據對我們更好。 我無法對此進行測試,因為無法輕易猜測輸入的外觀。 另外,最好使用專門為該問題編寫的新代碼來隔離問題,而不是將實際項目粘貼到此處,因為這樣可以消除數據和程序包的依賴性,並且不會產生與問題無關的干擾

由於您的代碼存在其他與問題無關的問題,因此這里有一個帶注釋的固定版本

library(shiny)
library(readxl)
library(tidyverse)
library(xgboost)
library(caret)
library(iml)


#### UI


ui <- fluidPage(
    sidebarLayout(
        sidebarPanel(
            fileInput("file1", "Choose CSV File",
                      accept = c(
                          "text/csv",
                          "text/comma-separated-values,text/plain",
                          ".csv")
            ),
            tags$hr(),
            checkboxInput("header", "Header", TRUE)
        ),
        mainPanel(
            #tableOutput("contents"),
            plotOutput("plot1", click = "plot_brush")
        )
    )
)

server <- function(input, output) {
    # create mydata as a reactiveVal so that it can be edited everywhere
    mydata = reactiveVal()

    # reactive block is changed with an observe that allows mydata to be updated
    # on change of data
    observe({
        req(input$file1, input$header, file.exists(input$file1$datapath))
        data = read.csv(input$file1$datapath, header = input$header)
        mydata(data)
    })


    output$contents <- renderTable({
        req(mydata())
        #mydata()
    })


    ### test
    xgb_trcontrol = trainControl(
        method = "cv",
        number = 5,
        allowParallel = TRUE,
        verboseIter = FALSE,
        returnData = FALSE
    )


    #I am specifing the same parameters with the same values as I did for Python above. The hyperparameters to optimize are found in the website.
    xgbGrid <- expand.grid(nrounds = c(10,14),  # this is n_estimators in the python code above
                           max_depth = c(10, 15, 20, 25),
                           colsample_bytree = seq(0.5, 0.9, length.out = 5),
                           ## The values below are default values in the sklearn-api.
                           eta = 0.1,
                           gamma=0,
                           min_child_weight = 1,
                           subsample = 1
    )



    # note that obsB reactive variable is gone. if you don't use a 
    # reactive variable, the code block will not be executed.
    # unlike observe blocks, reactive blocks are lazy and should
    # not be relied on for their side effects
    observe({
        # this if ensures you don't run this block before mydata isn't a data frame
        # also prevents it running after it updates mydata. otherwise this will
        # be executed twice. its an invisible problem that'll make it run half
        # as fast unless you debug.
        if ('data.frame' %in% class(mydata()) & !'predicted' %in% names(mydata())){
            set.seed(0)
            xgb_model = train(
                mydata()["LotArea"], as.vector(t(mydata()["SalePrice"])),
                trControl = xgb_trcontrol,
                tuneGrid = xgbGrid,
                method = "xgbTree"
            )

            predicted = predict(xgb_model, mydata()["LotArea"])
            data = mydata()
            data["predicted"] = predicted
            mydata(data)
        }





    })

    output$plot1 <- renderPlot({
        data = mydata()
        # this is here to prevent premature triggering of this ggplot.
        # otherwise you'll get the "object not found" error
        if('predicted' %in% names(data)){
            ggplot(mydata(), aes(x=predicted, y=SalePrice)) + geom_point()
        }
    })




}

shinyApp(ui, server)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM