簡體   English   中英

R Shiny:創建因子變量並定義水平

[英]R Shiny: Creating factor variables and defining levels

我試圖用 Shiny 創建一個機器學習應用程序。

在這個應用程序中,用戶可以選擇輸入變量的規格(通過輸入小部件),這些參數將用於給出響應變量的估計。 為此,我從選定的輸入創建了一個數據框,並將其保存在本地作為數據表。

當我再次加載數據表時出現問題,因為這將導致所有分類變量成為字符變量。 但是,我可以使用factor()函數手動更改這些並使用levels=規范。

問題是我不想每次使用新數據集時都手動執行此操作,因為這很可能會更改數據集中分類變量的位置。 新數據集中的分類變量也很可能不會相同。

數據框"DATA"是主數據集,其中包含第 1 列中的響應變量。

數據框"test"是根據所選輸入構建的數據框,它將用作預測的測試集,並且僅包含 1 個指定的觀察。 由於數據幀的構造方式,此數據幀將始終將響應變量作為數據幀中的最后一列。 因此DATA[ ,5]的因子變量將始終對應於測試數據幀中的前一列: test[ ,4]

這是需要指定因子水平的測試數據框,因為當它只包含 1 個觀察時,它不會自動知道類別的數量。

test[4] <- factor(test[4], levels = unique(DATA[,5]))

我正在嘗試編寫一個代碼,該代碼將對數據集中的所有字符變量應用factor函數並指定levels而不管字符變量在數據集中的位置。

這是我到目前為止編寫的代碼:

library(shiny)
library(tidyverse)
library(shinythemes)
library(data.table)
library(RCurl)
library(randomForest)
library(mlbench)
library(janitor)


# Read data
DATA <- BostonHousing

# Rearrange data so the response variable is located in column 1
DATA <- DATA[,c(names(BostonHousing)[14],names(BostonHousing)[-14])]

# Creating a simple RF model
model <- randomForest(medv ~ ., data = DATA, ntree = 500, mtry = 4, importance = TRUE)


# UI -------------------------------------------------------------------------
ui <- fluidPage(
  
  sidebarPanel(
    
    h3("Parameters Selected"),
    br(),
    tableOutput('show_inputs'),
    hr(),
    actionButton("submitbutton", label = "calculate", class = "btn btn-primary", icon("calculator")),
    hr(),
    tableOutput("tabledata")

    
  ), # End sidebarPanel
  
  mainPanel(
    
    h3("Input widgets"),
    uiOutput("select")
    
  ) # End mainPanel
  
) # End UI bracket



# Server -------------------------------------------------------------------------
server <- function(input, output, session) {
  
# Create input widgets from dataset  
  output$select <- renderUI({
    df <- req(DATA)
    tagList(map(
      names(df[-1]),
      ~ ifelse(is.numeric(df[[.]]),
               yes = tagList(sliderInput(
                 inputId = paste0(.),
                 label = .,
                 value = mean(df[[.]], na.rm = TRUE),
                 min = round(min(df[[.]], na.rm = TRUE),2),
                 max = round(max(df[[.]], na.rm = TRUE),2)
               )),
               no = tagList(selectInput(
                 inputId = paste0(.),
                 label = .,
                 choices = sort(unique(df[[.]])),
                 selected = sort(unique(df[[.]]))[1],
               ))
      )
    ))
  })
  

# creating dataframe of selected values to be displayed
  AllInputs <- reactive({
    id_exclude <- c("savebutton","submitbutton")
    id_include <- setdiff(names(input), id_exclude)
    
    if (length(id_include) > 0) {
      myvalues <- NULL
      for(i in id_include) {
        myvalues <- as.data.frame(rbind(myvalues, cbind(i, input[[i]])))
        
      }
      names(myvalues) <- c("Variable", "Selected Value")
      myvalues %>% 
        slice(match(names(DATA[,-1]), Variable))
    }
  })

  
# render table of selected values to be displayed
  output$show_inputs <- renderTable({
    AllInputs()
  })
  
 
# Creating a dataframe for calculating a prediction
  datasetInput <- reactive({  
    
    df1 <- data.frame(AllInputs(), stringsAsFactors = FALSE)
    input <- transpose(rbind(df1, names(DATA[1])))

    write.table(input,"input.csv", sep=",", quote = FALSE, row.names = FALSE, col.names = FALSE)
    test <- read.csv(paste("input.csv", sep=""), header = TRUE)
    
    
# defining factor levels for factor variables
    test[4] <- factor(test[4], levels = unique(DATA[,5])) # <- This line will cause problems if multiple factors in dataset or if different column location
   

# Making the actual prediction and store it in a data.frame     
    Prediction <- predict(model,test)
    Output <- data.frame("Prediction"=Prediction)
    print(format(Output, nsmall=2, big.mark=","))
  })
  

# display the prediction when the submit button is pressed
  output$tabledata <- renderTable({
    if (input$submitbutton>0) { 
      isolate(datasetInput()) 
    } 
  })
  
  
} # End server bracket



# ShinyApp -------------------------------------------------------------------------
shinyApp(ui, server)

要概括因子變量,您可以使用以下代碼:

# defining factor levels for factor variables
#test[4] <- factor(test[4], levels = unique(DATA[,5])) # <- This line will cause problems if multiple factors in dataset or if different column location

cnames <- colnames(DATA[sapply(DATA,class)=="factor"])
if (length(cnames)>0){
  lapply(cnames, function(par) {
    test[par] <<- factor(test[par], levels = unique(DATA[,par]))
  })
}

您可以將其應用於 BostonHousing2 數據,如下所示

# Read data
BH <- BostonHousing2
DATA <- BH

# Rearrange data so the response variable is located in column 1
#DATA <- DATA[,c(names(BH)[14],names(BH)[-14])]
DATA <- DATA[,c(names(BH)[5],names(BH)[-5])]   ## for BostonHousing2

# Creating a simple RF model
model <- randomForest(medv ~ ., data = DATA[,-2], ntree = 500, mtry = 4, importance = TRUE)


# UI -------------------------------------------------------------------------
ui <- fluidPage(
  
  sidebarPanel(
    
    h3("Parameters Selected"),
    br(),
    tableOutput('show_inputs'),
    hr(),
    actionButton("submitbutton", label = "calculate", class = "btn btn-primary", icon("calculator")),
    hr(),
    tableOutput("tabledata")
    
  ), # End sidebarPanel
  
  mainPanel(
    
    h3("Input widgets"),
    uiOutput("select")
    
  ) # End mainPanel
  
) # End UI bracket


# Server -------------------------------------------------------------------------
server <- function(input, output, session) {
  
  # Create input widgets from dataset  
  output$select <- renderUI({
    df <- req(DATA)
    tagList(map(
      names(df[-1]),
      ~ ifelse(is.numeric(df[[.]]),
               yes = tagList(sliderInput(
                 inputId = paste0(.),
                 label = .,
                 value = mean(df[[.]], na.rm = TRUE),
                 min = round(min(df[[.]], na.rm = TRUE),2),
                 max = round(max(df[[.]], na.rm = TRUE),2)
               )),
               no = tagList(selectInput(
                 inputId = paste0(.),
                 label = .,
                 choices = sort(unique(df[[.]])),
                 selected = sort(unique(df[[.]]))[1],
               ))
      )
    ))
  })
  
  
  # creating dataframe of selected values to be displayed
  AllInputs <- reactive({
    id_exclude <- c("savebutton","submitbutton")
    id_include <- setdiff(names(input), id_exclude)
    
    if (length(id_include) > 0) {
      myvalues <- NULL
      for(i in id_include) {
        myvalues <- as.data.frame(rbind(myvalues, cbind(i, input[[i]])))
        
      }
      names(myvalues) <- c("Variable", "Selected Value")
      myvalues %>% 
        slice(match(names(DATA[,-1]), Variable))
    }
  })
  
  
  # render table of selected values to be displayed
  output$show_inputs <- renderTable({
    AllInputs()
  })
  
  
  # Creating a dataframe for calculating a prediction
  datasetInput <- reactive({  
    
    df1 <- data.frame(AllInputs(), stringsAsFactors = FALSE)
    input <- transpose(rbind(df1, names(DATA[1])))
    
    write.table(input,"input.csv", sep=",", quote = FALSE, row.names = FALSE, col.names = FALSE)
    test <- read.csv(paste("input.csv", sep=""), header = TRUE)
    
    
    # defining factor levels for factor variables
    #test[4] <- factor(test[4], levels = unique(DATA[,5])) # <- This line will cause problems if multiple factors in dataset or if different column location

    cnames <- colnames(DATA[sapply(DATA,class)=="factor"])
    if (length(cnames)>0){
      lapply(cnames, function(par) {
        test[par] <<- factor(test[par], levels = unique(DATA[,par]))
      })
    }
    
    # Making the actual prediction and store it in a data.frame     
    Prediction <- predict(model,test)
    Output <- data.frame("Prediction"=Prediction)
    print(format(Output, nsmall=2, big.mark=","))
  })
  
  # display the prediction when the submit button is pressed
  output$tabledata <- renderTable({
    if (input$submitbutton>0) { 
      isolate(datasetInput()) 
    } 
  })
  
} # End server bracket

# ShinyApp -------------------------------------------------------------------------
shinyApp(ui, server)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM