
R Shiny - cache big dataframe

I'm quite new to Shiny, so my apologies if this is an easy question. I searched Google and Stack Overflow but couldn't find a simple, helpful answer so far.

My goal/issue: I'm building a Shiny page that displays a table with hundreds of thousands of rows. The data is sourced from different databases, manipulated, cleaned, and displayed to all users on request.

Problem 1: loading all the data takes almost 5 minutes.

Problem 2: if user1 requests this data at 8:00 am and user2 requests the same data at 8:05 am, two separate queries are launched and two separate chunks of memory are used to show exactly the same data to two different users.

So the question is: should I use a cache system to improve this? If not, what else should I use? I found a lot of official Shiny documentation on caching plots, but nothing related to caching data (which I found quite surprising).

Other useful information: the cached data should be deleted every evening around 10 pm, since new data becomes available the next day / early morning.

Code:

library(shiny)
library(shinydashboard)

ui <- dashboardPage(  # https://rstudio.github.io/shinydashboard/structure.html
  title = "Dashboard",
  dashboardHeader(title = "Angelo's Board"),
  dashboardSidebar(   # everything displayed on the left-hand side goes in here
    includeCSS("www/styles.css"),  # external stylesheet, optional

    sidebarMenu(
      menuItem("menu 1", tabName = "menu1", icon = icon("th"),
               menuItem("Data 1", tabName = "tab_data1"))
    )
  ),

  dashboardBody(
    tabItems(
      # the heading and table must sit inside the tabItem,
      # otherwise they are not rendered on that tab
      tabItem(tabName = "tab_data1",
              h3("Page with big table"),
              fluidRow(dataTableOutput("main_table")))
    )
  )
)


server <- function(input, output, session) {

  # the output ID must match dataTableOutput("main_table") in the ui
  output$main_table <- renderDataTable({
    data.frame(names = c("Mark", "George", "Mary"), age = c(30, 40, 35))
  })

}

cat("\nLaunching 'shinyApp' ....")
shinyApp(ui, server)


Any help would be much appreciated. Thanks

I would break the bulk of your ETL process out into a separate R script and schedule that script with cron. The script writes the processed dataframe(s) out to a .feather file, and your Shiny app then just loads the feather file(s). Feather is optimized for reading, so it should be fast.

For example, take the necessary libraries and code out of your server.R (or app.R) file and create a new R script called query.R. That script performs all the ETL operations and finally writes your data out to a .feather file (this requires the feather package). Then create a crontab entry to run that script as often as needed; a sketch follows.
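A minimal sketch of what query.R might look like; the output path is an assumption, and the toy data frame stands in for your real database queries and cleaning steps:

# query.R -- standalone ETL script, run by cron outside of Shiny
library(feather)

# Placeholder for the real ETL: query the databases, manipulate, clean.
df <- data.frame(names = c("Mark", "George", "Mary"),
                 age   = c(30, 40, 35))

# Write the finished data frame to a feather file the Shiny app can read.
# The path is illustrative; point it at a directory the app can access.
write_feather(df, "/srv/shiny-server/myapp/data/processed.feather")

A matching crontab entry (time and paths are illustrative) could rebuild the file every morning before users arrive:

0 6 * * * Rscript /srv/shiny-server/myapp/query.R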

Your server.R script then just needs to read in that feather file when the app loads, and you should see a significant performance improvement. In addition, you can have the query.R script run during off hours so that performance on the Linux box isn't negatively impacted.
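On the app side, the load collapses to a single fast read. A sketch, using the same assumed path as above; note that code placed outside the server function runs once per R process, so all sessions served by that process share one copy of the data:

library(shiny)
library(feather)

# Read once at app startup, not inside a reactive, so every session
# served by this R process reuses the same data frame.
big_df <- read_feather("/srv/shiny-server/myapp/data/processed.feather")

server <- function(input, output, session) {
  output$main_table <- renderDataTable({
    big_df
  })
}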

Another option: load this dataframe in global.R and change /etc/shiny-server/shiny-server.conf by adding "app_idle_timeout 0" after "location / {". This disables application idle timeouts in Shiny Server, so the data loaded by global.R stays in RAM and is shared by all users.
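The relevant part of /etc/shiny-server/shiny-server.conf would then look roughly like this (a sketch based on the default config; only the app_idle_timeout line is new):

run_as shiny;
server {
  listen 3838;
  location / {
    app_idle_timeout 0;   # never unload idle apps; keeps global.R data in RAM
    site_dir /srv/shiny-server;
    log_dir /var/log/shiny-server;
  }
}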

To spare the first user the long initial load, you can add "@reboot wget -O index.html localhost:3838" to cron on your server, so that after every reboot global.R is loaded into memory automatically.
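That warm-up entry could look like this in the crontab (edit it with crontab -e; the port assumes Shiny Server's default of 3838):

# hit the app once at boot so Shiny Server starts it and sources global.R
@reboot wget -O index.html localhost:3838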

You can also read more about organising a pre-cache here.
