
R programming: ggvis histogram versus hist - how to size the buckets and define x-axis spacing (ticks)

I am learning to use ggvis and wanted to understand how to create the equivalent histogram to that produced by hist. Specifically, how do you set the bin widths and upper and lower bounds of x in ggvis histograms? What am I missing?

Question: How do I get the ggvis histogram output to match the hist output?

Let me provide an example:

library(psych)   # describe()
library(RCurl)   # getURL()
library(ggvis)

# Download the data only if it is not already in the session
if (!exists("impact")) {
  url <- "https://dl.dropboxusercontent.com/u/8272421/stat/stat_one.txt"
  myCsv <- getURL(url, ssl.verifypeer = FALSE)
  impact <- read.csv(textConnection(myCsv), sep = "\t")
  impact$subject <- factor(impact$subject)
}

describe(impact)

hist(impact$verbal_memory_baseline, 
     main = "Distribution of verbal memory baseline scores", 
     xlab = "score", ylab = "frequency")

[Image: example hist output]

OK, let's try to reproduce this with ggvis. The output does not match:

impact %>%
  ggvis(x = ~verbal_memory_baseline, fill := "white") %>%
  layer_histograms(width = 5) %>%
  add_axis("x", title = "score") %>%
  add_axis("y", title = "frequency")

[Image: ggvis histogram output]

How do I get the ggvis output to match the hist output?


> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.2 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] psych_1.5.6      knitr_1.11       ggvis_0.4.2.9000 setwidth_1.0-4  colorout_1.1-1   vimcom_1.2-3    

loaded via a namespace (and not attached):
[1] Rcpp_0.12.0          digest_0.6.8         dplyr_0.4.3.9000     assertthat_0.1       mime_0.3            
[6] R6_2.1.1             jsonlite_0.9.16      xtable_1.7-4         DBI_0.3.1            magrittr_1.5        
[11] lazyeval_0.1.10.9000 rstudioapi_0.3.1     rmarkdown_0.7        tools_3.2.2          shiny_0.12.2        
[16] httpuv_1.3.3         yaml_2.1.13          parallel_3.2.2       rsconnect_0.4.1.4    mnormt_1.5-3        
[21] htmltools_0.2.6

Try:

impact %>%
  ggvis( x = ~verbal_memory_baseline, fill := "white") %>%
  layer_histograms(width = 5, boundary = 5) %>% 
  add_axis("y", title = "frequency") %>%
  add_axis("x", title = "score", ticks = 5)

Which gives:

[Image: resulting ggvis histogram]


The official documentation is a bit cryptic about how boundary and center work. Have a look at DataCamp's "How to Make a Histogram with ggvis in R":

The width argument already set the bin width to 5, but where do bins start and where do they end? You can use the center or boundary argument for this. center should refer to one of the bins' center values, which automatically determines the other bins' locations. The boundary argument specifies the boundary value of one of the bins. Here again, specifying a single value fixes the location of all bins. As these two arguments specify the same thing in a different way, you should set at most one of center or boundary.
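The quoted rule can be checked with plain arithmetic. The sketch below is base R only; bin_breaks() is a hypothetical helper written for illustration, not a ggvis function, and the scores are made-up values rather than the impact data. It assumes that a center c is equivalent to a boundary at c - width / 2, which is what the quote describes.

```r
# Sketch: one anchor value plus a width fixes every bin edge.
# bin_breaks() is a hypothetical helper, not part of ggvis.
bin_breaks <- function(x, width, boundary = NULL, center = NULL) {
  if (!is.null(boundary) && !is.null(center)) {
    stop("set at most one of 'center' or 'boundary'")
  }
  # A bin centered at c has an edge at c - width / 2.
  if (!is.null(center)) boundary <- center - width / 2
  if (is.null(boundary)) boundary <- 0
  # Shift the anchor by whole widths until the edges cover the data.
  lo <- boundary + floor((min(x) - boundary) / width) * width
  hi <- boundary + ceiling((max(x) - boundary) / width) * width
  seq(lo, hi, by = width)
}

scores <- c(76, 81, 84, 88, 89, 92, 93, 97)  # made-up values

by_boundary <- bin_breaks(scores, width = 5, boundary = 5)
by_center   <- bin_breaks(scores, width = 5, center = 77.5)

print(by_boundary)
print(isTRUE(all.equal(by_boundary, by_center)))
```

With these inputs both calls produce the same edges, which is why boundary = 5 and center = 77.5 describe the same binning when width = 5.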


If you want the same result using center instead of boundary, try:

impact %>%
  ggvis( x = ~verbal_memory_baseline, fill := "white") %>%
  layer_histograms(width = 5, center = 77.5) %>% 
  add_axis("y", title = "frequency") %>%
  add_axis("x", title = "score", ticks = 5)

Here you specify the center of one bin (77.5), and that determines all the others automatically.
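If it is not obvious which width and anchor hist used, one option is to read them off the object that hist returns. This is a rough sketch in base R; the scores vector is made up and stands in for impact$verbal_memory_baseline.

```r
# Made-up scores standing in for impact$verbal_memory_baseline.
scores <- c(76, 81, 84, 88, 89, 92, 93, 97)

# plot = FALSE returns the histogram object without drawing it.
h <- hist(scores, plot = FALSE)

width    <- unique(diff(h$breaks))  # spacing between consecutive breaks
boundary <- h$breaks[1]             # any break can serve as the boundary
center   <- h$mids[1]               # any midpoint can serve as the center

print(h$breaks)
print(c(width = width, boundary = boundary, center = center))
```

The width and either boundary or center (not both) can then be passed to layer_histograms(), as in the calls above.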

Steven's answer is correct.

His pointers helped me read the documentation more closely:

layer_histograms():

http://www.rdocumentation.org/packages/ggvis/functions/layer_histograms

Boundary

  • A boundary between two bins. As with center, things are shifted when boundary is outside the range of the data. For example, to center on integers, use width = 1 and boundary = 0.5, even if 1 is outside the range of the data. At most one of center and boundary may be specified.

add_axis()

http://www.rdocumentation.org/packages/ggvis/functions/add_axis

ticks

  • A desired number of ticks. The resulting number may be different so that values are "nice" (multiples of 2, 5, 10) and lie within the underlying scale's range.
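The idea that a requested tick count is only a hint can be illustrated with base R's pretty(). This is an analogy, not what ggvis runs: ggvis hands tick placement to its rendering layer, which may choose different values. The 75 to 100 range below is an assumed example range, not taken from the data.

```r
# Analogy only: pretty() is base R and is not called by ggvis.
# n is a requested number of intervals; the result is rounded to
# "nice" values, so the count returned can differ from the request.
rng <- c(75, 100)  # assumed axis range for illustration

for (n in c(3, 5, 7)) {
  ticks <- pretty(rng, n = n)
  cat("requested", n, "-> got", length(ticks), "ticks:", ticks, "\n")
}
```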
