Time-series histogram

Is it possible to create a time-series histogram like the one described in this presentation (slides 36-39) using either R or D3.js? Or is there a better way to show bucketed data as a time series?

Edit: Here is some pre-bucketed sample data. Ideally, D3 or R would do the bucketing by itself. And yes, if it wasn't clear, I understand that I could write this myself. I'm just wondering if there's already a package that does this and I just haven't come across it yet. Thanks!

Here's a version in D3, modeled after @bdemarest's answer using ggplot2:

(Image: D3 heatmap)

This version uses tiled rect elements. If you have a large dataset, you might get better performance from a pixel-based heatmap.

If you want to compute the buckets using D3, you can use d3.nest to group the data by day and by value. There's also d3.layout.histogram, but since you presumably want uniformly-spaced bins and the same bins for every day, d3.nest should be sufficient.

One subtle consideration: I placed the tick marks on the scale in-between tiles so as to indicate visually how the values are binned. For example, the bottom-left bucket corresponds to all values between 800 and 900 on July 20 (where July 20 is the midnight-to-midnight interval); at least, that's what I assumed from looking at your data. This is slightly clearer than labeling the middle of the rect because it indicates that the values are floored rather than rounded.
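For readers who would rather have R do the bucketing, the floor-based binning described above can be sketched in base R. The `raw` data frame, its column names, and its values are made up for illustration, and the UTC time zone is an assumption; only the 100-wide bins and the midnight-to-midnight day interval come from the discussion above.

```r
# Hypothetical raw observations: one row per measurement (values are made up).
raw <- data.frame(
  time  = as.POSIXct(c("2012-07-20 01:15:00", "2012-07-20 13:40:00",
                       "2012-07-20 22:05:00", "2012-07-21 09:30:00",
                       "2012-07-21 09:45:00"), tz = "UTC"),
  value = c(812, 899, 950, 1204, 1299)
)

bin_width <- 100  # assumed bin width, matching the 800/900/1000 buckets

# Day bucket: the midnight-to-midnight interval containing the timestamp.
raw$date <- as.Date(raw$time, tz = "UTC")

# Value bucket: floor to the lower edge of the bin, so 899 lands in 800.
raw$bucket <- floor(raw$value / bin_width) * bin_width

# Count observations per (date, bucket) cell.
binned <- aggregate(cnt ~ date + bucket,
                    data = transform(raw, cnt = 1L),
                    FUN = sum)
binned <- binned[order(binned$date, binned$bucket), ]
print(binned)
```

The result has the same three columns as the pre-bucketed sample data (`date`, `bucket`, `cnt`), so it could be passed to a tile-based plot without further reshaping.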

Here is one possible solution using R and ggplot2.

Your data, ready to paste into R console:

dat = structure(list(date = structure(c(15541, 15541, 15541, 15541, 
    15541, 15541, 15541, 15541, 15541, 15541, 15541, 15541, 15541, 
    15541, 15541, 15541, 15541, 15542, 15542, 15542, 15542, 15542, 
    15542, 15542, 15542, 15542, 15542, 15542, 15542, 15542, 15542, 
    15542, 15543, 15543, 15543, 15543, 15543, 15543, 15543, 15543, 
    15543, 15543, 15543, 15543, 15543, 15543, 15543, 15543, 15543, 
    15543, 15543, 15544, 15544, 15544, 15544, 15544, 15544, 15544, 
    15544, 15544, 15544, 15544, 15544, 15544, 15544, 15544, 15544, 
    15544, 15544, 15544, 15544, 15544, 15545, 15545, 15545, 15545, 
    15545, 15545, 15545, 15545, 15545, 15545, 15545, 15545, 15545, 
    15545, 15545, 15545, 15545, 15546, 15546, 15546, 15546, 15546, 
    15546, 15546, 15546, 15546, 15546, 15546, 15546, 15546, 15546, 
    15546, 15546, 15546, 15547, 15547, 15547, 15547, 15547, 15547, 
    15547, 15547, 15547, 15547, 15547, 15547, 15547, 15547, 15547, 
    15547, 15547, 15547, 15547), class = "Date"), bucket = c(800L, 
    900L, 1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 
    1800L, 1900L, 2000L, 2100L, 2200L, 2300L, 2400L, 800L, 900L, 
    1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 1800L, 
    1900L, 2000L, 2100L, 2200L, 900L, 1000L, 1100L, 1200L, 1300L, 
    1400L, 1500L, 1600L, 1700L, 1800L, 1900L, 2000L, 2100L, 2200L, 
    2300L, 2400L, 2500L, 2600L, 2800L, 800L, 900L, 1000L, 1100L, 
    1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 1800L, 1900L, 2000L, 
    2100L, 2200L, 2300L, 2400L, 2500L, 2600L, 2700L, 2800L, 800L, 
    900L, 1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 
    1800L, 1900L, 2000L, 2100L, 2200L, 2300L, 2400L, 800L, 900L, 
    1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 1800L, 
    1900L, 2000L, 2100L, 2200L, 2300L, 2400L, 1300L, 1400L, 1500L, 
    1600L, 1700L, 1800L, 1900L, 2000L, 2100L, 2200L, 2300L, 2400L, 
    2500L, 2600L, 2700L, 2800L, 2900L, 3000L, 3200L), cnt = c(119L, 
    123L, 173L, 226L, 284L, 257L, 268L, 244L, 191L, 204L, 187L, 177L, 
    164L, 125L, 140L, 109L, 103L, 123L, 165L, 237L, 278L, 338L, 306L, 
    316L, 269L, 271L, 241L, 188L, 174L, 158L, 153L, 132L, 154L, 241L, 
    246L, 300L, 305L, 301L, 292L, 253L, 251L, 214L, 189L, 179L, 159L, 
    161L, 144L, 139L, 132L, 136L, 105L, 120L, 156L, 209L, 267L, 299L, 
    316L, 318L, 307L, 295L, 273L, 283L, 229L, 192L, 193L, 170L, 164L, 
    154L, 138L, 101L, 115L, 103L, 105L, 156L, 220L, 255L, 308L, 338L, 
    318L, 255L, 278L, 260L, 235L, 230L, 185L, 145L, 147L, 157L, 109L, 
    104L, 191L, 201L, 238L, 223L, 229L, 286L, 256L, 240L, 233L, 202L, 
    180L, 184L, 161L, 125L, 110L, 101L, 132L, 117L, 124L, 154L, 167L, 
    137L, 169L, 175L, 168L, 188L, 137L, 173L, 164L, 167L, 115L, 116L, 
    118L, 125L, 104L)), .Names = c("date", "bucket", "cnt"), 
    class = "data.frame", row.names = c(NA, -125L))

Plotting code:

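library(ggplot2)

# One tile per (date, bucket) cell, shaded by the count in that cell.
plot_1 = ggplot(dat, aes(x=date, y=bucket, fill=cnt)) +
         geom_tile() +
         scale_fill_continuous(low="#F7FBFF", high="#2171B5") +
         theme_bw()

# Write the heatmap to a PNG file (width and height in inches).
ggsave("plot_1.png", plot_1, width=6, height=4)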

The plot might look better if you include rows for zero bucket values in your data. Then you could change low="#F7FBFF" to low="white".
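One way to add those zero rows is to build the full date-by-bucket grid and join the observed counts onto it. This is a minimal base R sketch, not part of the original answer; `dat_small` is a small stand-in with the same columns as `dat`, and the 100-wide bucket step is assumed from the sample data.

```r
# Small stand-in for `dat`: same columns, only a few rows.
dat_small <- data.frame(
  date   = as.Date(c("2012-07-20", "2012-07-20", "2012-07-22")),
  bucket = c(800L, 900L, 900L),
  cnt    = c(119L, 123L, 154L)
)

# Every (date, bucket) combination on a regular grid.
grid <- expand.grid(
  date   = seq(min(dat_small$date), max(dat_small$date), by = "day"),
  bucket = seq(min(dat_small$bucket), max(dat_small$bucket), by = 100L)
)

# Left join the observed counts onto the grid; cells with no data get NA.
filled <- merge(grid, dat_small, by = c("date", "bucket"), all.x = TRUE)

# Treat missing cells as zero counts.
filled$cnt[is.na(filled$cnt)] <- 0L

print(filled)
```

Replacing `dat_small` with `dat` should give a data frame in which every day has a row for every bucket, so empty cells are drawn as tiles instead of gaps.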

Put your numbers in a matrix and use 'image(mat)'? That looks to be all it is. A grid. A raster. Or am I missing something?

There are also ways to do this with ggplot, raster, and probably others.
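The matrix-plus-image() suggestion could look roughly like the sketch below. `dat_small` is again a small stand-in with the same columns as the sample data, and the axis handling is one possible choice rather than the only one.

```r
# Small stand-in for the sample data: same columns, only a few rows.
dat_small <- data.frame(
  date   = as.Date(c("2012-07-20", "2012-07-20", "2012-07-21", "2012-07-21")),
  bucket = c(800L, 900L, 800L, 1000L),
  cnt    = c(119L, 123L, 123L, 237L)
)

# Cross-tabulate counts into a date x bucket table; absent cells become 0.
tab <- xtabs(cnt ~ date + bucket, data = dat_small)

# Convert the table to a plain numeric matrix, keeping the row/column labels.
m <- matrix(as.numeric(tab), nrow = nrow(tab), dimnames = dimnames(tab))

# image() maps matrix rows to x and matrix columns to y.
image(x = seq_len(nrow(m)), y = as.numeric(colnames(m)), z = m,
      xaxt = "n", xlab = "date", ylab = "bucket")

# Label the x axis with the dates instead of row indices.
axis(1, at = seq_len(nrow(m)), labels = rownames(m))
```

Note that image() centers each cell on its x and y value, so the tick for bucket 800 sits in the middle of that cell rather than on its lower edge.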
