Downsampling analytics data in MySQL or in R
I am storing analytics data in a MySQL database as a table with a timestamp and some data. I want to downsample this data (i.e. group it into time ranges and count the number of entries in each range) for display on an admin console. I was wondering whether it would be more efficient to select the data and downsample it with an R script, or whether it would be better to use
GROUP BY UNIX_TIMESTAMP(timestamp) DIV <some time>
and do it on the database layer. Any other tips would also be appreciated.
If you can use dplyr, you could do it with something like the following:
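library(dplyr)

yay <-
  # Specify username and password in my.cnf
  src_mysql(host = "blah.com") %>%
  tbl("some_table") %>%
  # Compute a grouping variable: the index of the time bucket each row
  # falls in. The 3600-second (one hour) width is only a placeholder for
  # <some time>; dplyr passes unix_timestamp() through to MySQL as-is.
  mutate(group = floor(unix_timestamp(timestamp) / 3600)) %>%
  group_by(group) %>%
  # This will return the number of rows in each group
  summarise(n = n()) %>%
  # This will execute the query on the server and return a data.frame
  collect()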
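
For comparison, the database-side approach from the question can be written out as a complete query. This is only a sketch: the table name some_table is borrowed from the example above, the column name timestamp comes from the question, and the one-hour bucket width (3600 seconds) is an assumed value standing in for <some time>.

```sql
-- Assumptions: table `some_table` has a DATETIME or TIMESTAMP column named
-- `timestamp`; 3600 is a placeholder bucket width in seconds (one hour).
SELECT
    -- Integer-divide the epoch seconds by the bucket width, then multiply
    -- back to get the epoch second at which each bucket starts.
    (UNIX_TIMESTAMP(`timestamp`) DIV 3600) * 3600 AS bucket_start,
    COUNT(*) AS n
FROM some_table
GROUP BY bucket_start
ORDER BY bucket_start;
```

Either way the counting happens in MySQL, so only one row per bucket is sent to the client rather than every raw entry. Buckets that contain no rows simply do not appear in the result, so the admin console may need to fill those gaps with zeros.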