
What is the most elegant way to split data and produce seasonal boxplots?

I want to produce seasonal boxplots for a lot of different time series. I hope the code below clearly illustrates what I want to do.

My question is how to do this in the most elegant way, with as few lines of code as possible. I can create a new object for each month with the function "subset" and then plot it, but this does not seem very elegant. I tried to use the "split" function, but I don't know how to proceed from there.

Please tell me if my question is not clearly stated, or edit it to make it clearer.

Any direct help or links to other websites/posts are greatly appreciated. Thanks for your time.

Here is the code:

## Create Data
Time <- seq(as.Date("2003/8/6"), as.Date("2011/8/5"), by = "2 weeks")
data <- rnorm(209, mean = 15, sd = 1)
DF <- data.frame(Time = Time, Data = data)
DF[,3] <- as.numeric(format(DF$Time, "%m"))
colnames(DF)[3] <- "Month"

## Create subsets
Jan <- subset(DF, Month == 1)
Feb <- subset(DF, Month == 2)
Mar <- subset(DF, Month == 3)
Apr <- subset(DF, Month == 4)

## Create boxplot
months <- c("Jan", "Feb", "Mar", "Apr")
boxplot(Jan$Data, Feb$Data, Mar$Data, Apr$Data, ylab = "Data", xlab = "Months", names = months)

## Try with "split" function
DF.split <- split(DF, DF$Month)
head(DF.split)

Using 'ggplot2' (and @James' month names, thanks!):

library(ggplot2)
DF$month <- factor(strftime(DF$Time, "%b"), levels = month.abb)
ggplot(DF, aes(x = month, y = Data)) +
    geom_boxplot()


(BTW: note that in 'ggplot2' "The upper and lower "hinges" correspond to the first and third quartiles (the 25th and 75th percentiles). This differs slightly from the method used by the boxplot function, and may be apparent with small samples." - see the documentation.)
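
To make that note concrete, here is a small sketch comparing the two conventions on a tiny sample. It assumes, as far as I know, that base boxplot takes its hinges from fivenum() (via boxplot.stats()) while 'ggplot2' uses the default quantile(); the vector x is made up for illustration.

```r
# Six made-up values: small enough that the two conventions disagree
x <- c(1, 2, 3, 4, 5, 6)

# Hinges as used by base boxplot(): elements 2 and 4 of Tukey's five-number summary
hinges <- fivenum(x)[c(2, 4)]

# Quartiles from the default quantile() method (type 7)
quartiles <- unname(quantile(x, probs = c(0.25, 0.75)))

print(hinges)     # 2 5
print(quartiles)  # 2.25 4.75
```

With a couple of hundred points per month, as in the question's data, the difference should be barely visible.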

You are better off picking out the month names directly with the "%b" format and using an ordered factor and the formula interface for boxplot:

DF$month <- factor(strftime(DF$Time,"%b"),levels=month.abb)
boxplot(Data~month,DF)
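
For completeness, the "split" attempt from the question can also be carried through, because boxplot accepts a list of numeric vectors and draws one box per element. A minimal sketch, rebuilding the question's data (set.seed and the by_month name are my additions):

```r
# Rebuild the example data from the question
set.seed(1)
Time <- seq(as.Date("2003/8/6"), as.Date("2011/8/5"), by = "2 weeks")
DF <- data.frame(Time = Time, Data = rnorm(length(Time), mean = 15, sd = 1))
DF$Month <- as.numeric(format(DF$Time, "%m"))

# Split only the values (not the whole data frame) by month number
by_month <- split(DF$Data, DF$Month)

# Replace the list names "1".."12" with month abbreviations
names(by_month) <- month.abb[as.integer(names(by_month))]

# One box per list element, in list order
boxplot(by_month, ylab = "Data", xlab = "Months")
```

The formula interface above does the same grouping internally, so this is mostly useful if you already have the split list for other reasons.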


To set months as an ordered factor under any locale setting, use a trick that can be found on the help page ?month.abb:

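# "German_Germany" is a Windows locale name; on Linux/macOS use something like "de_DE.UTF-8"
Sys.setlocale("LC_TIME", "German_Germany")
# Build the levels with the same "%b" format so they always match the current locale
DF$month <- factor(format(DF$Time, "%b"), levels=format(ISOdate(2000, 1:12, 1), "%b"))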
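
If English labels are acceptable regardless of the session locale, another option is to skip "%b" entirely and index month.abb by month number. This is only a sketch of that alternative, not part of the help-page trick; month_index is a name I made up.

```r
# Rebuild the example data from the question
Time <- seq(as.Date("2003/8/6"), as.Date("2011/8/5"), by = "2 weeks")
DF <- data.frame(Time = Time, Data = rnorm(length(Time), mean = 15, sd = 1))

# POSIXlt stores the month as 0-11, so add 1 to index month.abb
month_index <- as.POSIXlt(DF$Time)$mon + 1

# month.abb is always English, so labels and levels never depend on LC_TIME
DF$month <- factor(month.abb[month_index], levels = month.abb)

boxplot(Data ~ month, DF)
```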

And you could plot it in lattice as well:

require(lattice)
bwplot(Data~month, DF, pch="|") # set pch to nice line instead of point
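
Since the question mentions many different time series, lattice's conditioning syntax may help: with the series stacked in long format, one call draws a panel per series. A rough sketch, assuming a hypothetical Series column and two simulated series (neither is in the original post):

```r
library(lattice)

# Simulate two series on the same dates and stack them in long format
set.seed(42)
Time <- seq(as.Date("2003/8/6"), as.Date("2011/8/5"), by = "2 weeks")
n <- length(Time)
long <- data.frame(
  Time   = rep(Time, times = 2),
  Series = rep(c("A", "B"), each = n),   # hypothetical series labels
  Data   = c(rnorm(n, mean = 15, sd = 1), rnorm(n, mean = 10, sd = 2))
)

# Locale-independent month factor in calendar order
long$month <- factor(month.abb[as.POSIXlt(long$Time)$mon + 1], levels = month.abb)

# "| Series" gives one panel of monthly boxplots per series
p <- bwplot(Data ~ month | Series, data = long, pch = "|", layout = c(1, 2))
print(p)
```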
