How to categorize a vector in R to draw a pie chart
I want to categorize the rivers dataset into "tiny" (<500), "short" (<1500), "medium" (<3000) and "long" (>=3000). I then want to plot a pie chart that visualizes the frequency of these four categories.
I tried:
rivers[rivers >= 3000] = 'long'
rivers[rivers >= 1500 & rivers < 3000] = 'meidum'
rivers[rivers >= 500 & rivers < 1500]='short'
rivers[rivers < 500] = 'tiny'
It seems the third command has no effect on the data; those values are the same as before!
table(rivers)
rivers
   500    505    524    525    529    538    540    545    560    570    600    605
     2      1      1      2      1      1      1      1      1      1      3      1
   610    618    620    625    630    652    671    680    696    710    720    730
     1      1      1      1      1      1      1      1      1      1      2      1
   735    760    780    800    840    850    870    890    900    906    981   long
     2      1      1      1      1      1      1      1      2      1      1      1
meidum   tiny
    36     62
What is wrong with my commands, and is this the right way to draw a pie chart for them?
The cut function can easily perform this task:
#random data
rivers<-runif(20, 0, 5000)
#break into desired groups and label
answer<-cut(rivers, breaks=c(0, 500, 1500, 3000, Inf),
labels=c("tiny", "short", "medium", "long"), right=FALSE)
table(answer)
# tiny short medium long
# 1 10 7 2
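The right = FALSE argument is what makes the bins match the question's definitions (<500, <1500, <3000, >=3000). A minimal sketch of its effect on values that sit exactly on a break; the test values and the names boundary, left_closed and right_closed are chosen here for illustration only:

```r
# Values chosen to sit on or next to the breaks (illustrative only)
boundary <- c(499, 500, 1500, 3000)
brks <- c(0, 500, 1500, 3000, Inf)
labs <- c("tiny", "short", "medium", "long")

# right = FALSE: intervals are [0,500), [500,1500), [1500,3000), [3000,Inf)
left_closed <- cut(boundary, breaks = brks, labels = labs, right = FALSE)

# right = TRUE (the default): intervals are (0,500], (500,1500], (1500,3000], (3000,Inf]
right_closed <- cut(boundary, breaks = brks, labels = labs, right = TRUE)

data.frame(boundary, left_closed, right_closed)
```

With the default, a river of exactly 500 miles would land in "tiny" rather than "short", which is not what the question asks for.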
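You are running into this problem because you are assigning character values to a numeric vector. The first assignment coerces the whole vector to character, so the later comparisons compare strings rather than numbers. If you fill a separate character vector instead, while still comparing against the original numeric values, it works: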
> rivers_size <- as.character(rivers)
> rivers_size[rivers >= 3000] = 'long'
> rivers_size[rivers >= 1500 & rivers < 3000] = 'medium'
> rivers_size[rivers >= 500 & rivers < 1500]='short'
> rivers_size[rivers < 500] = 'tiny'
> table(rivers_size)
rivers_size
long medium short tiny
1 5 53 82
> pie(table(rivers_size))
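To see why the commands in the question misbehave, here is a small sketch of the coercion on a toy vector. The four values and the name still_short are illustrative, not part of the original answer:

```r
# A toy numeric vector (illustrative values only)
x <- c(210, 600, 1770, 3710)
class(x)                 # "numeric"

# Assigning a single string coerces every element to character
x[x >= 3000] <- "long"
class(x)                 # "character"

# The numbers on the right are now converted to strings too, so this is a
# string comparison: "600" sorts after "1500" because "6" comes after "1"
still_short <- x >= 500 & x < 1500
still_short
```

Because no element satisfies the string comparison, the 'short' assignment in the question changes nothing, while the other two assignments match whatever happens to sort into their string ranges.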
Alternatively, the same thing can be accomplished using cut (as @Dave2e shows):
rivers <- cut(datasets::rivers,
breaks = c(0, 500, 1500, 3000, Inf),
labels = c("tiny", "short", "medium", "long"),
right = FALSE)
pie(table(rivers))
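Since some of the categories are small, the slices can be hard to read. One possible refinement, sketched here with base R only, is to put the percentage of each category in the slice labels. The names sizes, counts, pct and slice_labels, and the chart title, are chosen for this example:

```r
# Bin the built-in rivers data as above, keeping the original vector intact
sizes <- cut(datasets::rivers,
             breaks = c(0, 500, 1500, 3000, Inf),
             labels = c("tiny", "short", "medium", "long"),
             right = FALSE)

counts <- table(sizes)

# Share of each category in percent, rounded to one decimal place
pct <- round(100 * as.vector(prop.table(counts)), 1)

# Labels such as "tiny (xx.x%)"
slice_labels <- paste0(names(counts), " (", pct, "%)")

pie(counts, labels = slice_labels, main = "River lengths by category")
```

Storing the result under a new name, rather than overwriting rivers, also avoids masking the built-in dataset for the rest of the session.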
Here is another alternative using dplyr::case_when. It is more verbose than using cut, but it is also easier to generalize.
library("tidyverse")
set.seed(1234) # for reproducibility
# `case_when` vectorizes multiple `if-else` statements.
rivers <- sample.int(5000, size = 1000, replace = TRUE)
rivers <- case_when(
rivers >= 3000 ~ "long",
rivers >= 1500 ~ "medium",
rivers >= 500 ~ "short",
TRUE ~ "tiny"
)
table(rivers)
#> rivers
#> long medium short tiny
#> 406 303 199 92
Created on 2019-04-10 by the reprex package (v0.2.1)
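As one illustration of what "easier to generalize" can mean: case_when conditions may involve more than one variable, which cut cannot express. The sketch below assumes a made-up data frame; the columns length_miles and dammed, the values, and the category names are all hypothetical:

```r
library(dplyr)

# Hypothetical data: lengths plus a second, made-up logical attribute
rivers_df <- data.frame(
  length_miles = c(320, 870, 2348, 3710, NA),
  dammed       = c(FALSE, TRUE, TRUE, FALSE, FALSE)
)

# Conditions are checked in order; the first one that is TRUE wins
rivers_df$category <- case_when(
  is.na(rivers_df$length_miles)                     ~ "unknown",
  rivers_df$dammed & rivers_df$length_miles >= 1500 ~ "long, dammed",
  rivers_df$length_miles >= 1500                    ~ "long, free-flowing",
  TRUE                                              ~ "short"
)

table(rivers_df$category)
```

Handling missing values first keeps them from falling through to the catch-all TRUE branch.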