How to categorize a vector in R to draw a pie chart

I want to categorize the rivers dataset into "tiny" (<500), "short" (<1500), "medium" (<3000) and "long" (>=3000). I want to plot a pie chart that visualizes the frequency of these four categories.

I tried:

 rivers[rivers >= 3000] = 'long'
 rivers[rivers >= 1500 & rivers < 3000] = 'meidum'
 rivers[rivers >= 500 & rivers < 1500]='short'
 rivers[rivers < 500] = 'tiny'

It seems the third command has no effect on the data: the values between 500 and 1500 are the same as before!

table(rivers)
rivers
   500    505    524    525    529    538    540    545    560    570    600    605 
     2      1      1      2      1      1      1      1      1      1      3      1 
   610    618    620    625    630    652    671    680    696    710    720    730 
     1      1      1      1      1      1      1      1      1      1      2      1 
   735    760    780    800    840    850    870    890    900    906    981   long 
     2      1      1      1      1      1      1      1      2      1      1      1 
meidum   tiny 
    36     62 

What is wrong with my commands, and is this the right way to draw a pie chart for them?

The cut function can easily perform this task:

#random data
rivers<-runif(20, 0, 5000)

#break into desired groups and label
answer<-cut(rivers, breaks=c(0, 500, 1500, 3000, Inf), 
    labels=c("tiny", "short", "medium", "long"), right=FALSE) 

table(answer)
# tiny  short medium   long 
#    1     10      7      2 
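To check how right=FALSE treats values that fall exactly on a break, here is a small sketch using hand-picked boundary values. The names probe and sizes are chosen here for illustration and are not part of the answer above.

```r
# Hand-picked values sitting on or next to each break (illustrative only)
probe <- c(0, 499, 500, 1499, 1500, 2999, 3000)

# right = FALSE makes each interval closed on the left and open on the right,
# i.e. [0, 500), [500, 1500), [1500, 3000), [3000, Inf)
sizes <- cut(probe, breaks = c(0, 500, 1500, 3000, Inf),
             labels = c("tiny", "short", "medium", "long"), right = FALSE)

as.character(sizes)
# "tiny" "tiny" "short" "short" "medium" "medium" "long"
```

With the default right=TRUE the intervals would be (0, 500], (500, 1500] and so on, so 500 would count as "tiny" and 0 would become NA.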

You are running into this problem because you are trying to assign character values to a numeric vector. The first assignment coerces the whole vector to character, so the later comparisons are made on strings rather than numbers. If you work with a separate character vector instead, it should work:

> rivers_size <- as.character(rivers)
> rivers_size[rivers >= 3000] = 'long'
> rivers_size[rivers >= 1500 & rivers < 3000] = 'medium'
> rivers_size[rivers >= 500 & rivers < 1500]='short'
> rivers_size[rivers < 500] = 'tiny'
> table(rivers_size)
rivers_size
  long medium  short   tiny 
     1      5     53     82 
> pie(table(rivers_size))

[Image: pie chart]
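The coercion behind the original problem can be reproduced on a tiny vector. This is only a sketch: the three values and the name x are made up for illustration.

```r
x <- c(735, 320, 3710)   # illustrative river lengths
x[x >= 3000] <- "long"   # assigning a string converts every element to character

class(x)                 # "character"
x                        # "735" "320" "long"

# From here on, comparisons are made on strings, character by character
third_hits_735  <- "735"  >= "500" & "735"  < "1500"   # FALSE: "7" sorts after "1"
third_hits_1000 <- "1000" >= "500" & "1000" < "1500"   # FALSE: "1" sorts before "5"
```

Any string that sorts at or after "500" starts with a character that sorts after "1", so it can never also sort before "1500". That is why the third command in the question matched nothing.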

Alternatively, the same thing can be accomplished using cut (as @Dave2e shows):

rivers <- cut(datasets::rivers,
              breaks = c(0, 500, 1500, 3000, Inf), 
              labels = c("tiny", "short", "medium", "long"),
              right = FALSE)
pie(table(rivers))

Here is another alternative using dplyr::case_when. It is more verbose than using cut, but it is also easier to generalize.

library("tidyverse")

set.seed(1234) # for reproducibility

# `case_when` vectorizes multiple `if-else` statements.
rivers <- sample.int(5000, size = 1000, replace = TRUE)
rivers <- case_when(
  rivers >= 3000 ~ "long",
  rivers >= 1500 ~ "medium",
  rivers >= 500  ~ "short",
  TRUE ~ "tiny"
)
table(rivers)
#> rivers
#>   long medium  short   tiny 
#>    406    303    199     92

Created on 2019-04-10 by the reprex package (v0.2.1)
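Because case_when returns a plain character vector, table() lists the categories alphabetically (long, medium, short, tiny). The sketch below is one possible way to keep them in size order before drawing the pie chart. It assumes dplyr is installed and uses the built-in rivers data; the names river_size, size_levels and counts are chosen here for illustration.

```r
library(dplyr)

lengths_mi <- datasets::rivers   # built-in numeric vector of river lengths

river_size <- case_when(
  lengths_mi >= 3000 ~ "long",
  lengths_mi >= 1500 ~ "medium",
  lengths_mi >= 500  ~ "short",
  TRUE               ~ "tiny"
)

# Convert to a factor with explicit levels so table() and pie() follow size order
size_levels <- c("tiny", "short", "medium", "long")
river_size  <- factor(river_size, levels = size_levels)

counts <- table(river_size)
pie(counts)
```

Setting the levels explicitly also keeps a category in the table with a count of zero if no river falls into it.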
