
R replace values with bins

I have a data frame, df, of integer values. For classification purposes, I'd like to replace it with a simpler one in which each integer is replaced by the predetermined interval it falls into. How do I do this efficiently? An example is below:

df:

   1   2   3
1  5   3   0 
2  1   10  12
3  3   0   10

transforms into:

   1        2        3
1  [3-5]    [3-5]    [0-2]
2  [0-2]    [10-12]  [10-12]
3  [3-5]    [0-2]    [10-12]

Is df a data frame or a matrix? The name suggests the former, but the way you describe it suggests the latter.

If it's a matrix:

# include.lowest = TRUE puts the value 0 in the first bin instead of NA
df2 <- cut(df, c(0, 2, 5, 9, 12), include.lowest = TRUE)
dim(df2) <- dim(df)

If it's a data frame:

# df[] keeps the data frame structure while replacing every column
df[] <- lapply(df, cut, breaks = c(0, 2, 5, 9, 12), include.lowest = TRUE)
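To get labels in the style the question asks for, cut also accepts a labels argument. The following is a minimal sketch, assuming the data from the question is stored as a data frame; the column names a, b, c and the "[6-9]" label are assumptions made for illustration, since the question does not name a bin between 5 and 10.

```r
# Example data from the question (column names a, b, c are assumed)
df <- data.frame(a = c(5, 1, 3), b = c(3, 10, 0), c = c(0, 12, 10))

# Breaks so that integers fall into 0-2, 3-5, 6-9 and 10-12
breaks <- c(0, 2, 5, 9, 12)
bin_labels <- c("[0-2]", "[3-5]", "[6-9]", "[10-12]")  # assumed label style

# include.lowest = TRUE keeps the value 0 in the first bin instead of NA
binned <- df
binned[] <- lapply(df, cut, breaks = breaks, labels = bin_labels,
                   include.lowest = TRUE)
print(binned)
```

Each column of binned is a factor, so as.character() is needed if plain strings are wanted.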

In addition to Hong's good solution, I found some quite useful functions in ggplot2:

cut_interval - make n groups with equal range

cut_number - make n groups with approximately equal numbers of observations

cut_width - make groups of a given width

In my opinion, these functions offer more flexibility and are easier to understand than the base cut function. Note that they return a factor, not a matrix.
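As a rough illustration of how the three helpers differ, the sketch below applies each of them to the values from the question. The choice of n = 3 and width = 2 is an assumption made only for the comparison.

```r
library(ggplot2)  # provides cut_interval(), cut_number(), cut_width()

x <- c(5, 3, 0, 1, 10, 12, 3, 0, 10)  # values from the question

# Three bins that each span an equal range of the data
by_range <- cut_interval(x, n = 3)

# Three bins that each hold roughly the same number of observations
by_count <- cut_number(x, n = 3)

# Bins of fixed width 2, aligned so that a bin boundary sits at 0
by_width <- cut_width(x, width = 2, boundary = 0)

# Compare how the observations are distributed under each scheme
print(table(by_range))
print(table(by_count))
print(table(by_width))
```

All three results are factors of the same length as x, so they can be assigned straight into a data frame column.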

You could use something like this:

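library(reshape2)  # melt(), acast()
library(ggplot2)   # cut_width()
df <- matrix(c(5,3,0,1,10,12,3,0,10), nrow=3)
m.df <- melt(df)   # long format: Var1 (row), Var2 (column), value
m.df$value <- cut_width(m.df$value, width=2, boundary=0)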

This will return:

   Var1 Var2   value
1    1    1   (4,6]
2    2    1   (2,4]
3    3    1   [0,2]
4    1    2   [0,2]
5    2    2  (8,10]
6    3    2 (10,12]
7    1    3   (2,4]
8    2    3   [0,2]
9    3    3  (8,10]

If needed, you can cast it back to a square matrix:

df.bins <- acast(m.df, Var1~Var2)

Finally giving:

  1     2       3     
1 (4,6] [0,2]   (2,4] 
2 (2,4] (8,10]  [0,2] 
3 [0,2] (10,12] (8,10]
Levels: [0,2] (2,4] (4,6] (6,8] (8,10] (10,12]
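If the round trip through melt and acast is not needed, the same bins can be produced on the matrix directly with base R. This is only a sketch of an alternative, not the approach used above: it assumes a character matrix of labels is acceptable, and the names m and binned are introduced here for illustration.

```r
# Same matrix as in the answer above
m <- matrix(c(5, 3, 0, 1, 10, 12, 3, 0, 10), nrow = 3)

# cut() flattens the matrix to a factor; breaks every 2 units from 0 to 12
bins <- cut(m, breaks = seq(0, 12, by = 2), include.lowest = TRUE)

# Rebuild a character matrix with the original shape (filled column by column)
binned <- matrix(as.character(bins), nrow = nrow(m), ncol = ncol(m))
print(binned)
```

Converting to character drops the factor levels, which is usually fine for display but means empty bins such as (6,8] are no longer recorded.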
