
Subsetting a data frame or matrix based on criteria of values

Suppose I have a matrix or a data frame, and I want to keep only the values that are at least 5, excluding the values from 86 to 90 (both inclusive).

a <- matrix(1:100, nrow = 10, ncol = 10)
rownames(a) <- LETTERS[1:10]
colnames(a) <- LETTERS[1:10]
   A  B  C  D  E  F  G  H  I   J
A  1 11 21 31 41 51 61 71 81  91
B  2 12 22 32 42 52 62 72 82  92
C  3 13 23 33 43 53 63 73 83  93
D  4 14 24 34 44 54 64 74 84  94
E  5 15 25 35 45 55 65 75 85  95
F  6 16 26 36 46 56 66 76 86  96
G  7 17 27 37 47 57 67 77 87  97
H  8 18 28 38 48 58 68 78 88  98
I  9 19 29 39 49 59 69 79 89  99
J 10 20 30 40 50 60 70 80 90 100

Note: You can convert it to a data frame if you know this kind of operation is possible on a data frame.

Now I want my result in a format where the values below 5 and the values from 86 to 90 are deleted and replaced with blank space, and all other values are retained. My desired output is below:

   A  B  C  D  E  F  G  H  I   J
A    11 21 31 41 51 61 71 81  91
B    12 22 32 42 52 62 72 82  92
C    13 23 33 43 53 63 73 83  93
D    14 24 34 44 54 64 74 84  94
E  5 15 25 35 45 55 65 75 85  95
F  6 16 26 36 46 56 66 76     96
G  7 17 27 37 47 57 67 77     97
H  8 18 28 38 48 58 68 78     98
I  9 19 29 39 49 59 69 79     99
J 10 20 30 40 50 60 70 80    100

Is there a function in R that can take my condition and produce the desired result? I want to be able to change the code according to the problem. I searched Stack Overflow but didn't find anything like this. I don't want to format based on rows or columns. I tried a[a > 5 & a != c(85:90)], but this gives me only the values and loses the structure.

Assuming that 'a' is a matrix, we can assign NA to the values of 'a' that are in 86:90 (a %in% 86:90) or (|) less than 5 (a < 5). Here, I am not assigning '' because that would change the class from numeric to character. Also, assigning NA is useful for later processing.

a[a %in% 86:90 | a<5] <- NA
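
To illustrate why NA is the more convenient placeholder, here is a minimal sketch that applies the same mask to a fresh copy of the matrix and checks that its shape and storage type survive. The names m, drop, and kept_sum are made up for this example and are not part of the original answer.

```r
# Rebuild the example matrix so this snippet runs on its own
m <- matrix(1:100, nrow = 10, ncol = 10,
            dimnames = list(LETTERS[1:10], LETTERS[1:10]))

# Cells to blank out: values 86..90, or anything below 5
drop <- m %in% 86:90 | m < 5

# Assigning through a logical mask keeps the matrix shape
m[drop] <- NA

dim(m)      # the matrix is still 10 x 10
typeof(m)   # the storage type is still "integer"

# NA-aware functions can simply skip the blanked cells
kept_sum <- sum(m, na.rm = TRUE)
```

Because the mask is used on the left-hand side of the assignment rather than to extract values, the row and column structure is preserved.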

However, if we need it to be '':

a[a %in% 86:90 | a<5] <- ""

If we are using a data.frame:

a1 <- as.data.frame(a)
a1[] <- lapply(a1, function(x) replace(x, x %in% 86:90| x <5, ""))
a1
#   A  B  C  D  E  F  G  H  I   J
#A    11 21 31 41 51 61 71 81  91
#B    12 22 32 42 52 62 72 82  92
#C    13 23 33 43 53 63 73 83  93
#D    14 24 34 44 54 64 74 84  94
#E  5 15 25 35 45 55 65 75 85  95
#F  6 16 26 36 46 56 66 76     96
#G  7 17 27 37 47 57 67 77     97
#H  8 18 28 38 48 58 68 78     98
#I  9 19 29 39 49 59 69 79     99
#J 10 20 30 40 50 60 70 80    100

NOTE: This changes the class of each column to character.
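
If the blanks are only needed for display, one possible alternative (an addition here, not something the original answer proposes) is to keep NA in the data and hide it at print time through the na.print argument of print(). A rough sketch on the matrix version, using the made-up names m and still_integer:

```r
# Rebuild the example matrix and mask it with NA as before
m <- matrix(1:100, nrow = 10, ncol = 10,
            dimnames = list(LETTERS[1:10], LETTERS[1:10]))
m[m %in% 86:90 | m < 5] <- NA

# Show NA cells as empty strings in the printed output only
print(m, na.print = "")

# The underlying data are untouched and remain integer
still_integer <- is.integer(m)
```

This avoids the conversion to character, at the cost of having to pass na.print wherever the matrix is printed.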


In the OP's code, a != c(85:90) will not work as intended, because 85:90 is recycled to the length of 'a' and the comparison is made between corresponding elements of the recycled vector and 'a'. Instead, we need to use %in% when comparing against a vector of length > 1.
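
The difference is easy to see on a small vector. In this sketch (x, elementwise, and in_range are illustrative names), every element of x lies in 85:90, yet the element-by-element comparison reports all of them as "not equal":

```r
# Six values, all inside 85:90, but in reverse order
x <- c(90, 89, 88, 87, 86, 85)

# != pairs elements up by position: 90 vs 85, 89 vs 86, ..., 85 vs 90
elementwise <- x != 85:90

# %in% asks, for each element of x, whether it occurs anywhere in 85:90
in_range <- x %in% 85:90

elementwise   # TRUE for every position, so nothing would be excluded
in_range      # TRUE for every position, correctly flagging all six values
```

Negating the %in% result, !(x %in% 85:90), gives the "not in this set" test the OP was after.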
