How to get all values of one column based on another column in R?
In my first column I have numeric identifiers, and the second column is a character column that identifies, for example, each subject's favorite sports.
X1 X2
001 NBA
001 MLS
001 MLB
002 UFC
002 NFL
002 NHL
002 NBA
003 MLB
003 NBA
I have thousands of data points like this, and I want the output to show the unique values in column 2 (X2) where the value in column 1 (X1) equals 001, 002, or 003.
Your data frame:
df = structure(list(X1 = c("001", "001", "001", "002", "002", "002",
"002", "003", "003"), X2 = structure(c(3L, 2L, 1L, 6L, 4L, 5L,
3L, 1L, 3L), .Label = c("MLB", "MLS", "NBA", "NFL", "NHL", "UFC"
), class = "factor")), row.names = c(NA, -9L), class = "data.frame")
To get the unique X2 values across all rows where X1 is 001, 002, or 003:
unique(df$X2[df$X1 %in% c("001","002","003")])
[1] NBA MLS MLB UFC NFL NHL
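Because X2 is stored as a factor in the data frame above, the result of unique() is a factor as well. If a plain character vector for a single identifier is more convenient, something like the following sketch may help. The names ids and vals are illustrative only, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Rebuild the sample data; X2 is a factor, as in the dput() output above
df <- data.frame(
  X1 = c("001", "001", "001", "002", "002", "002", "002", "003", "003"),
  X2 = factor(c("NBA", "MLS", "MLB", "UFC", "NFL", "NHL", "NBA", "MLB", "NBA"))
)

# Identifiers of interest (hypothetical name, chosen for illustration)
ids <- c("001")

# Convert the factor to character before de-duplicating,
# so the result carries no unused factor levels
vals <- unique(as.character(df$X2[df$X1 %in% ids]))
vals
```

Adding more identifiers to ids widens the selection without changing the rest of the call.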
To get the unique X2 values within each X1 (that is, the unique X1/X2 pairs):
unique(df[df$X1 %in% c("001","002","003"),])
X1 X2
1 001 NBA
2 001 MLS
3 001 MLB
4 002 UFC
5 002 NFL
6 002 NHL
7 002 NBA
8 003 MLB
9 003 NBA
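In the sample data every X1/X2 pair already occurs only once, so the output above is identical to the input. A small sketch with a repeated pair shows what unique() does on a data frame. The data frame dup and the name res are made up for illustration.

```r
# Hypothetical data in which the pair ("001", "NBA") appears twice
dup <- data.frame(
  X1 = c("001", "001", "001", "002"),
  X2 = c("NBA", "NBA", "MLS", "NBA")
)

# unique() on a data frame drops rows that duplicate an earlier row;
# the original row names of the kept rows are preserved
res <- unique(dup[dup$X1 %in% c("001", "002"), ])
res
```

Note that "NBA" still appears for both 001 and 002, because uniqueness is judged on the whole row rather than on X2 alone.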
d <- read.table(header=TRUE, text="X1 X2
001 NBA
001 MLS
001 MLB
002 UFC
002 NFL
002 NHL
002 NBA
003 MLB
003 NBA")
tapply(d$X2, d$X1, unique)
gives a list of length three:
> str(tapply(d$X2, d$X1, unique))
List of 3
$ 1: chr [1:3] "NBA" "MLS" "MLB"
$ 2: chr [1:4] "UFC" "NFL" "NHL" "NBA"
$ 3: chr [1:2] "MLB" "NBA"
- attr(*, "dim")= int 3
- attr(*, "dimnames")=List of 1
..$ : chr [1:3] "1" "2" "3"
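The names 1, 2, 3 in the output above arise because read.table() parses 001 as the integer 1. If the leading zeros matter, one option is to read both columns as character. The sketch below also uses split() with lapply(), which returns a plain named list instead of a one-dimensional array. The name by_id is illustrative.

```r
# Read every column as character so identifiers such as "001" keep their zeros
d <- read.table(header = TRUE, colClasses = "character", text = "X1 X2
001 NBA
001 MLS
001 MLB
002 UFC
002 NFL
002 NHL
002 NBA
003 MLB
003 NBA")

# Split X2 by X1, then de-duplicate within each group
by_id <- lapply(split(d$X2, d$X1), unique)
by_id[["002"]]
```

Individual groups can then be looked up by their identifier string, as in the last line.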
Suppose the data look like this, where X3 is a data frame containing the information in X1 and X2:
X1 <- c(001, 001, 001, 002, 002, 002)
X2 <- c("NBA", "NBA", "NHL", "NBA", "NHL", "NHL")
X3 <- data.frame(X1, X2)
Just filter on the value you want X1 to equal, then call distinct(.keep_all = TRUE) to drop duplicate rows. The result is a data frame of the unique values in X2 for that value of X1.
library(dplyr)  # provides %>%, filter() and distinct()
X3 %>%
  filter(X1 == 001) %>%
  distinct(.keep_all = TRUE)
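To cover every value of X1 at once rather than filtering on one, a possible variation is to pass both columns to distinct(). This is only a sketch and assumes dplyr is installed. Since c(001, ...) creates ordinary numbers, X1 holds 1 and 2 here, and the name combos is illustrative.

```r
library(dplyr)

# Same sample data as above; the literal 001 is stored as the number 1
X3 <- data.frame(
  X1 = c(1, 1, 1, 2, 2, 2),
  X2 = c("NBA", "NBA", "NHL", "NBA", "NHL", "NHL")
)

# Keep one row per distinct X1/X2 combination, in order of first appearance
combos <- X3 %>%
  distinct(X1, X2)
combos
```

Each row of combos is one favorite sport for one identifier, so filtering it afterwards by X1 gives the same result as filtering first.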