
How to get all values of one column based on another column in R?

In my first column I have numeric identifiers, and the second column is a character column that identifies, for example, each subject's favorite sports.

X1       X2
001      NBA
001      MLS
001      MLB
002      UFC
002      NFL
002      NHL
002      NBA
003      MLB
003      NBA

I have thousands of data points like this, and I want the output to show the unique values in column 2 (X2) where the value in column 1 (X1) equals 001, 002, or 003.

Your data frame:

df = structure(list(X1 = c("001", "001", "001", "002", "002", "002", 
"002", "003", "003"), X2 = structure(c(3L, 2L, 1L, 6L, 4L, 5L, 
3L, 1L, 3L), .Label = c("MLB", "MLS", "NBA", "NFL", "NHL", "UFC"
), class = "factor")), row.names = c(NA, -9L), class = "data.frame")

To get the unique X2 values across all rows where X1 is 001, 002, or 003:

unique(df$X2[df$X1 %in% c("001","002","003")])
[1] NBA MLS MLB UFC NFL NHL
Levels: MLB MLS NBA NFL NHL UFC
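
As a minimal sketch of the same subsetting idea, the lookup can be wrapped in a small function. The helper name `unique_sports` is hypothetical (it is not from the answer), and converting the factor with `as.character()` is an assumption about the desired output, made so that the result is a plain character vector without unused factor levels.

```r
# Same data as above, built directly instead of via structure()
df <- data.frame(
  X1 = c("001", "001", "001", "002", "002", "002", "002", "003", "003"),
  X2 = factor(c("NBA", "MLS", "MLB", "UFC", "NFL", "NHL", "NBA", "MLB", "NBA"))
)

# Hypothetical helper: unique X2 values for the rows whose X1 is in `ids`
unique_sports <- function(data, ids) {
  # as.character() drops the factor levels so only observed values remain
  unique(as.character(data$X2[data$X1 %in% ids]))
}

sports_001 <- unique_sports(df, "001")
sports_001_003 <- unique_sports(df, c("001", "003"))
```

Passing a single ID or a vector of IDs both work, because `%in%` accepts either.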

To get the unique X1/X2 combinations, i.e. the unique X2 values within each X1:

unique(df[df$X1 %in% c("001","002","003"),])
   X1  X2
1 001 NBA
2 001 MLS
3 001 MLB
4 002 UFC
5 002 NFL
6 002 NHL
7 002 NBA
8 003 MLB
9 003 NBA

Alternatively, use tapply(). Note that read.table() parses X1 as integer here, so 001 becomes 1:

d <- read.table(header=TRUE, text="X1       X2
001      NBA
001      MLS
001      MLB
002      UFC
002      NFL
002      NHL
002      NBA
003      MLB
003      NBA")

tapply(d$X2, d$X1, unique)

gives a list of length three:

> str(tapply(d$X2, d$X1, unique))
List of 3
 $ 1: chr [1:3] "NBA" "MLS" "MLB"
 $ 2: chr [1:4] "UFC" "NFL" "NHL" "NBA"
 $ 3: chr [1:2] "MLB" "NBA"
 - attr(*, "dim")= int 3
 - attr(*, "dimnames")=List of 1
  ..$ : chr [1:3] "1" "2" "3"
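
If the leading zeros in X1 matter, one possible variation (an assumption about what is wanted, not part of the answer above) is to read both columns as character and group with `split()` instead of `tapply()`. The name `by_id` is hypothetical. This returns a plain named list keyed by the original identifiers.

```r
# Read every column as character so "001" is not converted to the integer 1
d <- read.table(header = TRUE, colClasses = "character", text = "X1 X2
001 NBA
001 MLS
001 MLB
002 UFC
002 NFL
002 NHL
002 NBA
003 MLB
003 NBA")

# split() groups X2 by X1; lapply() then deduplicates within each group
by_id <- lapply(split(d$X2, d$X1), unique)

# Look up one identifier by name
sports_002 <- by_id[["002"]]
```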

Suppose the data looks like this, where X3 is a data frame built from X1 and X2:

X1 <- c(001, 001, 001, 002, 002, 002)
X2 <- c("NBA", "NBA", "NHL", "NBA", "NHL", "NHL")
X3 <- data.frame(X1, X2)

Filter to the X1 value you want, then call dplyr's distinct(.keep_all = TRUE) to drop duplicate rows. The result is a data frame of the unique X2 values for that X1. (With no columns named, distinct() compares whole rows, so .keep_all has no effect here.)

library(dplyr)

X3 %>% 
  filter(X1 == 001) %>%        # 001 is parsed as the number 1
  distinct(.keep_all = TRUE)   # drop duplicate rows
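
To cover every X1 value at once rather than filtering one at a time, a possible sketch is to pass both columns to `distinct()`. Storing X1 as character and the name `id_sport_pairs` are assumptions made for illustration. The example needs the dplyr package.

```r
library(dplyr)

# Same shape as X3 above, with X1 kept as character to preserve leading zeros
X3 <- data.frame(
  X1 = c("001", "001", "001", "002", "002", "002"),
  X2 = c("NBA", "NBA", "NHL", "NBA", "NHL", "NHL")
)

# One row per distinct (X1, X2) combination, in order of first appearance
id_sport_pairs <- X3 %>%
  distinct(X1, X2)
```

Filtering `id_sport_pairs` by X1 afterwards then gives the unique X2 values for any single identifier.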
