Select subset of dataset based on one column
I have a dataset with 2 columns:
text created
1 cant do it with cards either 1/2/2014
2 cant do it with cards either 2/2/2014
3 Coming back home AK 2/2/2014
4 Coming back home AK 5/2/2014
5 gotta try PNNL 1/2/2014
6 Me and my Tart would love to flyLoveisintheAir 5/2/2014
7 Me and my Tart would love to flyLoveisintheAir 6/2/2014
How can I subset the dataset so that only the first row for each unique string in the first column is kept? The expected result:
text created
1 cant do it with cards either 1/2/2014
3 Coming back home AK 2/2/2014
5 gotta try PNNL 1/2/2014
6 Me and my Tart would love to flyLoveisintheAir 5/2/2014
structure(list(text = structure(c(1L, 1L, 2L, 2L, 3L, 4L, 4L), .Label = c("cant do it with cards either",
"Coming back home AK", "gotta try PNNL", "Me and my Tart would love to flyLoveisintheAir"
), class = "factor"), created = structure(c(1L, 2L, 2L, 3L, 1L,
3L, 4L), .Label = c("1/2/2014", "2/2/2014", "5/2/2014", "6/2/2014"
), class = "factor")), .Names = c("text", "created"), class = "data.frame", row.names = c(NA,
-7L))
Try using duplicated() together with the negation operator !. Assuming df is your data.frame:
> df[!duplicated(df$text), ]
text created
1 cant do it with cards either 1/2/2014
3 Coming back home AK 2/2/2014
5 gotta try PNNL 1/2/2014
6 Me and my Tart would love to flyLoveisintheAir 5/2/2014
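To see why this works, it may help to look at what duplicated() returns before it is negated. The sketch below rebuilds the question's data as plain strings (an assumption; the original dput stores factors, which behave the same way here) and also shows the fromLast argument, in case the last row per text is wanted instead of the first.

```r
# Rebuild the example data from the question (strings instead of factors)
df <- data.frame(
  text = c("cant do it with cards either", "cant do it with cards either",
           "Coming back home AK", "Coming back home AK",
           "gotta try PNNL",
           "Me and my Tart would love to flyLoveisintheAir",
           "Me and my Tart would love to flyLoveisintheAir"),
  created = c("1/2/2014", "2/2/2014", "2/2/2014", "5/2/2014",
              "1/2/2014", "5/2/2014", "6/2/2014"),
  stringsAsFactors = FALSE
)

# duplicated() flags every repeat of a value after its first occurrence
dup <- duplicated(df$text)
print(dup)

# Negating the flags keeps the first row for each distinct text
first_rows <- df[!dup, ]
print(first_rows)

# fromLast = TRUE scans from the end, so the last row per text is kept instead
last_rows <- df[!duplicated(df$text, fromLast = TRUE), ]
print(last_rows)
```

The original row names are preserved in the result, which is why the output above is numbered 1, 3, 5, 6.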
There are many possibilities:
# base R
tab[!duplicated(tab$text), ]
# with dplyr
library(dplyr)
filter(tab, !duplicated(text))
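One more possibility, sketched here on the assumption that dplyr is installed: distinct() with .keep_all = TRUE keeps the first row for each value of text while retaining the other columns. The data frame tab below is rebuilt from the question's data as plain strings for illustration.

```r
library(dplyr)  # assumes dplyr is installed

# Example data from the question, stored as strings
tab <- data.frame(
  text = c("cant do it with cards either", "cant do it with cards either",
           "Coming back home AK", "Coming back home AK",
           "gotta try PNNL",
           "Me and my Tart would love to flyLoveisintheAir",
           "Me and my Tart would love to flyLoveisintheAir"),
  created = c("1/2/2014", "2/2/2014", "2/2/2014", "5/2/2014",
              "1/2/2014", "5/2/2014", "6/2/2014"),
  stringsAsFactors = FALSE
)

# Keep the first row per distinct text; .keep_all = TRUE retains created
res <- distinct(tab, text, .keep_all = TRUE)
print(res)
```

Without .keep_all = TRUE, distinct(tab, text) would return only the text column.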
HTH