Remove rows from a dataframe based on a value in one column
I have a dataframe (imported from a csv file) as follows:
moose loose hoose
2 3 8
1 3 4
5 4 2
10 1 4
The R code should generate a mean column, after which I would like to remove all rows where the mean is < 4. With the mean column added, the data looks like this:
moose loose hoose mean
2 3 8 4.3
1 3 4 2.6
5 4 2 3.6
10 1 4 5
After removing the rows with a mean below 4, it should end up as:
moose loose hoose mean
2 3 8 4.3
10 1 4 5
How can I do this in R?
dat2 <- subset(transform(dat1, Mean=round(rowMeans(dat1),1)), Mean >=4)
dat2
# moose loose hoose Mean
#1 2 3 8 4.3
#4 10 1 4 5.0
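For anyone who wants to run the one-liner above, here is a minimal self-contained sketch. It assumes the CSV looks like the table in the question; the inline `csv_text` string and the intermediate name `with_mean` are made up for illustration and stand in for the real file. Note that rounding to one decimal gives 2.7 and 3.7 for the two middle rows, not the 2.6 and 3.6 shown in the question, which does not change which rows are kept.

```r
# Stand-in for the real CSV file (assumed layout, taken from the question's table)
csv_text <- "moose,loose,hoose
2,3,8
1,3,4
5,4,2
10,1,4"
dat1 <- read.csv(text = csv_text)

# Step 1: add a rounded row mean as a new column
with_mean <- transform(dat1, Mean = round(rowMeans(dat1), 1))

# Step 2: keep only the rows whose mean is at least 4
dat2 <- subset(with_mean, Mean >= 4)
print(dat2)
```

Splitting the call in two makes it easier to inspect the intermediate Mean column before any rows are dropped.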
Using data.table:
library(data.table)
setDT(dat1)[, Mean := rowMeans(.SD)][Mean >= 4]
# moose loose hoose Mean
#1: 2 3 8 4.333333
#2: 10 1 4 5.000000
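One thing to keep in mind with the data.table version: `setDT()` converts `dat1` in place and `:=` adds the column by reference, so `dat1` itself gains a Mean column, and running the line a second time would average that column in as well. A sketch that avoids both, assuming the data.table package is installed and the three column names from the question (the names `dt`, `value_cols` and `res` are only illustrative):

```r
library(data.table)

dat1 <- data.frame(moose = c(2, 1, 5, 10),
                   loose = c(3, 3, 4, 1),
                   hoose = c(8, 4, 2, 4))

# as.data.table() returns a copy, so dat1 stays an unmodified data.frame
dt <- as.data.table(dat1)

# Restrict the mean to the original value columns via .SDcols
value_cols <- c("moose", "loose", "hoose")
dt[, Mean := rowMeans(.SD), .SDcols = value_cols]

# Keep the rows whose mean is at least 4
res <- dt[Mean >= 4]
print(res)
```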
I will assume your data is called d. Then you run:
d$mean <- rowMeans(d) ## create a new column with the mean of each row
d <- d[d$mean >= 4, ] ## filter the data using this column in the condition
I suggest you read about creating variables in a data.frame, and filtering data. These are very common operations that you can use in many contexts.
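One caveat when reusing this create-then-filter pattern on other data: if a row contains a missing value, `rowMeans()` returns NA for it, and indexing with a logical vector that contains NA produces an all-NA row instead of dropping it. The sketch below uses a made-up variant of the question's data with one NA added (that NA is an assumption for illustration, not part of the original data) and shows two possible ways to handle it.

```r
# Hypothetical variant of the question's data with one missing value
d <- data.frame(moose = c(2, 1, 5, 10),
                loose = c(3, 3, 4, NA),
                hoose = c(8, 4, 2, 4))
value_cols <- c("moose", "loose", "hoose")

d$mean <- rowMeans(d[value_cols])       # row 4 gets NA

naive <- d[d$mean >= 4, ]               # keeps row 1 plus an all-NA row
safe  <- d[which(d$mean >= 4), ]        # which() drops the NA comparison

# Alternatively, ignore missing values when averaging
d$mean <- rowMeans(d[value_cols], na.rm = TRUE)
lenient <- d[d$mean >= 4, ]             # rows 1 and 4
```

Which of the two is appropriate depends on whether a row with missing values should count at all.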
You could also use within, which allows you to assign/remove columns and then returns the transformed data. Start with df:
> df
# moose loose hoose
#1 2 3 8
#2 1 3 4
#3 5 4 2
#4 10 1 4
> within(d <- df[rowMeans(df) >= 4, ], { means <- round(rowMeans(d), 1) })
# moose loose hoose means
#1 2 3 8 4.3
#4 10 1 4 5.0
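The call above relies on an assignment (`d <- ...`) inside the argument to `within()`, which can be hard to read. A possible alternative, sketched here with the same example data, lets `within()` see the columns by name and filters afterwards. The names `with_means` and `result` are only illustrative, and the column names are assumed to be exactly those in the question. Unlike the call above, this version filters on the rounded mean.

```r
df <- data.frame(moose = c(2, 1, 5, 10),
                 loose = c(3, 3, 4, 1),
                 hoose = c(8, 4, 2, 4))

# Inside within(), the columns of df are available as plain variables
with_means <- within(df, means <- round(rowMeans(cbind(moose, loose, hoose)), 1))

# Filter on the new column (note: this compares the rounded value)
result <- with_means[with_means$means >= 4, ]
print(result)
```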