
Remove rows by reference to column values in data.table (R)

I have a data.table with 47 variables covering the outcomes of 5007 PhD students. A small sample looks something like this:

library(data.table)
# Example data: 20 students with a study location and a grade
sample <- data.table(PHD_STUDENT_ID = 101:120,
    STUDY_LOCATION = c("Sydney", "Canberra", "Sydney", "Sydney",
        "Malaysia", "Malaysia", "CLF", "DRR", "GHS", "HMS", "DRJD", "KLS", "Malaysia",
        "Singapore", "Melbourne", "RD3S", "South Africa", "RME", "Sydney", "Canberra"),
    GRADE = 51:70)

So the data.table looks something like this:

    PHD_STUDENT_ID  STUDY_LOCATION  GRADE
1   101             Sydney          51
2   102             Canberra        52
3   103             Sydney          53
4   104             Sydney          54
5   105             Malaysia        55
6   106             Malaysia        56
7   107             CLF             57
8   108             DRR             58
.........

I need to keep all rows except those where the study location is "Malaysia", "South Africa" or "Singapore"; in other words, every row that is not at a campus in one of those countries. There are hundreds of unique values where the study location is just a code for a lab, e.g. "CLF" and "DRR", and I want to keep those, so I can't simply subset by Australian cities.

Any advice on how to subset this data.table to the rows where STUDY_LOCATION is not "Malaysia", "South Africa" or "Singapore" would be greatly appreciated.

You can try

   sample[!STUDY_LOCATION %in% c('Malaysia', 'South Africa', 'Singapore')]
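As a minimal, self-contained sketch of the filter above (the names `excluded` and `kept` are introduced here for illustration only and are not from the original post): the expression returns a new data.table, so the result has to be assigned if it should be kept. The sample data is rebuilt so the snippet runs on its own.

```r
library(data.table)

# Rebuild the sample data from the question
sample <- data.table(
  PHD_STUDENT_ID = 101:120,
  STUDY_LOCATION = c("Sydney", "Canberra", "Sydney", "Sydney",
                     "Malaysia", "Malaysia", "CLF", "DRR", "GHS", "HMS",
                     "DRJD", "KLS", "Malaysia", "Singapore", "Melbourne",
                     "RD3S", "South Africa", "RME", "Sydney", "Canberra"),
  GRADE = 51:70
)

# Locations to drop (illustrative helper variable)
excluded <- c("Malaysia", "South Africa", "Singapore")

# Keep every row whose STUDY_LOCATION is NOT in the excluded set;
# lab codes such as "CLF" or "DRR" are kept automatically
kept <- sample[!STUDY_LOCATION %in% excluded]

nrow(sample)  # the original table is unchanged
nrow(kept)    # rows remaining after the filter
```

Because the filter lists only the values to drop, it does not need to know any of the hundreds of lab codes in advance.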

I assume you're learning data.table, so a more data.table-style way is to set a key and use a not-join:

setkey(sample, STUDY_LOCATION)
sample[!c('Malaysia', 'South Africa', 'Singapore')]
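One thing to keep in mind with the keyed approach is that `setkey()` sorts the table by the key column by reference, so the row order of `sample` changes. The sketch below, which assumes the same sample data as the question, compares the keyed not-join (run on a copy) with an ad hoc not-join using the `on=` argument, which leaves the original order alone. The names `excluded`, `keyed`, `kept_keyed` and `kept_on` are illustrative only.

```r
library(data.table)

# Rebuild the sample data from the question
sample <- data.table(
  PHD_STUDENT_ID = 101:120,
  STUDY_LOCATION = c("Sydney", "Canberra", "Sydney", "Sydney",
                     "Malaysia", "Malaysia", "CLF", "DRR", "GHS", "HMS",
                     "DRJD", "KLS", "Malaysia", "Singapore", "Melbourne",
                     "RD3S", "South Africa", "RME", "Sydney", "Canberra"),
  GRADE = 51:70
)

excluded <- c("Malaysia", "South Africa", "Singapore")

# Keyed not-join on a copy, so the original row order of `sample` survives
keyed <- copy(sample)
setkey(keyed, STUDY_LOCATION)      # sorts `keyed` by STUDY_LOCATION by reference
kept_keyed <- keyed[!.(excluded)]  # drop rows whose key matches `excluded`

# Ad hoc not-join with on=, no key required and no reordering
kept_on <- sample[!data.table(STUDY_LOCATION = excluded), on = "STUDY_LOCATION"]

# Both approaches keep the same students
setequal(kept_keyed$PHD_STUDENT_ID, kept_on$PHD_STUDENT_ID)
```

Either form returns a new data.table, so assign the result back to a name if the filtered rows should replace the original.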

