
R: deleting rows in a dataframe by indexing

I just started learning R and I really need some help with cleaning my data. I spent the last 2 days trying to find a solution but nothing seems to work.

I have a dataset called d.new. Here is an example of the relevant rows:

d.new <- data.frame(observation = c("abc","abc","abc","def","def","def"),
                    vis = c("yes",NA,NA,"no",NA,NA))

I extracted the codes for vis == "yes" like this:

library(dplyr)  # filter() and select() come from dplyr
idx_vis <- c(select(filter(d.new, vis == "yes"), c(observation)))

The output looks like this:

$observation
[1] "abc" 

Now I'd like to find all rows in which the content of my "observation" column is one of the codes in my vector (let's assume it's not just abc but a few hundred codes) and delete them, without hard-coding anything. I'd like to use the script for other datasets with different codes, too.

So my desired output would be a dataframe that doesn't contain the rows with certain codes.

My attempt was to write a loop that goes through all the rows and finds and deletes those containing one of the codes from idx_vis. I started like this (but I'm not even sure if this makes sense; I have never written a loop before):

for(i in 1:length(d.new$observation)){  
  i2 <- c([i]:length(idx_vis)) 
  idx_dump <- as.character(which(d.new$observation == "idx_vis[i2]"))
  # then delete the rows from idx_dump from d.new?
} 

It would be great if someone could give me a hint! Thanks in advance!

Merle

Try this: d.new[d.new$vis == "yes", ] selects rows according to the value in the "vis" column.
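Note that the one-liner above selects rows rather than removing them, and because "vis" contains NA, the comparison also yields NA for those rows. A minimal base-R sketch of the removal the question asks for could look like the following, assuming d.new is a data frame; the names codes_to_drop and d.clean are made up here for illustration and are not from the original post. It collects the codes that have a "yes" row and then uses %in% to drop every row carrying one of those codes, so no loop is needed.

```r
# Example data from the question, built as a data frame so that `$` works
d.new <- data.frame(
  observation = c("abc", "abc", "abc", "def", "def", "def"),
  vis         = c("yes", NA, NA, "no", NA, NA),
  stringsAsFactors = FALSE
)

# Codes that have at least one row with vis == "yes".
# which() ignores the NA comparisons, so only real matches are kept.
# (codes_to_drop is a hypothetical name chosen for this sketch.)
codes_to_drop <- unique(d.new$observation[which(d.new$vis == "yes")])

# Keep only the rows whose observation is NOT one of those codes
# (d.clean is likewise a hypothetical name).
d.clean <- d.new[!(d.new$observation %in% codes_to_drop), ]

print(d.clean)
```

Since nothing is hard-coded, the same lines should work on another dataset as long as the column names match. With dplyr loaded, filter(d.new, !observation %in% codes_to_drop) would be an equivalent way to write the last step.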


 