Removing a row from a data frame if the text in column 1 equals the text in column 2 (in R)

I am trying to create unique combinations of all the tickers. I have created a data frame with all the combinations. However, I want to remove those where both tickers are the same. So if the ticker in row 1, column 1 equals the text in row 1, column 2, I want to either set it to NA or remove the row. That would leave only the unique combinations.

q <- c("BATS LN EQUITY","DGE LN EQUITY","IMB LN EQUITY","RDSB LN EQUITY")
p <- c("GBPUSD CURNCY","GOLDS INDEX","DXY CURNCY")
o <- expand.grid(q=q, p=p)
o[order(o$q),]
o <- data.frame(o)
o$q <- as.character(o$q)
o$p <- as.character(o$p)
o <- data.frame(o)

for(i in 1:nrow(o)){
  if(o[i,1] = o[i,2]){
    o[i,2] = NA
  }
}

Think of it instead as keeping the rows where the two columns are not equal. Try: o[o$q != o$p, ]
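A minimal sketch of that filter. The tickers below are made up so that the two columns actually overlap (in the question's data, q and p share no values, so no row would be dropped). Passing stringsAsFactors = FALSE to expand.grid() is my own choice here, so the columns come out as plain character vectors and can be compared directly:

```r
# Hypothetical tickers, chosen so that some pairs are identical
tickers <- c("BATS LN EQUITY", "DGE LN EQUITY", "IMB LN EQUITY")

# All ordered pairs; keep the columns as character rather than factor
o <- expand.grid(q = tickers, p = tickers, stringsAsFactors = FALSE)

# Keep only the rows where the two columns differ
kept <- o[o$q != o$p, ]

nrow(o)     # 9 pairs in total
nrow(kept)  # 6 pairs left once the 3 self-pairs are dropped
```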

Your solution can work too, but you need to use == instead of = in your if. Like so:

for(i in 1:nrow(o)){
  if(o[i,1] == o[i,2]){
    o[i,2] = NA 
  }  
}

This is just slower and less idiomatic than the first way I mention. The two also give different output (the first drops the matching rows, while the loop keeps them and sets column 2 to NA), but both are among the options you say you want.
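If the NA variant is what you are after, the loop can probably be replaced by a single vectorised assignment. A sketch, using made-up tickers so that the two columns overlap:

```r
# Hypothetical tickers, chosen so that some pairs are identical
tickers <- c("BATS LN EQUITY", "DGE LN EQUITY", "IMB LN EQUITY")
o <- expand.grid(q = tickers, p = tickers, stringsAsFactors = FALSE)

# Same effect as the loop: set column 2 to NA wherever it matches column 1
o$p[o$q == o$p] <- NA

sum(is.na(o$p))  # 3 self-pairs were set to NA
nrow(o)          # still 9 rows, nothing is removed
```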

I'm more of a Python person, so the pythonic way would be to use the duplicated() function in pandas, but for R I would think the unique() function would be better:

unique(o)

It is also possible to use the duplicated() function:

o[!duplicated(o), ]
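One thing to keep in mind: unique() and duplicated() compare whole rows with each other, which is a different test from comparing column 1 with column 2 inside a row. A small sketch with a hypothetical data frame (the name d and its contents are made up for illustration) shows the difference:

```r
# Hypothetical data: row 2 repeats row 1, row 3 has the same ticker in both columns
d <- data.frame(
  q = c("BATS LN EQUITY", "BATS LN EQUITY", "DGE LN EQUITY"),
  p = c("DGE LN EQUITY",  "DGE LN EQUITY",  "DGE LN EQUITY"),
  stringsAsFactors = FALSE
)

dedup     <- unique(d)            # drops the repeated row 2, keeps row 3
same      <- d[!duplicated(d), ]  # same rows as unique(d)
differing <- d[d$q != d$p, ]      # drops row 3, keeps both copies of row 1
```

So unique() fits when the goal is to drop repeated rows, while the q != p comparison is the one that drops rows whose two columns match.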
