How to check whether space-separated words in a string are in a data frame column in R

I have the following data frame in R:

Client_ID   IT    FMCG   Consumer   Oil_Gas    Finance
  ABC       0     2345      0       4768.90      0
  CFG       234    0        0       2366.54      0
  DEF       1234   0        345     523          2344

Now, I want to list the industries in which each client has holdings (that is, the columns with non-zero values). I can do this in R as follows:

 df$portfolio_holdings <- apply(df[2:6],1,function(x) paste(names(df[2:6])[x!=0],collapse=" "))

This gives me the following output:

Client_ID   IT    FMCG   Consumer   Oil_Gas    Finance   portfolio_holdings
  ABC       0     2345      0       4768.90      0        FMCG Oil_Gas
  CFG       234    0        0       2366.54      0        IT Oil_Gas
  DEF       1234   0        345     523          2344     IT Consumer Oil_Gas Finance

I have another data frame with the following columns:

  Sectors     Scrip      Target_Price    Call
   FMCG        WER          345           Buy
   IT          CFHG         134           Sell
   Oil_Gas     ERTY         567           Buy
   Consumer    QWER         543           Buy
   Finance     QASD         334           Buy 

Now, I want to recommend to each client up to 3 random sectors from the table above that are not already in their portfolio. The final desired data frame would be:

Client_ID   IT     FMCG   Consumer   Oil_Gas    Finance   portfolio_holdings            Recommendation
  ABC       0      2345      0       4768.90      0       FMCG Oil_Gas                  1:IT|CFHG|134|Sell||2:Consumer|QWER|543|Buy||3:Finance|QASD|334|Buy
  CFG       234    0         0       2366.54      0       IT Oil_Gas                    1:FMCG|WER|345|Buy||2:Consumer|QWER|543|Buy||3:Finance|QASD|334|Buy
  DEF       1234   0         345     523          2344    IT Consumer Oil_Gas Finance   1:FMCG|WER|345|Buy

How can I achieve this in R?

Sample data frames:

client_id <- c('ABC','DEF','ERT')
IT <- c(0,234,1234)
FMCG <- c(2345,0,0)
Consumer <- c(0,0,345)
Oil_Gas <- c(4768,2366,523)
Finance <- c(0,0,2345)


Sectors <- c('FMCG','IT','Oil_Gas','Consumer','Finance')
Scrip <- c('ABC','DFG','ERT','QWE','VGB')
Target <- c(345,134,567,543,334)
call <- c('Buy','Sell','Buy','Buy','Buy')

recom <- data.frame(Sectors,Scrip,Target,call)

df <- data.frame(client_id,IT,FMCG,Consumer,Oil_Gas,Finance)
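The sample data frames do not include the holdings column yet, so it may help to rebuild it before trying any answer. The following is a minimal sketch, assuming the five sector columns are exactly the ones named in `sector_cols` (a name introduced here for illustration) and that any non-zero value counts as a holding.

```r
# Sample data repeated so the snippet runs on its own
df <- data.frame(
  client_id = c("ABC", "DEF", "ERT"),
  IT        = c(0, 234, 1234),
  FMCG      = c(2345, 0, 0),
  Consumer  = c(0, 0, 345),
  Oil_Gas   = c(4768, 2366, 523),
  Finance   = c(0, 0, 2345)
)

# Assumption: these are the only sector columns in df
sector_cols <- c("IT", "FMCG", "Consumer", "Oil_Gas", "Finance")

# For each row, keep the names of the sector columns with a non-zero value
df$portfolio_holdings <- apply(
  df[sector_cols], 1,
  function(x) paste(sector_cols[x != 0], collapse = " ")
)

print(df[c("client_id", "portfolio_holdings")])
```

Selecting the columns by name rather than by position (`df[2:6]`) keeps the snippet working if columns are later added or reordered.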

I have not tested this code because I don't have the data frames here, but you should get the idea.

Assuming that the second data frame is named df2:

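  recomend <- function(df1, df2, n = 3) {
    df1$Recommendation <- NA_character_
    for (i in seq_len(nrow(df1))) {
      # Sectors the client already holds
      held <- unlist(strsplit(as.character(df1$portfolio_holdings[i]), " "))
      # Row indices in df2 of the sectors the client does not hold
      recm <- which(!df2$Sectors %in% held)
      # Pick at most n of them at random (fewer if fewer are available)
      recm <- recm[sample.int(length(recm), min(n, length(recm)))]
      nval <- character(0)
      for (j in seq_along(recm)) {
        # Convert each column to character so factors print as labels
        row <- vapply(df2[recm[j], ], as.character, character(1))
        nval <- c(nval, paste0(j, ":", paste(row, collapse = "|")))
      }
      df1$Recommendation[i] <- paste(nval, collapse = "||")
    }
    df1
  }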
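The same idea can also be written without the explicit loops. The sketch below is one possible variant, not a tested drop-in: `recommend_sectors()` and `recom_text` are names made up for illustration, the input data frames are retyped from the tables in the question, and `set.seed()` is there only so the random picks can be repeated.

```r
# Inputs retyped from the question's tables (illustrative only)
holdings <- data.frame(
  client_id = c("ABC", "CFG", "DEF"),
  portfolio_holdings = c("FMCG Oil_Gas", "IT Oil_Gas",
                         "IT Consumer Oil_Gas Finance"),
  stringsAsFactors = FALSE
)
recom <- data.frame(
  Sectors = c("FMCG", "IT", "Oil_Gas", "Consumer", "Finance"),
  Scrip   = c("WER", "CFHG", "ERTY", "QWER", "QASD"),
  Target  = c(345, 134, 567, 543, 334),
  Call    = c("Buy", "Sell", "Buy", "Buy", "Buy"),
  stringsAsFactors = FALSE
)

# One "Sector|Scrip|Target|Call" string per row of recom
recom_text <- do.call(paste, c(recom, sep = "|"))

# Hypothetical helper: build the recommendation string for one client
recommend_sectors <- function(held_string, n = 3) {
  held <- strsplit(held_string, " ", fixed = TRUE)[[1]]
  candidates <- which(!recom$Sectors %in% held)
  # Nothing left to recommend: return an empty string
  if (length(candidates) == 0) return("")
  # Sample positions rather than values, so a single candidate is safe
  picked <- candidates[sample.int(length(candidates),
                                  min(n, length(candidates)))]
  paste0(seq_along(picked), ":", recom_text[picked], collapse = "||")
}

set.seed(1)  # only to make the random picks repeatable
holdings$Recommendation <- vapply(holdings$portfolio_holdings,
                                  recommend_sectors, character(1),
                                  USE.NAMES = FALSE)
print(holdings)
```

Sampling index positions with `sample.int()` sidesteps a known quirk of `sample()`: when given a single number `x`, it samples from `1:x` instead of returning `x`, which would matter for a client with only one sector left to recommend.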
