
How to check whether a string of space-separated words is in a dataframe column in R

I have the following data frame in R:

Client_ID   IT    FMCG   Consumer   Oil_Gas    Finance
  ABC       0     2345      0       4768.90      0
  CFG       234    0        0       2366.54      0
  DEF       1234   0        345     523          2344

Now I want to list the sectors each client holds (those with a non-zero value). I can do that in R as follows:

 df$portfolio_holdings <- simplify2array(apply(df[2:6], 1, function(x) paste(names(df[2:6])[x != 0], collapse = " ")))

This gives me the following output:

Client_ID   IT    FMCG   Consumer   Oil_Gas    Finance   portfolio_holdings
  ABC       0     2345      0       4768.90      0        FMCG Oil_Gas
  CFG       234    0        0       2366.54      0        IT Oil_Gas
  DEF       1234   0        345     523          2344     IT Consumer Oil_Gas Finance

I have another data frame with the following columns:

  Sectors     Scrip      Target_Price    Call
   FMCG        WER          345           Buy
   IT          CFHG         134           Sell
   Oil_Gas     ERTY         567           Buy
   Consumer    QWER         543           Buy
   Finance     QASD         334           Buy 

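Now I want to recommend to each client up to 3 randomly chosen sectors that are not already in their portfolio. The desired final data frame would be: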

Client_ID   IT     FMCG   Consumer   Oil_Gas    Finance   portfolio                     Recommendation
  ABC       0      2345   0          4768.90    0         FMCG Oil_Gas                  1:IT|CFHG|134|Sell||2:Consumer|QWER|543|Buy||3:Finance|QASD|334|Buy
  CFG       234    0      0          2366.54    0         IT Oil_Gas                    1:FMCG|WER|345|Buy||2:Consumer|QWER|543|Buy||3:Finance|QASD|334|Buy
  DEF       1234   0      345        523        2344      IT Consumer Oil_Gas Finance   1:FMCG|WER|345|Buy

How can I achieve this in R?

Sample data frames

client_id <- c('ABC','DEF','ERT')
IT <- c(0,234,1234)
FMCG <- c(2345,0,0)
Consumer <- c(0,0,345)
Oil_Gas <- c(4768,2366,523)
Finance <- c(0,0,2345)


Sectors <- c('FMCG','IT','Oil_Gas','Consumer','Finance')
Scrip <- c('ABC','DFG','ERT','QWE','VGB')
Target <- c(345,134,567,543,334)
call <- c('Buy','Sell','Buy','Buy','Buy')

recom <- data.frame(Sectors,Scrip,Target,call)

df <- data.frame(client_id,IT,FMCG,Consumer,Oil_Gas,Finance)
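
The sample data above does not yet contain the portfolio_holdings column. A minimal sketch of building it, assuming every column other than client_id is a numeric sector column (the name `sector_cols` is introduced here only for illustration):

```r
# Sample data from the question
df <- data.frame(
  client_id = c("ABC", "DEF", "ERT"),
  IT        = c(0, 234, 1234),
  FMCG      = c(2345, 0, 0),
  Consumer  = c(0, 0, 345),
  Oil_Gas   = c(4768, 2366, 523),
  Finance   = c(0, 0, 2345),
  stringsAsFactors = FALSE
)

# Hypothetical helper name: every column except the client identifier
sector_cols <- setdiff(names(df), "client_id")

# For each row, keep the names of the sectors with a non-zero value
df$portfolio_holdings <- apply(
  df[sector_cols], 1,
  function(x) paste(sector_cols[x != 0], collapse = " ")
)

print(df[c("client_id", "portfolio_holdings")])
```

Selecting the columns by name rather than by position (2:6) keeps the snippet working if the column order changes.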

I haven't tested this code, since I don't have the data frames here, but you should get the idea.

Assuming the second data frame is named df2:

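  recomend <- function(df1, df2) {
    df1$Recommendation <- NA_character_
    for (i in seq_len(nrow(df1))) {
      # Sectors the client already holds
      held <- unlist(strsplit(as.character(df1$portfolio_holdings[i]), " "))
      # Rows of df2 whose sector is not in the portfolio
      recm <- which(!df2$Sectors %in% held)
      if (length(recm) == 0) next
      # Pick at most 3 of them at random (fewer if fewer are available)
      recm <- recm[sample.int(length(recm), min(3, length(recm)))]
      # Join each row's fields with "|", number the rows, then join them with "||"
      rows <- do.call(paste, c(unname(lapply(df2[recm, ], as.character)), sep = "|"))
      df1$Recommendation[i] <- paste(paste0(seq_along(rows), ":", rows), collapse = "||")
    }
    return(df1)
  }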
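
As a rough, self-contained illustration of the same idea on the question's sample data, the sketch below builds the recommendation string for one portfolio at a time. The helper name `recommend_one`, its `n` parameter, and the `holdings` vector are assumptions made for this example and are not part of the original answer:

```r
# Recommendation table from the question's sample data
recom <- data.frame(
  Sectors = c("FMCG", "IT", "Oil_Gas", "Consumer", "Finance"),
  Scrip   = c("ABC", "DFG", "ERT", "QWE", "VGB"),
  Target  = c(345, 134, 567, 543, 334),
  call    = c("Buy", "Sell", "Buy", "Buy", "Buy"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recommendation string for a single portfolio
recommend_one <- function(holdings, recom, n = 3) {
  held <- strsplit(holdings, " ", fixed = TRUE)[[1]]
  candidates <- which(!recom$Sectors %in% held)
  if (length(candidates) == 0) return(NA_character_)
  # Sample positions, not values, so a single candidate is handled correctly
  picked <- candidates[sample.int(length(candidates), min(n, length(candidates)))]
  # Join the fields of each picked row with "|"
  fields <- unname(lapply(recom[picked, ], as.character))
  rows <- do.call(paste, c(fields, sep = "|"))
  # Number the rows and join them with "||"
  paste(paste0(seq_along(rows), ":", rows), collapse = "||")
}

# Portfolio strings for the three sample clients
holdings <- c(
  ABC = "FMCG Oil_Gas",
  DEF = "IT Oil_Gas",
  ERT = "IT Consumer Oil_Gas Finance"
)

set.seed(42)  # only to make the random picks reproducible
recommendation <- vapply(holdings, recommend_one, character(1), recom = recom)
print(recommendation)
```

Client ERT already holds four of the five sectors, so its string contains only the FMCG row. The other two clients get three entries each, in random order.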

