
Compare data frames and mark matches in R

I am very new to R. I am trying to compare two csv files. csv 1:

 <table id="t01"> <tr> <th>Firstname</th> <th>Lastname</th> <th>Email</th> </tr> <tr> <td>Eve</td> <td>Jackson</td> <td>evejackson@yahoo.co.uk</td> </tr> <tr> <td>Jon</td> <td>Smith</td> <td>johnsmith@gmail.com</td> </tr> </table> 

csv2:

 <table id="t02"> <tr> <th>Firstname</th> <th>Lastname</th> <th>Email</th> </tr> <tr> <td>Jon</td> <td>Smith</td> <td>johnsmith@gmail.com</td> </tr> <tr> <td>Samantha</td> <td>Andrew</td> <td>samanthaandrew@yahoo.co.uk</td> </tr> </table> 

What I want is code that compares "Email" between the two tables and then enters "Registered" in a 4th column of csv 1 wherever there is a match. Like this:

 <table id="t03"> <tr> <th>Firstname</th> <th>Lastname</th> <th>Email</th> <th>Status</th> </tr> <tr> <td>Eve</td> <td>Jackson</td> <td>evejackson@yahoo.co.uk</td> <td></td> </tr> <tr> <td>Jon</td> <td>Smith</td> <td>johnsmith@gmail.com</td> <td>Registered</td> </tr> </table> 

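If the email does not match, the code should then compare Firstname and Lastname between the two tables and enter "Registered" if both match. It is probably simple, but I don't know how to do this in R. Thanks.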

Besides the merge option suggested by @akrun, you can also use an ifelse() statement. Here is an example:

df1 <- data.frame(Firstname = c("Eve", "Jon", "Steve"), 
                  Lastname = c("Jackson", "Smith", "Jackson"),
                  Email = c("evejackson@yahoo.co.uk", "johnsmith@gmail.com",
                            "stevejackson@yahoo.com"),
                  stringsAsFactors = FALSE)
# df1
#   Firstname Lastname                  Email
# 1       Eve  Jackson evejackson@yahoo.co.uk
# 2       Jon    Smith    johnsmith@gmail.com
# 3     Steve  Jackson stevejackson@yahoo.com

df2 <- data.frame(Firstname = c("Jon", "Samantha", "Steve"), 
                  Lastname = c('Smith', "Andrew", "Jackson"),
                  Email = c("johnsmith@gmail.com", "samanthaandrew@yahoo.co.uk",
                            "stevejackson@yahoo.co.uk"),
                  stringsAsFactors = FALSE)
# df2
#   Firstname Lastname                      Email
# 1       Jon    Smith        johnsmith@gmail.com
# 2  Samantha   Andrew samanthaandrew@yahoo.co.uk
# 3     Steve  Jackson   stevejackson@yahoo.co.uk

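# Check whether each Email in df1 also occurs in df2; if not, check whether the
# Firstname/Lastname pair occurs anywhere in df2.
# Note: df1$Firstname == df2$Firstname would only compare row i of df1 with
# row i of df2, so the name pairs are looked up with %in% instead.
name1 <- paste(df1$Firstname, df1$Lastname)
name2 <- paste(df2$Firstname, df2$Lastname)
df1$Status <- ifelse(df1$Email %in% df2$Email, "Registered",
                     ifelse(name1 %in% name2, "Registered", ""))
df1 # output
#   Firstname Lastname                  Email     Status
# 1       Eve  Jackson evejackson@yahoo.co.uk
# 2       Jon    Smith    johnsmith@gmail.com Registered
# 3     Steve  Jackson stevejackson@yahoo.com Registered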
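
For comparison, here is a rough sketch of how a merge()-based version might look. It is not necessarily what @akrun had in mind, just one way to express the same two-step rule with left joins. The helper columns `email_hit`, `name_hit` and `row_id` are names made up for this sketch, and the rows of df2 are shuffled on purpose to show that the result does not depend on row order.

```r
# Same example data as above, with df2's rows deliberately shuffled.
df1 <- data.frame(Firstname = c("Eve", "Jon", "Steve"),
                  Lastname  = c("Jackson", "Smith", "Jackson"),
                  Email     = c("evejackson@yahoo.co.uk", "johnsmith@gmail.com",
                                "stevejackson@yahoo.com"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Firstname = c("Steve", "Samantha", "Jon"),
                  Lastname  = c("Jackson", "Andrew", "Smith"),
                  Email     = c("stevejackson@yahoo.co.uk",
                                "samanthaandrew@yahoo.co.uk",
                                "johnsmith@gmail.com"),
                  stringsAsFactors = FALSE)

# Lookup tables of the keys present in df2 (flag column names are hypothetical).
by_email <- unique(data.frame(Email = df2$Email, email_hit = TRUE,
                              stringsAsFactors = FALSE))
by_name  <- unique(data.frame(Firstname = df2$Firstname,
                              Lastname  = df2$Lastname,
                              name_hit  = TRUE,
                              stringsAsFactors = FALSE))

# Remember the original row order, because merge() sorts by key by default.
df1$row_id <- seq_len(nrow(df1))

# Left joins: all.x = TRUE keeps every row of df1; unmatched rows get NA flags.
res <- merge(df1, by_email, by = "Email", all.x = TRUE)
res <- merge(res, by_name, by = c("Firstname", "Lastname"), all.x = TRUE)

# A row counts as registered if either join found a match.
hit <- !is.na(res$email_hit) | !is.na(res$name_hit)
res$Status <- ifelse(hit, "Registered", "")

# Restore the original row order and column layout.
res <- res[order(res$row_id), c("Firstname", "Lastname", "Email", "Status")]
rownames(res) <- NULL
res
```

For a simple flag column like this, the ifelse() with %in% version is shorter. The merge() route mainly pays off if you also want to pull other columns from df2 into df1.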
