
R match two character columns in one tibble with two other character columns in another tibble

Suppose I have two objects:

mixed
# A tibble: 7 x 2
  genus        epithet   
  <chr>        <chr>     
1 Vincetoxicum nigrum    
2 Rosa         multiflora
3 Quercus      rubra     
4 Acer         saccharum 
5 Rosa         pendula   
6 Vincetoxicum nigrum    
7 Vincetoxicum nigrum

invasives
# A tibble: 4 x 2
  genus        epithet   
  <chr>        <chr>     
1 Larix        pendula   
2 Picea        abies     
3 Rosa         multiflora
4 Vincetoxicum nigrum

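I want to check whether the two columns of "mixed" match the two columns of "invasives", and get an index that lets me extract those matches from "mixed". Note that "pendula" appears under "epithet" in both "mixed" and "invasives", but its genus is "Larix" in "invasives" and "Rosa" in "mixed", so that row should not be included in the final product.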

So once that index is created, I imagine I would run:

columns_matched <- mixed[index,]

to produce:

columns_matched
# A tibble: 4 x 2
  genus        epithet   
  <chr>        <chr>     
1 Vincetoxicum nigrum    
2 Rosa         multiflora
3 Vincetoxicum nigrum    
4 Vincetoxicum nigrum 

CSV versions of the tables:

genus,epithet
Vincetoxicum,nigrum
Rosa,multiflora
Quercus,rubra
Acer,saccharum
Rosa,pendula
Vincetoxicum,nigrum
Vincetoxicum,nigrum

genus,epithet
Larix,pendula
Picea,abies
Rosa,multiflora
Vincetoxicum,nigrum

Thanks.

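The simplest answer that comes to mind is to inner_join your datasets. That way, only the rows whose genus and epithet both match are left: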

library(tidyverse)
mixed <- read_csv('genus,epithet
Vincetoxicum,nigrum
Rosa,multiflora
Quercus,rubra
Acer,saccharum
Rosa,pendula
Vincetoxicum,nigrum
Vincetoxicum,nigrum')
#> Rows: 7 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): genus, epithet
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

invasives <- read_csv('genus,epithet
Larix,pendula
Picea,abies
Rosa,multiflora
Vincetoxicum,nigrum')
#> Rows: 4 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): genus, epithet
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.


mixed %>% 
  inner_join(invasives)
#> Joining, by = c("genus", "epithet")
#> # A tibble: 4 × 2
#>   genus        epithet   
#>   <chr>        <chr>     
#> 1 Vincetoxicum nigrum    
#> 2 Rosa         multiflora
#> 3 Vincetoxicum nigrum    
#> 4 Vincetoxicum nigrum
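One caveat worth sketching: inner_join repeats a row of mixed once for every matching row in invasives, so a duplicated entry in invasives would inflate the result. A filtering join such as dplyr's semi_join avoids that. The snippet below is only an illustration; it rebuilds the two tibbles from the question with tibble() rather than read_csv, and spells out the by argument so no join message is printed.

```r
library(dplyr)

# Rebuild the example data from the question
mixed <- tibble(
  genus   = c("Vincetoxicum", "Rosa", "Quercus", "Acer",
              "Rosa", "Vincetoxicum", "Vincetoxicum"),
  epithet = c("nigrum", "multiflora", "rubra", "saccharum",
              "pendula", "nigrum", "nigrum")
)
invasives <- tibble(
  genus   = c("Larix", "Picea", "Rosa", "Vincetoxicum"),
  epithet = c("pendula", "abies", "multiflora", "nigrum")
)

# semi_join() keeps the rows of `mixed` that have a match in `invasives`;
# it never adds columns and never duplicates rows of `mixed`
columns_matched <- semi_join(mixed, invasives, by = c("genus", "epithet"))
columns_matched
```

With the example data this should give the same four rows as the inner_join above; the two approaches only differ when invasives contains repeated genus/epithet pairs.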

If you really want to have that index, you can add a dummy column to your mixed tibble:

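# Number the rows of `mixed`, keep only the rows that also occur in
# `invasives`, then pull out the surviving row numbers
index <- mixed %>% 
  mutate(index = seq_along(genus)) %>% 
  inner_join(invasives) %>% 
  pull(index)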
#> Joining, by = c("genus", "epithet")

mixed[index,]
#> # A tibble: 4 × 2
#>   genus        epithet   
#>   <chr>        <chr>     
#> 1 Vincetoxicum nigrum    
#> 2 Rosa         multiflora
#> 3 Vincetoxicum nigrum    
#> 4 Vincetoxicum nigrum
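If the index is all that is needed, it can also be computed without a join. The following is a rough base R sketch, not part of the original answer: it pastes genus and epithet into a single key per row and tests membership with %in%. The tab separator and the helper names key_mixed and key_invasives are assumptions made for this illustration; any separator that cannot occur in the names would do.

```r
# Example data from the question, as plain data frames
mixed <- data.frame(
  genus   = c("Vincetoxicum", "Rosa", "Quercus", "Acer",
              "Rosa", "Vincetoxicum", "Vincetoxicum"),
  epithet = c("nigrum", "multiflora", "rubra", "saccharum",
              "pendula", "nigrum", "nigrum")
)
invasives <- data.frame(
  genus   = c("Larix", "Picea", "Rosa", "Vincetoxicum"),
  epithet = c("pendula", "abies", "multiflora", "nigrum")
)

# Combine both columns into one key per row, so that a match requires
# genus AND epithet to agree (hypothetical helper names)
key_mixed     <- paste(mixed$genus, mixed$epithet, sep = "\t")
key_invasives <- paste(invasives$genus, invasives$epithet, sep = "\t")

# Row numbers of `mixed` whose key also occurs in `invasives`
index <- which(key_mixed %in% key_invasives)

columns_matched <- mixed[index, ]
```

Because the comparison is made on the combined key, "Rosa pendula" in mixed does not match "Larix pendula" in invasives, which is the behaviour the question asks for.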

