R match two character columns in one tibble with two other character columns in another tibble
Suppose I have two objects,
mixed
# A tibble: 7 x 2
genus epithet
<chr> <chr>
1 Vincetoxicum nigrum
2 Rosa multiflora
3 Quercus rubra
4 Acer saccharum
5 Rosa pendula
6 Vincetoxicum nigrum
7 Vincetoxicum nigrum
and
invasives
# A tibble: 4 x 2
genus epithet
<chr> <chr>
1 Larix pendula
2 Picea abies
3 Rosa multiflora
4 Vincetoxicum nigrum
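I want to check which rows of "mixed" match a row of "invasives" on both columns, and get an index that lets me extract those matches from "mixed". Note that "pendula" appears in the "epithet" column of both "mixed" and "invasives", but the genus in that row is "Larix" in "invasives" and "Rosa" in "mixed", so that row should not be included in the final result.
So, once that index is created, I think I would run: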
columns_matched <- mixed[index,]
which would produce:
columns_matched
# A tibble: 4 x 2
genus epithet
<chr> <chr>
1 Vincetoxicum nigrum
2 Rosa multiflora
3 Vincetoxicum nigrum
4 Vincetoxicum nigrum
CSV versions of the tables:
genus,epithet
Vincetoxicum,nigrum
Rosa,multiflora
Quercus,rubra
Acer,saccharum
Rosa,pendula
Vincetoxicum,nigrum
Vincetoxicum,nigrum
genus,epithet
Larix,pendula
Picea,abies
Rosa,multiflora
Vincetoxicum,nigrum
Thanks.
The simplest answer that comes to mind is to inner_join your datasets. That way, only the rows present in both remain:
library(tidyverse)
mixed <- read_csv('genus,epithet
Vincetoxicum,nigrum
Rosa,multiflora
Quercus,rubra
Acer,saccharum
Rosa,pendula
Vincetoxicum,nigrum
Vincetoxicum,nigrum')
#> Rows: 7 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): genus, epithet
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
invasives <- read_csv('genus,epithet
Larix,pendula
Picea,abies
Rosa,multiflora
Vincetoxicum,nigrum')
#> Rows: 4 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): genus, epithet
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
mixed %>%
  inner_join(invasives)
#> Joining, by = c("genus", "epithet")
#> # A tibble: 4 × 2
#> genus epithet
#> <chr> <chr>
#> 1 Vincetoxicum nigrum
#> 2 Rosa multiflora
#> 3 Vincetoxicum nigrum
#> 4 Vincetoxicum nigrum
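One caveat with inner_join: if "invasives" ever listed the same species twice, each matching row of "mixed" would be returned once per duplicate. A filtering join avoids that, because it only keeps or drops rows of the first table. The following is a minimal sketch of that variant using dplyr's semi_join, with the two tables rebuilt by hand from the CSV data in the question and the join columns spelled out explicitly:

```r
library(dplyr)

# Tables rebuilt from the CSV data in the question
mixed <- tibble(
  genus   = c("Vincetoxicum", "Rosa", "Quercus", "Acer",
              "Rosa", "Vincetoxicum", "Vincetoxicum"),
  epithet = c("nigrum", "multiflora", "rubra", "saccharum",
              "pendula", "nigrum", "nigrum")
)
invasives <- tibble(
  genus   = c("Larix", "Picea", "Rosa", "Vincetoxicum"),
  epithet = c("pendula", "abies", "multiflora", "nigrum")
)

# Keep the rows of mixed whose (genus, epithet) pair occurs in invasives;
# rows of mixed are never duplicated, and no columns from invasives are added
columns_matched <- semi_join(mixed, invasives, by = c("genus", "epithet"))
columns_matched
```

Naming the columns in `by` also silences the "Joining, by = ..." message and makes sure both columns are used for the match.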
If you really want to have that index, you can add a helper column of row numbers to your mixed tibble:
index <- mixed %>%
  mutate(index = seq_along(genus)) %>%
  inner_join(invasives) %>%
  pull(index)
#> Joining, by = c("genus", "epithet")
mixed[index,]
#> # A tibble: 4 × 2
#> genus epithet
#> <chr> <chr>
#> 1 Vincetoxicum nigrum
#> 2 Rosa multiflora
#> 3 Vincetoxicum nigrum
#> 4 Vincetoxicum nigrum
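The same index can also be built without a join. This is a sketch in base R rather than part of the approach above: it pastes the two columns of each table into a single key and tests membership with %in%. It assumes the separator (a space here) never occurs inside a genus or epithet, since otherwise two different pairs could produce the same key. The data frames are rebuilt by hand from the CSV data in the question.

```r
# Tables rebuilt from the CSV data in the question
mixed <- data.frame(
  genus   = c("Vincetoxicum", "Rosa", "Quercus", "Acer",
              "Rosa", "Vincetoxicum", "Vincetoxicum"),
  epithet = c("nigrum", "multiflora", "rubra", "saccharum",
              "pendula", "nigrum", "nigrum")
)
invasives <- data.frame(
  genus   = c("Larix", "Picea", "Rosa", "Vincetoxicum"),
  epithet = c("pendula", "abies", "multiflora", "nigrum")
)

# One combined key per row (assumption: names contain no spaces)
key_mixed     <- paste(mixed$genus, mixed$epithet)
key_invasives <- paste(invasives$genus, invasives$epithet)

# Row numbers of mixed whose key also appears in invasives
index <- which(key_mixed %in% key_invasives)

columns_matched <- mixed[index, ]
```

With this data, index holds the row numbers 1, 2, 6 and 7, so "Rosa pendula" (row 5) is excluded because "Rosa pendula" is not a key in invasives even though "pendula" alone is.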