Filter rows in a dataset to get distinct words in R

Question

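Goal: filter the rows of a dataset so that only distinct words are kept. At the moment I use inner_join to keep the rows that appear in both datasets, and this leaves me with duplicated rows in the result.

Attempt 1: I tried using distinct to keep only the unique rows, but that did not work. I may be using it incorrectly.

Here is my code so far; the output is attached as a PNG: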


# join warriner emotion lemmas by `word` column in collocations data frame to see how many word matches there are

warriner2 <- dplyr::inner_join(warriner, coll, by = "word") # join data; retain only rows in both sets (works both ways)
warriner2 <- distinct(warriner2)
warriner2

coll2 <- dplyr::semi_join(coll, warriner, by = "word") # keep the rows of coll that have a match in warriner (adds no columns or rows)

# There are 8166 lemma matches (including double-ups)
# There are XXX unique lemma matches

Answer 1

You can try:

library(dplyr)

warriner2 <- inner_join(warriner, coll, by = "word") %>%
                distinct(word, .keep_all = TRUE)
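A minimal sketch of why the bare distinct() call in the question probably did not help. The two tibbles are hypothetical stand-ins for warriner and coll; every column name other than word is an assumption made for illustration.

```r
library(dplyr)

# Hypothetical stand-ins for the real data; only `word` comes from the question
warriner <- tibble(word = c("happy", "sad", "calm"),
                   valence = c(8.5, 2.1, 6.9))
coll <- tibble(word = c("happy", "happy", "sad", "table"),
               collocate = c("birthday", "ending", "song", "leg"))

joined <- inner_join(warriner, coll, by = "word")

# distinct() with no columns compares whole rows, so rows that share a word
# but differ in `collocate` are all kept
all_cols <- distinct(joined)

# distinct(word, .keep_all = TRUE) keeps only the first row for each word
by_word <- distinct(joined, word, .keep_all = TRUE)

nrow(all_cols)  # 3
nrow(by_word)   # 2
```

Note that .keep_all = TRUE keeps whichever row comes first for each word, so the values in the other columns depend on row order.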

Answer 2

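To clarify Ronak's answer further, here is an example with some mock data. Note that, if you need to, you can call distinct() at the end of the pipe to keep only the rows that are distinct across all columns. Your error most likely occurred because you performed two separate operations and assigned the result to the same name (warriner2) both times.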

library(dplyr)

# Here's a couple sample tibbles
name <- c("cat", "dog", "parakeet")

df1 <- tibble(
        x = sample(5, 99, replace = TRUE),
        y = sample(5, 99, replace = TRUE),
        name = rep(name, times = 33))
df2 <- tibble(
        x = sample(5, 99, replace = TRUE),
        y = sample(5, 99, replace = TRUE),
        name = rep(name, times = 33))

# It's much less confusing if you do this in one pipe
p <- df1 %>%
        inner_join(df2, by = "name") %>%
        distinct()
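The comments in the question also ask how many unique lemma matches there are. One possible way to count them, sketched here with hypothetical toy data rather than the real warriner and coll tables, is to apply n_distinct() to the joined word column. semi_join() is shown as an alternative, because it filters rows and so cannot duplicate them.

```r
library(dplyr)

# Hypothetical toy data; the real tables have more columns
warriner <- tibble(word = c("happy", "sad", "calm"))
coll <- tibble(word = c("happy", "happy", "sad", "table"))

joined <- inner_join(warriner, coll, by = "word")

n_matches_with_dupes <- nrow(joined)         # every matching pair of rows: 3
n_unique_matches <- n_distinct(joined$word)  # each matched word once: 2

# semi_join() keeps the rows of warriner whose word occurs in coll,
# without adding rows or columns
matched <- semi_join(warriner, coll, by = "word")
nrow(matched)  # 2
```

If only the count or the list of matched words is needed, the semi_join() route avoids building the larger joined table at all.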

Answer 1
0 2021-05-05 01:39:34

Answer 2
0 2021-05-05 02:03:35
