Merge two dataframes in R column-wise and sort columns by one value
I have two data frames in R, similar to the following examples:
Dataframe 1
|word |a1 |a2 |a3 |...|
|apple |0.5|0.3|0.2|...|
|pear |0.2|0.2|0.6|...|
|banana|0.6|0.1|0.3|...|
|cherry|0.4|0.5|0.1|...|
Dataframe 2
|a1 |a2 | a3 |...|
|banana |cherry |pear |...|
|apple |apple |banana |...|
|cherry |pear |apple |...|
|pear |banana |cherry |...|
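The names in Dataframe 2 are sorted by their value in Dataframe 1. These are the top terms I got from my model with the GetTopTerms function from the textmineR package. However, I don't know how to combine the phi values I have with the word each value belongs to. In other words, the output I want is a combination of the two data frames above, where the phi values are listed from highest to lowest within each column, like this: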
|a1_term |a1_phi | a2_term |a2_phi | a3_term |a3_phi |...|
|banana |0.6 |cherry |0.5 |pear |0.6 |...|
|apple |0.5 |apple |0.3 |banana |0.3 |...|
|cherry |0.4 |pear |0.2 |apple |0.2 |...|
|pear |0.2 |banana |0.1 |cherry |0.1 |...|
Is there a simple function that merges these two tables as shown above and, while merging, sorts the phi values in each column from highest to lowest? Thanks!
Here is a solution using dplyr and reshape2. If you sort by phi, the second data frame is not needed. Here, df is the first data frame.
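library(dplyr)
library(reshape2)

# Reshape to long form, split by original column, sort each piece by
# descending phi, bind the pieces side by side, then drop the helper columns.
do.call("cbind", melt(df, id.vars = "word") %>%
  split(.$variable) %>%
  lapply(function(x) x %>% arrange(desc(value)))) %>%
  select(!ends_with("variable"))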
#> a1.word a1.value a2.word a2.value a3.word a3.value
#> 1 banana 0.6 cherry 0.5 pear 0.6
#> 2 apple 0.5 apple 0.3 banana 0.3
#> 3 cherry 0.4 pear 0.2 apple 0.2
#> 4 pear 0.2 banana 0.1 cherry 0.1
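The same idea can be sketched in base R without any packages, which also makes it easy to produce the a1_term / a1_phi column names asked for in the question. This is only an illustrative sketch: the names topics, pieces and result are made up here, and it assumes every column other than word holds phi values.

```r
# Same example data as Dataframe 1 in the question.
df <- data.frame(
  word = c("apple", "pear", "banana", "cherry"),
  a1 = c(0.5, 0.2, 0.6, 0.4),
  a2 = c(0.3, 0.2, 0.1, 0.5),
  a3 = c(0.2, 0.6, 0.3, 0.1),
  stringsAsFactors = FALSE
)

# Assumption: every column except "word" holds phi values.
topics <- setdiff(names(df), "word")

# For each phi column, order the rows by descending phi and build a
# two-column piece named <col>_term and <col>_phi.
pieces <- lapply(topics, function(col) {
  ord <- order(df[[col]], decreasing = TRUE)
  piece <- data.frame(df$word[ord], df[[col]][ord], stringsAsFactors = FALSE)
  names(piece) <- paste0(col, c("_term", "_phi"))
  piece
})

# Bind the per-column pieces side by side.
result <- do.call(cbind, pieces)
print(result)
```

Each column is sorted independently, so a given row of result does not correspond to a single word across columns; it only ranks the terms within each column.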
Data
df <- structure(list(word = c("apple", "pear", "banana", "cherry"),
a1 = c(0.5, 0.2, 0.6, 0.4), a2 = c(0.3, 0.2, 0.1, 0.5), a3 = c(0.2,
0.6, 0.3, 0.1)), class = "data.frame", row.names = c(NA, -4L))
df
#> word a1 a2 a3
#> 1 apple 0.5 0.3 0.2
#> 2 pear 0.2 0.2 0.6
#> 3 banana 0.6 0.1 0.3
#> 4 cherry 0.4 0.5 0.1
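If you would rather keep the second data frame (for example because it holds only the top few terms per column), the phi values can be looked up by term with match(). The following is a rough sketch under the assumption that both data frames use the same column names (a1, a2, a3) and that every term in the second data frame appears in the word column of the first; df1, df2 and combined are names chosen here for illustration.

```r
# Dataframe 1 from the question: phi values per word.
df1 <- data.frame(
  word = c("apple", "pear", "banana", "cherry"),
  a1 = c(0.5, 0.2, 0.6, 0.4),
  a2 = c(0.3, 0.2, 0.1, 0.5),
  a3 = c(0.2, 0.6, 0.3, 0.1),
  stringsAsFactors = FALSE
)

# Dataframe 2 from the question: terms already ranked within each column.
df2 <- data.frame(
  a1 = c("banana", "apple", "cherry", "pear"),
  a2 = c("cherry", "apple", "pear", "banana"),
  a3 = c("pear", "banana", "apple", "cherry"),
  stringsAsFactors = FALSE
)

# For each column, find the row of df1 that holds each term and pull
# the phi value from the column with the same name.
pieces <- lapply(names(df2), function(col) {
  idx <- match(df2[[col]], df1$word)
  piece <- data.frame(df2[[col]], df1[[col]][idx], stringsAsFactors = FALSE)
  names(piece) <- paste0(col, c("_term", "_phi"))
  piece
})

combined <- do.call(cbind, pieces)
print(combined)
```

Because the terms in df2 are already ranked, no sorting is needed here; a term that is missing from df1$word would get NA as its phi value.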