I need help structuring data for use with the R network package.
I have a list, author_list, in which each element is a character vector holding the authors of one document, e.g.:
document_authors1 = c("King, Stephen", "Martin, George", "Clancy, Tom")
document_authors2 = c("Clancy, Tom", "Patterson, James", "Stine, RL", "King, Stephen")
document_authors3 = c("Clancy, Tom", "Patterson, James", "Stine, RL", "King, Stephen")
author_list = list(document_authors1, document_authors2, document_authors3)
author_list
[[1]]
[1] "King, Stephen"  "Martin, George" "Clancy, Tom"

[[2]]
[1] "Clancy, Tom"      "Patterson, James" "Stine, RL"        "King, Stephen"

[[3]]
[1] "Clancy, Tom"      "Patterson, James" "Stine, RL"        "King, Stephen"
I need to create a data frame from author_list with three columns. The first two columns hold author names: each row has one author in col1 and another author in col2. The third column, called co-occurrence, gives the number of documents in which that pair of authors appears together. For example:
  col1            col2              co-occurrence
1 King, Stephen   Patterson, James  2
2 Martin, George  Clancy, Tom       1
Etc…
I have been looking for a package function that does this, but with no luck. I've also been trying to piece together a solution step by step, but it is eluding me. Hopefully it's easier than I think. Any advice or suggestions would be greatly appreciated.
I am not entirely sure this is what you are after, but I hope it helps.
library(dplyr)
# Only include elements in list with more than one author
author_list <- author_list[lengths(author_list)>1]
# Identify every combination of pairs of authors for each element in list
mat <- do.call(rbind, lapply(author_list, function(x) t(combn(x, 2))))
# Within each row sort alphabetically
mat <- t(apply(mat, 1, sort))
# Count up pairs of authors
as.data.frame(mat) %>%
group_by_all() %>%
summarise(count = n())
# A tibble: 8 x 3
# Groups: V1 [3]
  V1               V2               count
  <fct>            <fct>            <int>
1 Clancy, Tom      King, Stephen        3
2 Clancy, Tom      Martin, George       1
3 Clancy, Tom      Patterson, James     2
4 Clancy, Tom      Stine, RL            2
5 King, Stephen    Martin, George       1
6 King, Stephen    Patterson, James     2
7 King, Stephen    Stine, RL            2
8 Patterson, James Stine, RL            2
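If you would rather not depend on dplyr, the same pair counting can be sketched in base R. This is only a minimal sketch under a few assumptions of mine: the column names col1, col2 and co_occurrence are taken from the question (with an underscore so the name is syntactic), and the unique() call assumes an author listed twice in one document should count once. Sorting each author vector before combn() puts every pair in a canonical order, so the row-wise sort is not needed.

```r
author_list <- list(
  c("King, Stephen", "Martin, George", "Clancy, Tom"),
  c("Clancy, Tom", "Patterson, James", "Stine, RL", "King, Stephen"),
  c("Clancy, Tom", "Patterson, James", "Stine, RL", "King, Stephen")
)

# Drop repeated names within a document (assumption), then keep only
# documents that still have at least two authors
docs <- lapply(author_list, unique)
docs <- docs[lengths(docs) > 1]

# One row per unordered author pair per document; because each vector is
# sorted first, the first author of a pair always sorts before the second
pairs <- do.call(rbind, lapply(docs, function(a) t(combn(sort(a), 2))))

# Give every pair occurrence a weight of 1 and sum the weights per pair
edges <- aggregate(
  co_occurrence ~ col1 + col2,
  data = data.frame(col1 = pairs[, 1], col2 = pairs[, 2], co_occurrence = 1L),
  FUN = sum
)

# Order rows by first author, then second author
edges <- edges[order(edges$col1, edges$col2), ]
edges
```

The result has one row per author pair, with the count in the third column, which is the edge-list shape described in the question.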