
Generate a Preference Matrix in R?

I'm using R to analyze an undirected network of individuals, with ethnicity as a vertex attribute. I want to create a table of tie counts, or "preference matrix": a square matrix with the ethnicity values arrayed along both dimensions, where each cell gives the number of ties of that type. (From this you can calculate the probability of one group sending ties to another, but I just want to use it as an argument to igraph's preference.game function.) Here's what I tried:

# I create a variable for ethnicity by assigning the names of my vertices to their corresponding ethnicities

   eth <- atts$Ethnicity[match(V(mahmudNet)$name,atts$Actor)] 

# I create an adjacency matrix from my network data

   mat <- as.matrix(get.adjacency(mahmudNet))

# I create the dimensions for my preference matrix from the Ethnicity values

   eth.value <- unique(sort(eth))

# I create an empty matrix using these dimensions

eth.mat <- array(NA,dim=c(length(eth.value),length(eth.value)))

# I loop over every pair of ethnicity values to populate the empty cells of the matrix

for (i in eth.value){
  for (j in eth.value){
    eth.mat[i,j] <- sum(mat[eth==i,eth==j])
  }
}

I think my problem is at the end: I need an expression that tells R how to populate the cells. The one I wrote doesn't seem to work, but ideally I could write something like

a <- sum(mat[eth=="White", eth=="Black"])

And then "a" would return the sum of all the cells in the adjacency matrix that correspond to a White-Black relationship.

Here's a sample of my data:

# data frame with Ethnicity attributes:

                     Actor Ethnicity
1    Sultan Mahmud of Siak         2
2            Daeng Kemboja         1
3  Raja Kecik of Trengganu         1
4                Raja Alam         2
5                Tun Dalam         2
6                Raja Haji         1
7           The Suliwatang         1
8          Punggawa Miskin         1
9          Tengku Selangor         1
10        Tengku Raja Said         1
11         Datuk Bendahara         2
12                     VOC         3
13        King of Selangor         1
14        Dutch at Batavia         3
15            Punggawa Tua         2
16    Raja Tua Encik Andak         1
17      Raja Indera Bungsu         2
18         Sultan of Jambi         2
19            David Boelen         3
20        Datuk Temenggong         2
21      Punggawa Opu Nasti         1

# adjacency matrix with relations

                   Daeng Kemboja Punggawa Opu Nasti Raja Haji Daeng Cellak Daeng Kecik
Daeng Kemboja                  0                  1         1            1           1
Punggawa Opu Nasti             1                  0         1            0           0
Raja Haji                      1                  1         0            0           0
Daeng Cellak                   1                  0         0            0           0
Daeng Kecik                    1                  0         0            0           0

This is a simple job for table(), once you have your data in the right shape.

First a sample dataset:

# fake ethnicity data by actor
actor_eth <- data.frame(actor = letters[1:10],
                        eth = sample(1:3, 10, replace = TRUE))

# fake adjacency matrix
adj_mat <- matrix(rbinom(100, 1, .5), ncol = 10)
dimnames(adj_mat) <- list(letters[1:10], letters[1:10])
# blank out the lower triangle & diagonal, so each undirected tie
# appears only once (no asymmetric pairs) and there are no self-ties
adj_mat[lower.tri(adj_mat)] <- NA
diag(adj_mat) <- NA

Here's our fake adjacency matrix:

   a  b  c  d  e  f  g  h  i  j
a NA  1  1  1  0  0  1  1  0  1
b NA NA  0  1  0  1  0  0  1  0
c NA NA NA  1  1  0  0  1  0  0
d NA NA NA NA  1  0  0  1  1  0
e NA NA NA NA NA  0  0  1  0  1
f NA NA NA NA NA NA  1  1  0  1
g NA NA NA NA NA NA NA  1  1  0
h NA NA NA NA NA NA NA NA  0  0
i NA NA NA NA NA NA NA NA NA  1
j NA NA NA NA NA NA NA NA NA NA

Here's our fake eth table:

   actor eth
1      a   3
2      b   3
3      c   3
4      d   2
5      e   1
6      f   3
7      g   3
8      h   3
9      i   1
10     j   2

So what you want to do is: 1) put the adjacency matrix in long format, so you have one row per tie, with a source actor and a target actor; 2) replace each actor name with that actor's ethnicity, so each tie has a source and a target ethnicity; 3) use table() to make a cross tab of the two.

# use `melt` to put this in long form, omitting rows showing "non connections"
library(reshape2)
actor_ties <- subset(melt(adj_mat), value==1)

# now replace the actor names with their ethnicities to create a data.frame
# of ties by ethnicity
eth_ties <- 
  data.frame(source_eth = with(actor_eth, eth[match(actor_ties$Var1, actor)]),
             target_eth = with(actor_eth, eth[match(actor_ties$Var2, actor)]))

# now here's your cross tab
table(eth_ties)

Result:

          target_eth
source_eth 1 2 3
         1 0 2 1
         2 2 0 1
         3 3 5 9
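
As an alternative sketch (not part of the original answer), the same counts can be obtained in base R without reshaping, which is close to what the sum(mat[eth==i, eth==j]) expression in the question is after. The helper name tie_counts and the tiny four-actor network below are made up for illustration. The idea is to build a 0/1 membership matrix of actors by ethnicity and multiply it around the adjacency matrix.

```r
# Hypothetical helper (not from the original answer): count ties between
# groups with a membership (indicator) matrix instead of reshaping.
tie_counts <- function(adj, groups) {
  groups <- factor(groups)
  # n x k membership matrix: ind[v, g] is 1 when vertex v belongs to group g
  ind <- outer(as.character(groups), levels(groups), "==") * 1
  colnames(ind) <- levels(groups)
  adj[is.na(adj)] <- 0      # treat blanked-out (NA) cells as "no tie"
  t(ind) %*% adj %*% ind    # k x k matrix: cell [g, h] sums adj over group g rows, group h columns
}

# Made-up undirected example with ties a-b, a-c, b-c, c-d
adj <- matrix(0, 4, 4, dimnames = list(letters[1:4], letters[1:4]))
adj["a", "b"] <- adj["a", "c"] <- adj["b", "c"] <- adj["c", "d"] <- 1
adj <- adj + t(adj)         # full symmetric adjacency matrix
eth <- c(1, 1, 2, 2)        # a and b are group 1; c and d are group 2

counts <- tie_counts(adj, eth)
counts
#   1 2
# 1 2 2
# 2 2 2
```

Note that with a full symmetric matrix like this one, each within-group tie is counted twice on the diagonal (the single a-b tie gives 2 in cell [1, 1]). Passing an upper-triangle-only matrix, such as adj_mat above, counts each tie once but gives an asymmetric table, as in the result shown.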
