
Frequency of appearance in list of lists

I have a list of lists, where each inner list is sorted. I want to know how many times a given element appears in each position. For instance, "pnc" appears in the 2nd position twice and in the 3rd position once. My data is structured as below:

dput(degree.l)
list(c(schwab = 0, pnc = 0.0344827586206897, jpm = 0.0862068965517241, 
amex = 0.0862068965517241, gs = 0.103448275862069, ms = 0.103448275862069, 
bofa = 0.103448275862069, citi = 0.103448275862069, wf = 0.120689655172414, 
spgl = 0.120689655172414, brk = 0.137931034482759), c(schwab = 0.0166666666666667, 
pnc = 0.05, ms = 0.0666666666666667, spgl = 0.0833333333333333, 
jpm = 0.1, bofa = 0.1, wf = 0.1, amex = 0.1, gs = 0.116666666666667, 
brk = 0.116666666666667, citi = 0.15), c(schwab = 0.0428571428571429, 
gs = 0.0714285714285714, pnc = 0.0714285714285714, citi = 0.0857142857142857, 
amex = 0.0857142857142857, spgl = 0.0857142857142857, jpm = 0.1, 
brk = 0.1, ms = 0.114285714285714, wf = 0.114285714285714, bofa = 0.128571428571429
))

We can loop over the list with sapply, get the position index of the name that matches 'pnc' in each element, and call table on the resulting vector. This returns the frequency of each position of 'pnc'.

get_positions <- function(lst1, to_match) {
   table(sapply(lst1, function(x) match(to_match, names(x))))
}

-testing

get_positions(degree.l, 'pnc')
2 3 
2 1 
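To make the two steps inside get_positions easier to follow, here is a minimal sketch that runs them separately. The list `toy` is a hypothetical, shortened stand-in for degree.l (the values are arbitrary; only the names and their order matter). It also shows what happens when a name is absent from some vectors, a case the original data does not contain.

```r
# Hypothetical stand-in for degree.l: three short named vectors.
toy <- list(
  c(schwab = 0.00, pnc = 0.03, jpm = 0.09),
  c(schwab = 0.02, pnc = 0.05, ms  = 0.07),
  c(schwab = 0.04, gs  = 0.07, pnc = 0.07)
)

# Step 1: position of "pnc" among the names of each vector.
pos <- sapply(toy, function(x) match("pnc", names(x)))
pos
# 2 2 3

# Step 2: count how often each position occurs.
table(pos)

# "jpm" is missing from the 2nd and 3rd vectors, so match() gives NA there.
pos_jpm <- sapply(toy, function(x) match("jpm", names(x)))
table(pos_jpm)                   # NA is dropped by default
table(pos_jpm, useNA = "ifany")  # keep NA as its own category
```

With useNA = "ifany" the table also reports in how many vectors the name did not occur at all, which may or may not be wanted.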

To do this for all the unique names:

nm1 <- unique(names(unlist(degree.l)))
out <- lapply(nm1, get_positions, lst1 = degree.l)
names(out) <- nm1

-output

out
$schwab

1 
3 

$pnc

2 3 
2 1 

$jpm

3 5 7 
1 1 1 

$amex

4 5 8 
1 1 1 

$gs

2 5 9 
1 1 1 

$ms

3 6 9 
1 1 1 

$bofa

 6  7 11 
 1  1  1 

$citi

 4  8 11 
 1  1  1 

$wf

 7  9 10 
 1  1  1 

$spgl

 4  6 10 
 1  1  1 

$brk

 8 10 11 
 1  1  1 
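As a possible shortcut for the lapply plus names<- step, sapply with simplify = FALSE names the result after a character input automatically. The sketch below assumes the same get_positions function and uses a hypothetical three-vector list `toy` in place of degree.l.

```r
get_positions <- function(lst1, to_match) {
  table(sapply(lst1, function(x) match(to_match, names(x))))
}

# Hypothetical stand-in for degree.l; only names and order matter.
toy <- list(
  c(schwab = 0.00, pnc = 0.03, jpm = 0.09),
  c(schwab = 0.02, pnc = 0.05, ms  = 0.07),
  c(schwab = 0.04, gs  = 0.07, pnc = 0.07)
)

nm1 <- unique(names(unlist(toy)))

# simplify = FALSE keeps a list; USE.NAMES (TRUE by default) names each
# element after the corresponding entry of the character vector nm1.
out <- sapply(nm1, get_positions, lst1 = toy, simplify = FALSE)

names(out)
out$pnc
```

The result is the same kind of named list of tables as before, just built in one call.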

Another base R option using aggregate + stack + unstack + table

unstack(
    aggregate(
        . ~ ind,
        do.call(
            rbind,
            Map(function(x) cbind(stack(x)["ind"], pos = seq_along(x)), degree.l)
        ),
        table
    ),
    pos ~ ind
)

gives

$schwab

1
3

$pnc

2 3
2 1

$jpm

3 5 7
1 1 1

$amex

4 5 8
1 1 1

$gs

2 5 9 
1 1 1

$ms

3 6 9
1 1 1

$bofa

 6  7 11
 1  1  1

$citi

 4  8 11
 1  1  1

$wf

 7  9 10
 1  1  1

$spgl

 4  6 10
 1  1  1

$brk

 8 10 11
 1  1  1
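The pipeline above first builds a long table with one row per (name, position) pair and then tabulates positions per name. The sketch below makes that intermediate table explicit. Note that it swaps aggregate and unstack for split plus lapply, so it is a variation on the idea rather than the same code, and `toy` is a hypothetical stand-in for degree.l.

```r
# Hypothetical stand-in for degree.l; only names and order matter.
toy <- list(
  c(schwab = 0.00, pnc = 0.03, jpm = 0.09),
  c(schwab = 0.02, pnc = 0.05, ms  = 0.07),
  c(schwab = 0.04, gs  = 0.07, pnc = 0.07)
)

# Long format: one row per element, with its name and its position.
long <- do.call(rbind, lapply(toy, function(x) {
  data.frame(ind = names(x), pos = seq_along(x))
}))
long

# Group the positions by name and tabulate each group.
counts <- lapply(split(long$pos, long$ind), table)
counts$pnc
```

Because split converts the names to a factor, the elements of counts come out in alphabetical order rather than in order of first appearance.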

Here's a way to generate a contingency table of the desired info:

table(res <- sapply(degree.l, names), pos = row(res))

        pos
         1 2 3 4 5 6 7 8 9 10 11
  amex   0 0 0 1 1 0 0 1 0  0  0
  bofa   0 0 0 0 0 1 1 0 0  0  1
  brk    0 0 0 0 0 0 0 1 0  1  1
  citi   0 0 0 1 0 0 0 1 0  0  1
  gs     0 1 0 0 1 0 0 0 1  0  0
  jpm    0 0 1 0 1 0 1 0 0  0  0
  ms     0 0 1 0 0 1 0 0 1  0  0
  pnc    0 2 1 0 0 0 0 0 0  0  0
  schwab 3 0 0 0 0 0 0 0 0  0  0
  spgl   0 0 0 1 0 1 0 0 0  1  0
  wf     0 0 0 0 0 0 1 0 1  1  0
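Once the counts are in a contingency table, individual answers can be read off by indexing with the name and the position label. A small sketch, again on a hypothetical three-vector list `toy` rather than the full degree.l:

```r
# Hypothetical stand-in for degree.l; only names and order matter.
toy <- list(
  c(schwab = 0.00, pnc = 0.03, jpm = 0.09),
  c(schwab = 0.02, pnc = 0.05, ms  = 0.07),
  c(schwab = 0.04, gs  = 0.07, pnc = 0.07)
)

# sapply() returns a character matrix here because every vector has the
# same length; each column holds the names of one vector.
res <- sapply(toy, names)
tab <- table(name = res, pos = row(res))

tab["pnc", "2"]                  # times "pnc" was in 2nd position
# 2
tab["pnc", ]                     # full position profile of "pnc"
names(which.max(tab["pnc", ]))   # most frequent position of "pnc"
# "2"
```

This approach relies on all vectors having equal length. If they differ, sapply returns a list instead of a matrix and row(res) fails, so one of the per-name approaches would be the safer choice.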
