Python - How to check the combination of numbers by frequency

Consider the following data, for example.

 h: [Num1, Num2, Num3, Num4, Num5, Num6]
 a: [1,       2,    3,    4,    5,    6]
 b: [1,       2,    7,    8,    9,   10]
 c: [1,       2,    3,    6,    8,   10]

Now, let's say I want to see combinations of two or more numbers, ordered by frequency.

Take the number 1, for example: it appears in all three rows (a, b, and c).

When 1 is "used", it is most often paired with 2 (3/3), followed by 3, 6, 8, and 10 (2/3 each). In other words, when 1 is "used", a row is likely to look something like this:

 [1, 2, x, y, z, t]
 [1, 2, 3, x, y, z]
 [1, 2, 6, x, y, z]
 .
 .
 .
 [1, 8, x, y, z, t]
 [1, 10, x, y, z, t]
 [1, 2, 3, 6, 8, 10]

Order does not matter. x, y, z, and t can be any numbers. Duplicates within a row are not present or allowed.

I have a data frame in this format and want to see which other integers appear in combination with a given number, for example 44.

For example:

 44 was paired with 11, 350 times out of 2000
 44 was paired with 27, 290 times out of 2000
 44 was paired with 35, 180 times out of 2000
 .
 .
 .
 44 was paired with 2, 5 times out of 2000

I already have the frequency with which every number occurs in each column, but I can't figure out how to continue from there.

Looking forward to ideas and questions. Thank you!

You could use Counter from the collections module together with combinations from the itertools module:

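from itertools import combinations
from collections import Counter

data = [[1, 2, 3], [1, 2, 5], [1, 3, 8], [2, 5, 8]]

# Sort each row so that a pair always appears as (smaller, larger),
# then count every 2-element combination across all rows.
pairings = Counter(
    pair for row in data
    for pair in combinations(sorted(row), 2)
)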

The resulting Counter object is dictionary-like:

Counter({
    (1, 2): 2, 
    (1, 3): 2, 
    (2, 5): 2, 
    (2, 3): 1, 
    (1, 5): 1, 
    (1, 8): 1, 
    (3, 8): 1, 
    (2, 8): 1, 
    (5, 8): 1
})
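The question asks for combinations of two or more numbers, so the same idea can probably be extended beyond pairs. The following is only a sketch: the helper name combo_counts and its min_size / max_size parameters are made up for illustration and are not part of the original answer. Note that the number of combinations grows quickly with the size, so a small max_size is assumed.

```python
from collections import Counter
from itertools import combinations

def combo_counts(rows, min_size=2, max_size=3):
    """Count every combination of min_size..max_size numbers per row.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for row in rows:
        # Sort so each combination has one canonical form, e.g. (1, 2, 3).
        ordered = sorted(set(row))
        for size in range(min_size, min(max_size, len(ordered)) + 1):
            counts.update(combinations(ordered, size))
    return counts

data = [[1, 2, 3], [1, 2, 5], [1, 3, 8], [2, 5, 8]]
counts = combo_counts(data)

# most_common() lists the combinations from most to least frequent.
for combo, n in counts.most_common(3):
    print(combo, n)
```

Counter.most_common() gives the "ordered by frequency" view directly; calling it without an argument returns every combination.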

You can get the count of a specific pair like this (put the smaller number first, because each row was sorted before pairing):

>>> pairings[1,2] 
2
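To get the kind of report the question describes ("44 was paired with 11, 350 times out of 2000"), one option is to count partners for a single number directly. This is a minimal sketch, not part of the original answer: partner_counts is a hypothetical helper, the rows are the sample data from the question, and "out of" is assumed to mean the number of rows that contain the target. If the data lives in a pandas DataFrame, df.values.tolist() should give the list of rows.

```python
from collections import Counter

def partner_counts(rows, target):
    """Count how often each other number shares a row with `target`.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    rows_with_target = 0
    for row in rows:
        members = set(row)  # rows are assumed to contain no duplicates
        if target in members:
            rows_with_target += 1
            counts.update(members - {target})
    return counts, rows_with_target

# Sample rows a, b, c from the question.
rows = [
    [1, 2, 3, 4, 5, 6],
    [1, 2, 7, 8, 9, 10],
    [1, 2, 3, 6, 8, 10],
]

counts, total = partner_counts(rows, 1)
for partner, n in counts.most_common():
    print(f"1 was paired with {partner}, {n} times out of {total}")
```

For the sample data this reports 2 as the most frequent partner of 1 (3 out of 3), with 3, 6, 8, and 10 next (2 out of 3 each), matching the counts given in the question.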
