
Fastest approach to finding the most common first and second value of tuples in an N-dimensional array of tuples in Python

I have M N-dimensional arrays of tuples, and I'd like to find the most frequent value among the first elements of the tuples and the most frequent value among the second elements. Here is demo data for a single such array:

data = [[(2, 0), (0, 3), (0, 2), (0, 3), (2, 4), (0, 3), (0, 3), (2, 7)],
        [(2, 0), (0, 1), (2, 0), (0, 1), (3, 4), (2, 7), (2, 0), (2, 7)],
        [(2, 2), (2, 3), (2, 2), (2, 3), (2, 2), (2, 3), (2, 3), (2, 2)],
        [(2, 1), (2, 1), (3, 2), (2, 1), (2, 1), (3, 3), (2, 1), (2, 1)]]

Here's my current implementation:

from collections import Counter


def find_most_common_values(data):
    # Flatten the n-dimensional array
    flattened = []
    for sublist in data:
        for item in sublist:
            flattened.append(item)

    # Separate the elements
    x = [item[0] for item in flattened]
    y = [item[1] for item in flattened]

    c = Counter(x)
    most_common_x = c.most_common(1)[0][0]
    c = Counter(y)
    most_common_y = c.most_common(1)[0][0]

    return most_common_x, most_common_y

# Demo function
def main():
    data = [[(2, 0), (0, 3), (0, 2), (0, 3), (2, 4), (0, 3), (0, 3), (2, 7)],
            [(2, 0), (0, 1), (2, 0), (0, 1), (3, 4), (2, 7), (2, 0), (2, 7)],
            [(2, 2), (2, 3), (2, 2), (2, 3), (2, 2), (2, 3), (2, 3), (2, 2)],
            [(2, 1), (2, 1), (3, 2), (2, 1), (2, 1), (3, 3), (2, 1), (2, 1)]]

    most_common_x, most_common_y = find_most_common_values(data)
    print("Most common X: " + str(most_common_x))
    print("Most common Y: " + str(most_common_y))


# Main entry point
if __name__ == "__main__":
    main()

Which correctly outputs the following:

Most common X: 2
Most common Y: 3

I'm going to run this inside a for loop over a lot of data, so I want the fastest approach. I'm new to Python and suspect there are better ways I'm not aware of. Does anyone know a faster, preferably more Pythonic, approach?

Here's a one-liner that does this using collections.Counter together with zip and itertools.chain in a list comprehension:

from collections import Counter
from itertools import chain

a, b = [Counter(x).most_common(1)[0][0] for x in zip(*chain(*data))]

Output:

>>> a
2
>>> b
3
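
To see what the one-liner is doing, it may help to unpack it into separate steps. The sketch below assumes the same 2-D demo data from the question; the intermediate names (flat, xs, ys) are only illustrative, and chain.from_iterable(data) is used as an equivalent spelling of chain(*data).

```python
from collections import Counter
from itertools import chain

data = [[(2, 0), (0, 3), (0, 2), (0, 3), (2, 4), (0, 3), (0, 3), (2, 7)],
        [(2, 0), (0, 1), (2, 0), (0, 1), (3, 4), (2, 7), (2, 0), (2, 7)],
        [(2, 2), (2, 3), (2, 2), (2, 3), (2, 2), (2, 3), (2, 3), (2, 2)],
        [(2, 1), (2, 1), (3, 2), (2, 1), (2, 1), (3, 3), (2, 1), (2, 1)]]

# Step 1: flatten the list of rows into a single stream of tuples.
flat = chain.from_iterable(data)  # equivalent to chain(*data)

# Step 2: transpose the stream, giving one tuple holding every first
# element and one tuple holding every second element.
xs, ys = zip(*flat)

# Step 3: count each column and keep the single most frequent value.
most_common_x = Counter(xs).most_common(1)[0][0]
most_common_y = Counter(ys).most_common(1)[0][0]

print(most_common_x, most_common_y)
```

This avoids building the intermediate flattened list by hand and lets zip split the two positions in one pass.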

For more details on these functions, see the Python documentation for collections.Counter, zip and itertools.chain.
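
Note that chain(*data) flattens only one level of nesting, so the one-liner as written fits the 2-D demo data. If the arrays are nested more deeply, one possible approach is a small recursive generator that yields every tuple regardless of depth. This is only a sketch: iter_tuples and most_common_per_position are hypothetical helper names, the deep sample data is made up for illustration, and the input is assumed to be non-empty and built from nested lists whose leaves are 2-tuples.

```python
from collections import Counter


def iter_tuples(nested):
    """Yield every tuple found at any depth of nested lists (hypothetical helper)."""
    for item in nested:
        if isinstance(item, tuple):
            yield item  # a leaf: one (x, y) pair
        else:
            yield from iter_tuples(item)  # descend one level further


def most_common_per_position(nested):
    """Return the most frequent first element and the most frequent second element."""
    # Assumes at least one tuple is present; zip(*...) on empty input would fail to unpack.
    xs, ys = zip(*iter_tuples(nested))
    return Counter(xs).most_common(1)[0][0], Counter(ys).most_common(1)[0][0]


# Illustrative 3-D sample (not from the question).
deep = [[[(2, 0), (0, 3)], [(2, 3), (2, 3)]],
        [[(0, 1), (2, 3)], [(3, 4), (2, 7)]]]

print(most_common_per_position(deep))
```

The recursion costs more than chain for plain 2-D input, so it is probably only worth using when the nesting depth actually varies.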
