Let's say I have an array like this
grid:
[[1, 1, 0, 0, 0],
[1, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 1]]
I want to isolate the groups of "items", in this case the 1's. The rule is that the 0's separate the groups, like intersections, so this example has 3 groups of 1's.
If you know how to do this with Python: the first thing I'd be asked is what I've tried, as proof that I'm not handing my homework to the community. The idea I had was to iterate down and left, but that would have a high likelihood of missing some numbers, since the search would form a cross emanating from the top left. This group is here to learn, so for me and others who have an interest in this data-science-like problem, please be considerate.
If you do not need to know which sets are duplicates, you can use Python's set built-in to determine the unique items in a list. This can be a bit tricky since set doesn't work on a list of lists. However, you can convert this to a list of tuples, put those back in a list, and then get the len of that list to find out how many unique value sets there are.
grid = [[1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 1]]
unique = [list(x) for x in set(tuple(x) for x in grid)]
unique_count = len(unique)  # 3 for this grid
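Keep in mind that the set-based approach counts distinct rows rather than connected groups of 1's. The short sketch below illustrates the difference; the helper name count_unique_rows and the second grid are only illustrative assumptions, not part of the answer above.

```python
def count_unique_rows(grid):
    # Rows are converted to tuples because lists are not hashable
    # and therefore cannot be stored in a set.
    return len({tuple(row) for row in grid})


example = [[1, 1, 0, 0, 0],
           [1, 1, 0, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 0, 1, 1]]
print(count_unique_rows(example))  # 3 distinct rows

# Here all the 1's touch each other, yet the two rows differ.
u_shape = [[1, 0, 1],
           [1, 1, 1]]
print(count_unique_rows(u_shape))  # 2 distinct rows
```

For the grid in the question the two counts happen to agree, but for the second grid the row count is 2 even though the 1's form a single connected group.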
Here is a relatively straightforward depth-first-search-based implementation of connected component labeling.
def get_components(grid, indicator=1):
    def label(g, row, col, group):
        if row >= 0 and col >= 0 and row < len(g) and col < len(g[row]) and g[row][col] == -1:
            # only label if currently unlabeled
            g[row][col] = group
            # attempt to label the four direct neighbors with the same label
            label(g, row + 1, col, group)
            label(g, row, col + 1, group)
            label(g, row - 1, col, group)
            label(g, row, col - 1, group)
            return True
        else:
            return False

    # initialize label grid as -1 for entries that need to be labeled
    label_grid = [[-1 if gc == indicator else 0 for gc in gr] for gr in grid]
    group_count = 0
    for row, grid_row in enumerate(grid):
        for col in range(len(grid_row)):
            if label(label_grid, row, col, group_count + 1):
                group_count += 1
    return label_grid, group_count
The results of label_grid, group_count = get_components(grid) for your example inputs are
label_grid = [[1, 1, 0, 0, 0],
[1, 1, 0, 0, 0],
[0, 0, 2, 0, 0],
[0, 0, 0, 3, 3]]
group_count = 3
And for cases like the following
grid = [[1, 0, 1],
        [1, 1, 1]]
we get group_count = 1.
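One practical caveat with the recursive version is that a very large connected group can exceed Python's default recursion limit. A minimal sketch of the same labeling with an explicit queue (a breadth-first flood fill rather than the recursive depth-first search above) might look like this. The name get_components_iterative is hypothetical, and the sketch assumes the same 4-neighbor rule and the same -1 / 0 marking scheme.

```python
from collections import deque


def get_components_iterative(grid, indicator=1):
    # -1 marks cells that still need a label; 0 marks background cells.
    label_grid = [[-1 if cell == indicator else 0 for cell in row] for row in grid]
    group_count = 0
    for row, label_row in enumerate(label_grid):
        for col in range(len(label_row)):
            if label_grid[row][col] != -1:
                continue  # background, or already labeled by an earlier fill
            # Found an unlabeled cell: start a new group and flood outwards.
            group_count += 1
            label_grid[row][col] = group_count
            queue = deque([(row, col)])
            while queue:
                r, c = queue.popleft()
                # Visit the four direct neighbors (no diagonals).
                for nr, nc in ((r + 1, c), (r, c + 1), (r - 1, c), (r, c - 1)):
                    if (0 <= nr < len(label_grid)
                            and 0 <= nc < len(label_grid[nr])
                            and label_grid[nr][nc] == -1):
                        label_grid[nr][nc] = group_count
                        queue.append((nr, nc))
    return label_grid, group_count


example = [[1, 1, 0, 0, 0],
           [1, 1, 0, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 0, 1, 1]]
labels, count = get_components_iterative(example)
print(count)  # 3
```

Because cells are labeled at the moment they are queued, each cell enters the queue at most once, so the work stays proportional to the number of cells in the grid.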