
How to group elements of a list in python based on their distance from other elements?

I have a list of numbers, and I want to group the numbers that lie within a distance of 4 of each other. For example, given the list [1,34,45,34,66,35,14,3,5,12,4,62,31,4,13,12], I want to group the elements as "1 3 5 4 4 ;34 34 35 31 ;45 ;66 62 ;14 12 13 12 ;". To make it clear:

Input >> [1,34,45,34,66,35,14,3,5,12,4,62,31,4,13,12]

Output >> 1 3 5 4 4 ;34 34 35 31 ;45 ;66 62 ;14 12 13 12 ;

For this, I have written the following code:

arr = [1,34,45,34,66,35,14,3,5,12,4,62,31,4,13,12]
bucket = ''
while arr != []:
    element = arr.pop(0)
    bucket += (str(element) + ' ')
    for term in arr:
        if abs(element-term)<= 4:
            bucket += (str(term) + ' ')
            arr.remove(term)
            print(bucket)
            print(arr)
    else:
        bucket += ';'
print(arr)
print(bucket)

I expected the final output to be as follows:

1 3 5 4 4 ;34 34 35 31 ;45 ;66 62 ;14 12 13 12 ;

But what I got in the final output was:

1 3 4 4 ;34 34 35 31 ;45 ;66 62 ;14 12 12 ;5 ;13 ;

Here the element '5' should have been in the first bucket, but in the output it ends up in a bucket of its own. Similarly, '13' is out of place.


Any help in identifying the problem in the code will be greatly appreciated.

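The problem is that you don't account for how removing an element shifts every element after it. For example, when you remove 3, which sits at the 7th position (index 6), the list becomes arr = [34,45,34,66,35,14,5,12,4,62,31,4,13,12]. The 5 has now moved into the 7th position, but the loop still advances to the next position, so 5 is skipped. One fix is to step the position back by one after each removal (term = term - 1), which requires looping over indices with a while loop rather than over the elements directly.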
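A minimal sketch of that index-based fix, assuming the question's approach of comparing every remaining number against the first element of the current bucket. The function name group_by_first and the use of lists instead of a string for the buckets are illustrative choices, not part of the original code.

```python
def group_by_first(numbers, max_difference=4):
    """Build each bucket around its first element, as the question's code does."""
    remaining = list(numbers)  # work on a copy so the caller's list is untouched
    buckets = []
    while remaining:
        element = remaining.pop(0)
        bucket = [element]
        i = 0
        while i < len(remaining):
            if abs(element - remaining[i]) <= max_difference:
                # pop(i) shifts the later items left, so i must NOT advance here
                bucket.append(remaining.pop(i))
            else:
                i += 1
        buckets.append(bucket)
    return buckets


print(group_by_first([1, 34, 45, 34, 66, 35, 14, 3, 5, 12, 4, 62, 31, 4, 13, 12]))
# [[1, 3, 5, 4, 4], [34, 34, 35, 31], [45], [66, 62], [14, 12, 13, 12]]
```

Not advancing i after a removal has the same effect as decrementing it and then incrementing it again.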

By the time you come to the elements 5 and 13 in your list, the other elements that are in range of 4 have all been removed already and put into buckets, that's why you get new buckets for the two remaining.

Maybe it would be better to use lists for the buckets: for each new element, check whether it is within 4 of every element already in a bucket, and add it to that bucket if so. That way nothing is removed from the original list, so no element goes missing before it has been checked.
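One way to read this suggestion, sketched under the assumption that a number joins the first bucket in which it is within the allowed distance of every existing member. The name group_strict is hypothetical.

```python
def group_strict(numbers, max_difference=4):
    """Put each number in the first bucket where it is close to ALL members."""
    groups = []
    for number in numbers:
        for group in groups:
            if all(abs(member - number) <= max_difference for member in group):
                group.append(number)
                break
        else:
            # The inner loop finished without a break: no bucket accepted it
            groups.append([number])
    return groups


print(group_strict([1, 34, 45, 34, 66, 35, 14, 3, 5, 12, 4, 62, 31, 4, 13, 12]))
# [[1, 3, 5, 4, 4], [34, 34, 35, 31], [45], [66, 62], [14, 12, 13, 12]]
```

Because the input list is only read and never modified, the loop cannot skip elements. Note that this rule is stricter than matching any single member: group_strict([1, 5, 9]) returns [[1, 5], [9]], since 9 is within 4 of 5 but not of 1.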

You're removing elements from the list while you're iterating over it, which leads to some odd skipping behavior. You may notice that each time you call arr.remove(term), the next element is skipped. The 5 got skipped because there was a 3 right before it. Similarly, at the point when 13 should have been assigned to the 5th group, it got skipped because it was preceded by a 12 (after the 4,62,31,4 between them had already been removed).
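The skipping can be reproduced in isolation. The snippet below is a small illustration with made-up variable names (values, visited), not code from the question:

```python
values = [3, 5, 12]
visited = []
for value in values:
    visited.append(value)
    if value == 3:
        # Removing 3 shifts 5 into the slot the loop has just finished with
        values.remove(value)

print(visited)  # [3, 12]  (5 was never visited)
print(values)   # [5, 12]
```

The for loop tracks a position rather than an element, so after the removal it moves on to the second slot, which now holds 12.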

There is some explanation in "Strange result when removing item from a list while iterating over it" and in "How to remove items from a list while iterating?".

I believe this should achieve your desired result (although you would still need to format the output):

def group_numbers(numbers, max_difference=4):
    groups = []
    for number in numbers:
        found_group = False
        for group in groups:
            # a number joins a group if it is close enough to any member
            for member in group:
                if abs(member - number) <= max_difference:
                    group.append(number)
                    found_group = True
                    break

            # remove this if-block if a number should be added to multiple groups
            if found_group:
                break
        if not found_group:
            # no existing group accepted the number, so start a new one
            groups.append([number])
    return groups


print(group_numbers([1,34,45,34,66,35,14,3,5,12,4,62,31,4,13,12]))

output: [[1, 3, 5, 4, 4], [34, 34, 35, 31], [45], [66, 62], [14, 12, 13, 12]]
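To get from this list of lists to the string format requested in the question, a small helper along these lines should work. The name format_groups is hypothetical, and the trailing " ;" after every group is inferred from the expected output shown in the question.

```python
def format_groups(groups):
    """Join each group's numbers with spaces and end every group with ' ;'."""
    return ''.join(' '.join(str(n) for n in group) + ' ;' for group in groups)


groups = [[1, 3, 5, 4, 4], [34, 34, 35, 31], [45], [66, 62], [14, 12, 13, 12]]
print(format_groups(groups))
# 1 3 5 4 4 ;34 34 35 31 ;45 ;66 62 ;14 12 13 12 ;
```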
