
Using Python to find matching arrays and combining into one array

I would like to use Python to find arrays whose first element matches, such as [4012630, 0.07575758] and [4012630, 0.5671642], and then combine them into one array by adding the decimals. These two would become [4012630, 0.64292178].

The goal is to convert this array:

[[4012630, 0.07575758], 
[4012618, 0.014925373], 
[4012630, 0.5671642], 
[4012624, 0.029850746], 
[4012628, 0.41791046], 
[4012624, 0.07462686], 
[4012628, 0.04477612], 
[4012636, 0.2820513]]

Into this array:

[[4012630, 0.64292178],
[4012618, 0.014925373],
[4012624, 0.104477606],
[4012628, 0.46268658],
[4012636, 0.2820513]]

There are many ways to solve this problem. Here is my solution, which runs in O(n) time:

def unique_sum(input_list):
  index_mapping = {}  # key -> position of that key's row in final_list
  final_list = []
  for key, value in input_list:
    if key in index_mapping:
      # Key seen before: add to the existing row
      final_list[index_mapping[key]][1] += value
    else:
      # First occurrence: remember where the new row lives
      index_mapping[key] = len(final_list)
      final_list.append([key, value])
  return final_list

Then you can call this function like this:

data = [[4012630, 0.07575758], 
[4012618, 0.014925373], 
[4012630, 0.5671642], 
[4012624, 0.029850746], 
[4012628, 0.41791046], 
[4012624, 0.07462686], 
[4012628, 0.04477612], 
[4012636, 0.2820513]]


print(unique_sum(data))

I propose the following solution. First, find the unique labels. Then create the result list with each sum initialized to 0. Finally, iterate over every element of the input list and add its value to the corresponding bin.

lst = [[4012630, 0.07575758],
[4012618, 0.014925373],
[4012630, 0.5671642],
[4012624, 0.029850746],
[4012628, 0.41791046],
[4012624, 0.07462686],
[4012628, 0.04477612],
[4012636, 0.2820513]]

def sumlist(lst):
    unique = list(set([x[0] for x in lst]))
    result = [[x,0] for x in unique]
    for i, value in lst:
        ind = unique.index(i)
        result[ind][1] += value
    return result
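
One caveat with the approach above: set() does not preserve the order of the input, and list.index() rescans the label list for every row. Below is a minimal sketch of an order-preserving variant, assuming Python 3.7+ (where dicts keep insertion order). The names sumlist_ordered and position are illustrative and not part of the original answer.

```python
def sumlist_ordered(lst):
    # dict.fromkeys keeps labels in first-seen order, unlike set()
    unique = list(dict.fromkeys(label for label, _ in lst))
    # Map each label to its row position so lookups avoid list.index()
    position = {label: i for i, label in enumerate(unique)}
    result = [[label, 0] for label in unique]
    for label, value in lst:
        result[position[label]][1] += value
    return result

# Small hypothetical input whose sums are exact in binary floating point
print(sumlist_ordered([[1, 0.5], [2, 0.25], [1, 0.25]]))  # [[1, 0.75], [2, 0.25]]
```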

You can use a dictionary for this task.

Here is how I used it:

data = [[4012630, 0.07575758],
[4012618, 0.014925373],
[4012630, 0.5671642],
[4012624, 0.029850746],
[4012628, 0.41791046],
[4012624, 0.07462686],
[4012628, 0.04477612],
[4012636, 0.2820513]]

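totals = {}    # Start with an empty dictionary (avoid the name `dict`, which shadows the built-in).
for nat, dec in data:
    if nat not in totals:
        totals[nat] = dec     # first time this key appears
    else:
        totals[nat] += dec    # key already present: accumulate
print(totals)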

The output of the code is:

{4012630: 0.64292178, 
4012618: 0.014925373, 
4012624: 0.104477606, 
4012628: 0.46268657999999996, 
4012636: 0.2820513}

The only catch here is that the output will be a dictionary instead of an array/list, but you can easily convert the dictionary into a list if required.
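
As a rough sketch of that conversion, a list comprehension over .items() is enough. The variable name totals and the shortened sample values are assumptions standing in for the dictionary built above.

```python
# Hypothetical stand-in for the dictionary produced by the loop above
totals = {4012630: 0.64292178, 4012618: 0.014925373}

# Each (key, value) pair becomes a two-element list;
# dicts keep insertion order on Python 3.7+, so row order follows first appearance
as_list = [[key, value] for key, value in totals.items()]
print(as_list)  # [[4012630, 0.64292178], [4012618, 0.014925373]]
```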

from collections import defaultdict

def sum_func(l: list):
    temp1 = defaultdict(list)  # key -> list of all values seen for that key
    result = []
    for el in l:
        temp1[el[0]].append(el[1])
    for k, v in temp1.items():
        result.append([k, sum(v)])  # collapse each group to its total
    return result
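
If the intermediate lists of values are not needed, a possible variation is to accumulate running totals directly with defaultdict(float). This is only a sketch; the function name sum_by_key is hypothetical and not from the answer above.

```python
from collections import defaultdict

def sum_by_key(rows):
    totals = defaultdict(float)  # missing keys start at 0.0
    for key, value in rows:
        totals[key] += value
    # Convert back to the list-of-pairs shape used in the question
    return [[key, total] for key, total in totals.items()]

# Small hypothetical input whose sums are exact in binary floating point
print(sum_by_key([[1, 0.5], [2, 0.25], [1, 0.25]]))  # [[1, 0.75], [2, 0.25]]
```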

Technical posts on this site are licensed under CC BY-SA 4.0. If you reprint, please cite the site URL or the original address. For any questions, contact: yoyou2525@163.com.

 