I have a list of lists:
a = [[0, 1], [0, 2], [0, 26], [0, 74], [1, 77], [1, 80], [1, 81], [2, 117], [2, 118], [2, 119], [2, 120]]
How can I combine all the sublists that share the same first element?
Desired output:
a = [[0, 1, 2, 26, 74], [1, 77, 80, 81], [2, 117, 118, 119, 120]]
Try this:
d = {}
for key, value in a:
    if key not in d:
        d[key] = [key]
    d[key].append(value)
result = list(d.values())
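As a self-contained check, running this approach on the sample list from the question produces the desired output (dicts preserve insertion order in Python 3.7+):

```python
# Group rows by their first element; each group starts with its key.
a = [[0, 1], [0, 2], [0, 26], [0, 74], [1, 77], [1, 80], [1, 81],
     [2, 117], [2, 118], [2, 119], [2, 120]]

d = {}
for key, value in a:
    if key not in d:
        d[key] = [key]       # the group starts with its key
    d[key].append(value)     # then collects the values

result = list(d.values())
print(result)
# [[0, 1, 2, 26, 74], [1, 77, 80, 81], [2, 117, 118, 119, 120]]
```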
from collections import defaultdict

tmp = defaultdict(list)
for key, val in a:
    tmp[key].append(val)
print([[key] + val for key, val in tmp.items()])
I'll do it this way. Here I assume the input is a list of sublists, each of length 2.
def merge_list(input_list):
    res = []  # Final list
    a = []  # First element of each sublist, without duplicates
    for i in input_list:
        if i[0] not in a:
            a.append(i[0])
    for i in a:
        b = [i]
        for j in input_list:
            if j[0] == i:
                # For input like [[1, 2, 3], [1, 4, 6], ...],
                # extend with j[1:] instead of appending j[1]
                b.append(j[1])
        res.append(b)
    print(res)
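The variant hinted at in the comment could look like this. The function name and sample data below are my own, not from the original answer; the only change is extending each group with `j[1:]` instead of appending `j[1]`:

```python
def merge_list_any_length(rows):
    """Group sublists of any length (>= 1) by their first element."""
    keys = []  # unique first elements, in order of appearance
    for row in rows:
        if row[0] not in keys:
            keys.append(row[0])
    res = []
    for key in keys:
        group = [key]
        for row in rows:
            if row[0] == key:
                group.extend(row[1:])  # take everything after the key
        res.append(group)
    return res

print(merge_list_any_length([[1, 2, 3], [1, 4, 6], [2, 5]]))
# [[1, 2, 3, 4, 6], [2, 5]]
```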
I think the other answers here are specific to two-item lists. Here's one that works with any number of items in your sublists (as long as there's at least one):
a = [[0, 1], [0, 2], [0, 26], [0, 74], [1, 77], [1, 80], [1, 81], [2, 117], [2, 118], [2, 119], [2, 120]]
output_dict = {}
for key, *values in a:
    if key not in output_dict:
        output_dict[key] = [key]
    output_dict[key].extend(values)
Now the results are in output_dict.values().
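To illustrate the claim about arbitrary sublist lengths, here is a self-contained run on mixed-length input (the sample data is mine, for illustration only):

```python
a = [[0, 1, 2], [0, 26], [1, 77, 80, 81], [2, 117], [2, 118, 119]]

output_dict = {}
for key, *values in a:        # star-unpacking: values gets the rest of the row
    if key not in output_dict:
        output_dict[key] = [key]
    output_dict[key].extend(values)

print(list(output_dict.values()))
# [[0, 1, 2, 26], [1, 77, 80, 81], [2, 117, 118, 119]]
```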
Since this question has a numpy tag, I'll expand on possible ways to solve it in numpy. In general, this is called a group-by problem. There are many ways to do it in numpy; you can classify them into two categories:
np.unique
np.bincount
The second type of solution won't work in general if the group IDs are large, but it gives a significant boost over np.unique when the IDs are small.
You need to sort your data by the first column before applying either of these methods:
import numpy as np

a = np.array(a)
arr = a[a[:, 0].argsort()]
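For example, with rows given in scrambled order (sample data mine), the sort restores contiguous groups. Note that plain argsort is not guaranteed stable; pass kind='stable' if the within-group order must be preserved:

```python
import numpy as np

a = np.array([[1, 77], [0, 1], [2, 117], [0, 2], [1, 80]])
arr = a[a[:, 0].argsort(kind='stable')]  # stable sort keeps within-group order
print(arr.tolist())
# [[0, 1], [0, 2], [1, 77], [1, 80], [2, 117]]
```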
Then you can choose your method of grouping and a custom return:
def _custom_return(unique_id, a, split_idx, return_groups):
    '''Choose if you want to also return unique ids'''
    if return_groups:
        return unique_id, np.split(a[:, 1], split_idx)
    else:
        return np.split(a[:, 1], split_idx)

def numpy_groupby_index(a, return_groups=True):
    '''Code refactor of the method of Vincent J'''
    u, idx = np.unique(a[:, 0], return_index=True)
    return _custom_return(u, a, idx[1:], return_groups)

def numpy_groupby_bins(a, return_groups=True):
    '''Significant boost over np.unique via np.bincount'''
    bins = np.bincount(a[:, 0])
    nonzero_bins_idx = bins != 0
    nonzero_bins = bins[nonzero_bins_idx]
    idx = np.cumsum(nonzero_bins[:-1])
    return _custom_return(np.flatnonzero(nonzero_bins_idx), a, idx, return_groups)
numpy_groupby_bins(arr, return_groups=True)
>>> (array([0, 1, 2]),
[array([ 1, 2, 26, 74]), array([77, 80, 81]), array([117, 118, 119, 120])])
numpy_groupby_bins(arr, return_groups=False)
>>> [array([ 1, 2, 26, 74]), array([77, 80, 81]), array([117, 118, 119, 120])]
numpy_groupby_index(arr, return_groups=True)
>>> (array([0, 1, 2]),
[array([ 1, 2, 26, 74]), array([77, 80, 81]), array([117, 118, 119, 120])])
numpy_groupby_index(arr, return_groups=False)
>>> [array([ 1, 2, 26, 74]), array([77, 80, 81]), array([117, 118, 119, 120])]
Note that all of these methods rely on np.split, which is based on list.append under the hood and is therefore not efficient when you have a large number of small groups. This is because numpy is not designed to work with arrays of different lengths.
Also note that the output you expect requires one more iteration:
groups = numpy_groupby_index(arr, return_groups=True)
out = [np.r_[key, group] for key, group in zip(*groups)]
out
>>> [array([ 0, 1, 2, 26, 74]),
array([ 1, 77, 80, 81]),
array([ 2, 117, 118, 119, 120])]
If you're interested in performant solutions to this problem, you could also read my further analysis of this kind of problem.