Merging 2D arrays

Suppose I have two arrays:

arrayOne = [["james", 35], ["michael", 28], ["steven", 23], 
            ["jack", 18], ["robert", 12]]
arrayTwo = [["charles", 45], ["james",  36], ["trevor", 24], 
            ["michael", 17], ["steven", 4]]

I want to merge them into a single 2D array, where the first element of each inner array is the name (james, charles, etc.). The second element of the inner array is its respective value in arrayOne, or 0 if it has no value there. The third element is the same for arrayTwo. The order does not really matter as long as the numbers match the name. In other words, I would get something like this:

arrayResult = [["james", 35, 36], ["michael", 28, 17], ["steven", 23, 4],
               ["jack", 18, 0], ["robert", 12, 0], ["charles", 0, 45],
               ["trevor", 0, 24]]

Also, I would like to be able to add more "columns" to this result if I were given another array.

>>> dict1 = dict(arrayOne)
>>> dict2 = dict(arrayTwo)
>>> keyset = set(dict1) | set(dict2)
>>> [[key, dict1.get(key, 0), dict2.get(key, 0)] for key in keyset]
[['james', 35, 36], ['robert', 12, 0], ['charles', 0, 45], 
 ['michael', 28, 17], ['trevor', 0, 24], ['jack', 18, 0], 
 ['steven', 23, 4]]
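Iterating over a set gives the rows in arbitrary order. The question says order does not matter, but if a stable order is wanted, a minimal sketch like the following keeps names in the order they first appear. The `names`, `seen`, and `merged` variables are illustrative and not part of the original answer.

```python
arrayOne = [["james", 35], ["michael", 28], ["steven", 23],
            ["jack", 18], ["robert", 12]]
arrayTwo = [["charles", 45], ["james", 36], ["trevor", 24],
            ["michael", 17], ["steven", 4]]

dict1 = dict(arrayOne)
dict2 = dict(arrayTwo)

# Collect names in first-seen order, skipping duplicates.
names = []
seen = set()
for name, _ in arrayOne + arrayTwo:
    if name not in seen:
        seen.add(name)
        names.append(name)

# One row per name: [name, value in arrayOne or 0, value in arrayTwo or 0].
merged = [[name, dict1.get(name, 0), dict2.get(name, 0)] for name in names]
print(merged)
```

With the sample data this yields the rows in the same order as the arrayResult shown in the question.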

This gets a bit more complicated if you want to add multiple columns; a dictionary is then best. But having 0s in the right places becomes a challenge, because when we add a name to the "master dictionary", we have to make sure it starts with a list of 0s of the right length. I'm tempted to create a new class for this, but first, here's a basic function-based solution:

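def add_column(masterdict, arr):
    # Number of columns already stored (every row has the same length).
    mdlen = len(next(iter(masterdict.values())))
    newdict = dict(arr)
    # Union of old and new names; works on both Python 2 and 3.
    keyset = set(masterdict) | set(newdict)
    for key in keyset:
        if key not in masterdict:
            # New name: pad the earlier columns with zeros.
            masterdict[key] = [0] * mdlen
        masterdict[key].append(newdict.get(key, 0))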

arrayOne =   [["james", 35],
              ["michael", 28],
              ["steven", 23],
              ["jack", 18],
              ["robert", 12]]
arrayTwo =   [["charles", 45],
              ["james",  36],
              ["trevor", 24],
              ["michael", 17],
              ["steven", 4]]
arrayThree = [["olliver", 11],
              ["james",  39],
              ["john", 22],
              ["michael", 13],
              ["steven", 6]]

masterdict = dict([(i[0], [i[1]]) for i in arrayOne])

add_column(masterdict, arrayTwo)
print(masterdict)
add_column(masterdict, arrayThree)
print(masterdict)

Output:

{'james': [35, 36], 'robert': [12, 0], 'charles': [0, 45], 
 'michael': [28, 17], 'trevor': [0, 24], 'jack': [18, 0], 
 'steven': [23, 4]}
{'james': [35, 36, 39], 'robert': [12, 0, 0], 'charles': [0, 45, 0], 
  'michael': [28, 17, 13], 'trevor': [0, 24, 0], 'olliver': [0, 0, 11], 
  'jack': [18, 0, 0], 'steven': [23, 4, 6], 'john': [0, 0, 22]}
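The answer mentions being tempted to write a class for this. A rough sketch of what such a class might look like follows; the name `ColumnTable` and its methods are hypothetical, not something the answer defines. Tracking the column count explicitly also lets the table start out empty, which the function-based version cannot do because it reads the length of an existing row.

```python
class ColumnTable(object):
    """Hypothetical helper: maps each name to a list with one value per column."""

    def __init__(self):
        self.rows = {}   # name -> list of values, one per column
        self.ncols = 0   # number of columns added so far

    def add_column(self, arr):
        newdict = dict(arr)
        # Names seen for the first time start with zeros for earlier columns.
        for key in newdict:
            if key not in self.rows:
                self.rows[key] = [0] * self.ncols
        # Every row gets exactly one new value (0 if the name is absent).
        for key, values in self.rows.items():
            values.append(newdict.get(key, 0))
        self.ncols += 1

    def to_rows(self):
        # Back to the 2D-array form: [name, col1, col2, ...]
        return [[name] + values for name, values in self.rows.items()]


arrayOne = [["james", 35], ["michael", 28], ["steven", 23],
            ["jack", 18], ["robert", 12]]
arrayTwo = [["charles", 45], ["james", 36], ["trevor", 24],
            ["michael", 17], ["steven", 4]]
arrayThree = [["olliver", 11], ["james", 39], ["john", 22],
              ["michael", 13], ["steven", 6]]

table = ColumnTable()
for arr in (arrayOne, arrayTwo, arrayThree):
    table.add_column(arr)
print(table.rows["james"])    # [35, 36, 39]
print(table.rows["olliver"])  # [0, 0, 11]
```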

It looks like what you really need is dictionaries, rather than arrays. If you use a dictionary, this problem becomes a whole lot easier. Converting to dicts couldn't be easier:

dictOne = dict(arrayOne)
dictTwo = dict(arrayTwo)

From there, you can put them together like this:

combined = dict()
for name in set(dictOne) | set(dictTwo):
  combined[name] = [ dictOne.get(name, 0), dictTwo.get(name, 0) ]

What this does is create a new dictionary called combined, which we'll put the final data in. Then, we make a set of keys from both original dictionaries. Using a set ensures we don't process any name twice. Finally, we loop through this set of keys and add each pair of values to the combined dictionary, telling the .get method to supply 0 if no value is present. If you need to switch the combined dictionary back to an array, that's pretty easy too:

arrayResult = []
for name in combined:
  arrayResult.append([ name ] + combined[name])
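The same conversion can also be written as a list comprehension. The sketch below additionally sorts by name so the output is reproducible; the sorting is my own assumption about what might be convenient, and the small `combined` dictionary is sample data for illustration only.

```python
# Sample data standing in for the combined dictionary built above.
combined = {"james": [35, 36], "jack": [18, 0], "charles": [0, 45]}

# One row per name, in alphabetical order: [name, value1, value2]
arrayResult = [[name] + combined[name] for name in sorted(combined)]
print(arrayResult)
# [['charles', 0, 45], ['jack', 18, 0], ['james', 35, 36]]
```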

Supposing you want to add another column to your result dictionary, all you have to do is change the middle code to look like this:

combined = dict()
for name in set(dictOne) | set(dictTwo) | set(dictThree):
  combined[name] = [ dictOne.get(name, 0), dictTwo.get(name, 0), dictThree.get(name, 0) ]

If you wanted to encapsulate all this logic in a function (which is something I would recommend), you could do it like this:

def combine(*args):
  # Create a list of dictionaries from the arrays we passed in, since we are
  # going to use dictionaries to solve the problem.
  dicts = [ dict(a) for a in args ]

  # Create a list of names by looping through all dictionaries, and through all
  # the names in each dictionary, adding to a master list of names
  names = []
  for d in dicts:
    for name in d.keys():
      names.append(name)

  # Remove duplicates in our list of names by making it a set
  names = set(names)

  # Create a result dict to store results in
  result = dict()

  # Loop through all the names, and add a row for each name, pulling data from
  # each dict we created in the beginning
  for name in names:
    result[name] = [ d.get(name, 0) for d in dicts ]

  # Return, secure in the knowledge of a job well done. :-)
  return result

# Use the function:
resultDict = combine(arrayOne, arrayTwo, arrayThree)
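For comparison, the same idea can be condensed considerably. The following is a sketch under the assumption that every input is a list of [name, value] pairs with unique names; `combine_compact` is a hypothetical name, not part of the answer. It relies on `set().union(*dicts)`, which collects the keys of all the dictionaries in one call.

```python
def combine_compact(*arrays):
    # One dictionary per input array: name -> value
    dicts = [dict(a) for a in arrays]
    # Iterating a dict yields its keys, so this is the union of all names.
    names = set().union(*dicts)
    # One row per name, with 0 wherever an array lacks that name.
    return {name: [d.get(name, 0) for d in dicts] for name in names}


arrayOne = [["james", 35], ["michael", 28], ["steven", 23],
            ["jack", 18], ["robert", 12]]
arrayTwo = [["charles", 45], ["james", 36], ["trevor", 24],
            ["michael", 17], ["steven", 4]]
arrayThree = [["olliver", 11], ["james", 39], ["john", 22],
              ["michael", 13], ["steven", 6]]

result = combine_compact(arrayOne, arrayTwo, arrayThree)
print(result["james"])   # [35, 36, 39]
print(result["trevor"])  # [0, 24, 0]
```

The longer version above may be easier to follow step by step; the compact one trades that for brevity.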
