Merging 2d arrays

Suppose I have two arrays:

arrayOne = [["james", 35], ["michael", 28], ["steven", 23], 
            ["jack", 18], ["robert", 12]]
arrayTwo = [["charles", 45], ["james",  36], ["trevor", 24], 
            ["michael", 17], ["steven", 4]]

I want to merge them into a single 2D array, where the first element of each inner array is the name (james, charles, etc.). The second element is that name's value in arrayOne, or 0 if the name does not appear there; the third element is the same for arrayTwo. The order does not really matter as long as the numbers match the name. In other words, I would get something like this:

arrayResult = [["james", 35, 36], ["michael", 28, 17], ["steven", 23, 4],
               ["jack", 18, 0], ["robert", 12, 0], ["charles", 0, 45],
               ["trevor", 0, 24]]

I would also like to be able to add more "columns" to this result when I am given another array.

>>> dict1 = dict(arrayOne)
>>> dict2 = dict(arrayTwo)
>>> keyset = set(dict1) | set(dict2)  # set union; works on Python 2 and 3
>>> [[key, dict1.get(key, 0), dict2.get(key, 0)] for key in keyset]
[['james', 35, 36], ['robert', 12, 0], ['charles', 0, 45], 
 ['michael', 28, 17], ['trevor', 0, 24], ['jack', 18, 0], 
 ['steven', 23, 4]]

This gets a bit more complicated if you want to add multiple columns; a dictionary is then the best structure. Getting the 0s in the right places becomes a challenge, though: when we add a new name to the "master dictionary", we have to make sure it starts with a list of 0s of the right length. I'm tempted to create a new class for this, but first, here's a basic function-based solution:

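def add_column(masterdict, arr):
    # Number of columns already stored; every value list has the same length.
    mdlen = len(next(iter(masterdict.values()), []))
    newdict = dict(arr)
    # Union of names; set(d) iterates the keys on both Python 2 and 3.
    keyset = set(masterdict) | set(newdict)
    for key in keyset:
        if key not in masterdict:
            # New name: pad with 0s for the columns it missed.
            masterdict[key] = [0] * mdlen
        masterdict[key].append(newdict.get(key, 0))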

arrayOne =   [["james", 35],
              ["michael", 28],
              ["steven", 23],
              ["jack", 18],
              ["robert", 12]]
arrayTwo =   [["charles", 45],
              ["james",  36],
              ["trevor", 24],
              ["michael", 17],
              ["steven", 4]]
arrayThree = [["olliver", 11],
              ["james",  39],
              ["john", 22],
              ["michael", 13],
              ["steven", 6]]

masterdict = dict([(i[0], [i[1]]) for i in arrayOne])

add_column(masterdict, arrayTwo)
print(masterdict)
add_column(masterdict, arrayThree)
print(masterdict)

Output:

{'james': [35, 36], 'robert': [12, 0], 'charles': [0, 45], 
 'michael': [28, 17], 'trevor': [0, 24], 'jack': [18, 0], 
 'steven': [23, 4]}
{'james': [35, 36, 39], 'robert': [12, 0, 0], 'charles': [0, 45, 0], 
  'michael': [28, 17, 13], 'trevor': [0, 24, 0], 'olliver': [0, 0, 11], 
  'jack': [18, 0, 0], 'steven': [23, 4, 6], 'john': [0, 0, 22]}
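
As for the class mentioned above, here is a minimal sketch of what it might look like. The name ColumnTable and its methods are made up for illustration. It tracks the number of columns explicitly, so it also works when the table starts out empty:

```python
class ColumnTable(object):
    """Hypothetical helper: maps each name to a list with one value per column."""

    def __init__(self, fill=0):
        self.fill = fill   # value used when a name is missing from a column
        self.rows = {}     # name -> list of values, one per column added so far
        self.ncols = 0     # number of columns added so far

    def add_column(self, pairs):
        new = dict(pairs)
        # Names not seen before start with one fill value per existing column.
        for name in new:
            if name not in self.rows:
                self.rows[name] = [self.fill] * self.ncols
        # Every known name then gets exactly one new entry.
        for name, values in self.rows.items():
            values.append(new.get(name, self.fill))
        self.ncols += 1

    def to_rows(self):
        # Convert back to the [name, value1, value2, ...] layout.
        return [[name] + values for name, values in self.rows.items()]


table = ColumnTable()
table.add_column([["james", 35], ["jack", 18]])
table.add_column([["charles", 45], ["james", 36]])
print(table.rows)
```

Calling to_rows() gives the list-of-lists form asked for in the question; the row order should not be relied on.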

It looks like what you really need is dictionaries, rather than arrays. If you use a dictionary, this problem becomes a whole lot easier. Converting to dicts couldn't be easier:

dictOne = dict(arrayOne)
dictTwo = dict(arrayTwo)

From there, you can put them together like this:

combined = dict()
for name in set(dictOne) | set(dictTwo):
  combined[name] = [ dictOne.get(name, 0), dictTwo.get(name, 0) ]

What this does is create a new dictionary called combined, which we'll put the final data in. Then, we make a set of keys from both original dictionaries. Using a set ensures we don't process any name twice. Finally, we loop through this set of keys and add each pair of values to the combined dictionary, passing 0 as the default to the .get method so that it is returned when a name has no value. If you need to switch the combined dictionary back to an array, that's pretty easy too:

arrayResult = []
for name in combined:
  arrayResult.append([ name ] + combined[name])
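
Put together, the steps above can be run end to end on the sample data from the question. This is only a sketch: it builds the key set with a set union so it runs on both Python 2 and 3, and it sorts by name, which is an addition of mine just to make the printed order reproducible.

```python
arrayOne = [["james", 35], ["michael", 28], ["steven", 23],
            ["jack", 18], ["robert", 12]]
arrayTwo = [["charles", 45], ["james", 36], ["trevor", 24],
            ["michael", 17], ["steven", 4]]

dictOne = dict(arrayOne)
dictTwo = dict(arrayTwo)

# One entry per name, with 0 wherever a name is missing from an array.
combined = {}
for name in set(dictOne) | set(dictTwo):
    combined[name] = [dictOne.get(name, 0), dictTwo.get(name, 0)]

# Sorting by name is an addition here, only to make the order stable.
arrayResult = [[name] + combined[name] for name in sorted(combined)]
print(arrayResult)
```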

Supposing you want to add another column to your result dictionary, all you have to do is change the middle code to look like this:

combined = dict()
for name in set(dictOne) | set(dictTwo) | set(dictThree):
  combined[name] = [ dictOne.get(name, 0), dictTwo.get(name, 0), dictThree.get(name, 0) ]

If you wanted to encapsulate all this logic in a function (which is something I would recommend), you could do it like this:

def combine(*args):
  # Create a list of dictionaries from the arrays we passed in, since we are
  # going to use dictionaries to solve the problem.
  dicts = [ dict(a) for a in args ]

  # Create a list of names by looping through all dictionaries, and through all
  # the names in each dictionary, adding to a master list of names
  names = []
  for d in dicts:
    for name in d.keys():
      names.append(name)

  # Remove duplicates in our list of names by making it a set
  names = set(names)

  # Create a result dict to store results in
  result = dict()

  # Loop through all the names, and add a row for each name, pulling data from
  # each dict we created in the beginning
  for name in names:
    result[name] = [ d.get(name, 0) for d in dicts ]

  # Return, secure in the knowledge of a job well done. :-)
  return result

# Use the function:
resultDict = combine(arrayOne, arrayTwo, arrayThree)
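
If you would rather get rows back directly, in the layout from the question, a variant along these lines might do. The name combine_rows is hypothetical, and keeping names in first-seen order is my own choice (the question says order does not matter), made only so that the output is predictable:

```python
def combine_rows(*arrays):
    # One lookup dictionary per input array.
    dicts = [dict(a) for a in arrays]

    # Collect names in the order they first appear, skipping duplicates.
    names = []
    seen = set()
    for a in arrays:
        for name, _value in a:
            if name not in seen:
                seen.add(name)
                names.append(name)

    # One row per name: the name, then one value (or 0) per input array.
    return [[name] + [d.get(name, 0) for d in dicts] for name in names]


rows = combine_rows([["james", 35], ["jack", 18]],
                    [["charles", 45], ["james", 36]],
                    [["john", 22], ["james", 39]])
print(rows)
```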
