Merge sorting a 2D array

Question

I'm stuck again trying to make this merge sort work. I have a 2D array whose first column is a Unix timestamp (fig 1), and I sort it with the merge sort in fig 2. The idea is to compare the first value of each row, i.e. array[x][0], and then move the whole row according to that value. However, the merge sort duplicates some rows and drops others (fig 3). What am I doing wrong? I know the problem is in the merge sort, but I can't see the fix.

fig 1

[[1422403200        100]
 [1462834800        150]
 [1458000000         25]
 [1540681200        150]
 [1498863600        300]
 [1540771200        100]
 [1540771200        100]
 [1540771200        100]
 [1540771200        100]
 [1540771200        100]]

fig 2

import numpy as np


def sort(data):
    if len(data) > 1:
        Mid = len(data) // 2
        l = data[:Mid]
        r = data[Mid:]
        sort(l)
        sort(r)

        z = 0
        x = 0
        c = 0

        while z < len(l) and x < len(r):
            if l[z][0] < r[x][0]:
                data[c] = l[z]
                z += 1
            else:
                data[c] = r[x]
                x += 1
            c += 1

        while z < len(l):
            data[c] = l[z]
            z += 1
            c += 1

        while x < len(r):
            data[c] = r[x]
            x += 1
            c += 1
        print(data, 'done')
unixdate = [1422403200, 1462834800, 1458000000, 1540681200, 1498863600, 1540771200, 1540771200,1540771200, 1540771200, 1540771200]
price=[100, 150, 25, 150, 300, 100, 100, 100, 100, 100]
array = np.column_stack((unixdate, price))
sort(array)
print(array, 'sorted')

fig 3

[[1422403200        100]
 [1458000000         25]
 [1458000000         25]
 [1498863600        300]
 [1498863600        300]
 [1540771200        100]
 [1540771200        100]
 [1540771200        100]
 [1540771200        100]
 [1540771200        100]]

Answer 1

I couldn't spot any mistake in your code.

I have tried your code, and the problem does not occur, at least with regular Python lists: the function doesn't change the number of occurrences of any element in the list.

data = [
 [1422403200, 100],
 [1462834800, 150],
 [1458000000,  25],
 [1540681200, 150],
 [1498863600, 300],
 [1540771200, 100],
 [1540771200, 100],
 [1540771200, 100],
 [1540771200, 100],
 [1540771200, 100],
]

sort(data)

from pprint import pprint
pprint(data)

Output:

[[1422403200, 100],
 [1458000000, 25],
 [1462834800, 150],
 [1498863600, 300],
 [1540681200, 150],
 [1540771200, 100],
 [1540771200, 100],
 [1540771200, 100],
 [1540771200, 100],
 [1540771200, 100]]

Edit, taking into account the numpy context and the use of np.column_stack.

~~I expect that what happens there is that np.column_stack actually creates a view mapping over the two arrays. To get a real array rather than a link to your existing arrays, you should copy that array:~~

    array = np.column_stack((unixdate, price)).copy()

Edit 2, taking into account the numpy context

This behavior actually has nothing to do with np.column_stack; np.column_stack already performs a copy.

The reason your code doesn't work is that slicing behaves differently in numpy than in plain Python. Slicing a numpy array creates a view that maps onto the same underlying memory, whereas slicing a list creates a new list.

The erroneous lines are:

    l = data[:Mid]
    r = data[Mid:]

Since l and r just map onto two pieces of the memory held by data, they change whenever data is modified. This is why the lines data[c] = l[z] and data[c] = r[x] overwrite values that have not been merged yet and create duplicates when moving values.
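To make the difference concrete, here is a minimal sketch comparing a list slice with a numpy slice. The variable names (py_list, np_arr, and so on) are made up for this illustration and do not appear in the question.

```python
import numpy as np

# Slicing a list copies the selected elements into a new list.
py_list = [1, 2, 3, 4]
py_left = py_list[:2]
py_list[0] = 99          # does not affect py_left

# Slicing a numpy array returns a view onto the same memory.
np_arr = np.array([1, 2, 3, 4])
np_left = np_arr[:2]
np_arr[0] = 99           # np_left sees this write

print(py_left)                             # [1, 2]
print(np_left.tolist())                    # [99, 2]
print(np.shares_memory(np_arr, np_left))   # True
```

In the merge step, data plays the role of np_arr and l plays the role of np_left: every write to data[c] can clobber a row of l or r before it has been read.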

If data is a numpy array, we want l and r to be copies of the data, not just views. This can be achieved using the copy method.

    l = data[:Mid]
    r = data[Mid:]
    if isinstance(data, np.ndarray):
        l = l.copy()
        r = r.copy()

I tested this change, and with the copies in place the sort works.
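Putting the pieces together, the whole function could look like the sketch below. It is the merge sort from fig 2 with the copy fix applied; the name merge_sort and the longer variable names are my own renaming, and the debug print has been dropped.

```python
import numpy as np


def merge_sort(data):
    """Sort the rows of `data` in place by their first element."""
    if len(data) > 1:
        mid = len(data) // 2
        left = data[:mid]
        right = data[mid:]
        # numpy slices are views onto `data`; copy them so that writing
        # into `data` below cannot overwrite rows we still need to read.
        if isinstance(data, np.ndarray):
            left = left.copy()
            right = right.copy()
        merge_sort(left)
        merge_sort(right)

        i = j = k = 0
        # Merge the two sorted halves back into `data`.
        while i < len(left) and j < len(right):
            if left[i][0] < right[j][0]:
                data[k] = left[i]
                i += 1
            else:
                data[k] = right[j]
                j += 1
            k += 1
        # Copy whatever remains in either half.
        while i < len(left):
            data[k] = left[i]
            i += 1
            k += 1
        while j < len(right):
            data[k] = right[j]
            j += 1
            k += 1


unixdate = [1422403200, 1462834800, 1458000000, 1540681200, 1498863600,
            1540771200, 1540771200, 1540771200, 1540771200, 1540771200]
price = [100, 150, 25, 150, 300, 100, 100, 100, 100, 100]
array = np.column_stack((unixdate, price))
merge_sort(array)
print(array)
```

The isinstance check keeps the function usable with plain lists as well, where slicing already copies.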

Note

If you wanted to sort the data using Python lists rather than numpy arrays, the equivalent of np.column_stack in vanilla Python is zip:

    z = zip([10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000])
    z
    # <zip at 0x7f6ef80ce8c8>
    # `zip` creates an iterator, which is ready to give us our entries.
    # Iterators can only be walked once, which is not the case of lists.
    list(z)
    # [(10, 100, 1000), (20, 200, 2000), (30, 300, 3000), (40, 400, 4000)]

The entries are (immutable) tuples. If you need the entries to be editable, map list over them:

    z = zip([10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000])
    li = list(map(list, z))
    # [[10, 100, 1000], [20, 200, 2000], [30, 300, 3000], [40, 400, 4000]]

To transpose a matrix, use zip(*matrix):

    def transpose(matrix):
        return list(map(list, zip(*matrix)))

    transpose(li)
    # [[10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000]]

You can also sort a Python list li in place using li.sort(), or get a sorted list from any iterable (lists are iterables) using sorted(li).

Here, I would use (tested):

 sorted(zip(unixdate, price))
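If the data has to stay in a numpy array, a comparable shortcut is to reorder the rows with np.argsort. This is not part of the tested suggestion above, just a sketch that assumes only the first column is the sort key; the names order and sorted_array are made up for the example.

```python
import numpy as np

unixdate = [1422403200, 1462834800, 1458000000, 1540681200, 1498863600,
            1540771200, 1540771200, 1540771200, 1540771200, 1540771200]
price = [100, 150, 25, 150, 300, 100, 100, 100, 100, 100]
array = np.column_stack((unixdate, price))

# argsort returns the row order that sorts the first column;
# kind="stable" keeps rows with equal timestamps in their original order.
order = np.argsort(array[:, 0], kind="stable")
sorted_array = array[order]   # new array; `array` itself is left unchanged
print(sorted_array)
```

Note one difference: sorted(zip(unixdate, price)) compares whole tuples, so ties on the timestamp are broken by price, while the argsort version only looks at the first column.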
