
How to find the median between two sorted arrays?

I'm working on a competitive programming problem where we're trying to find the median of two sorted arrays. The optimal algorithm is to perform a binary search and identify splitting points, i and j, between the two arrays.

I'm having trouble deriving the solution myself. I don't understand the initial logic. Here is how I think about the problem so far.

The concept of the median is to partition the given array into two sets. Consider a hypothetical left array and a hypothetical right array after merging the two given arrays. Both of these arrays are of the same length (assuming the combined length is even).

We know that the median given both those hypothetical arrays works out to be [max(left) + min(right)]/2. This makes sense so far. But the issue now is knowing how to construct the left and right arrays.
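As a baseline for the left/right idea, here is a minimal sketch that builds both halves by actually merging the inputs. The function name median_by_merging is hypothetical, and the sketch assumes the two arrays together hold at least one element. When the combined length is odd, it gives the extra element to the right half, so the median is simply the smallest element of right.

```python
def median_by_merging(array_a, array_b):
    """Median of two sorted lists, found by merging them first.

    Illustrative baseline only: it costs O((n + m) log(n + m)) because it
    re-sorts the concatenation. Assumes the combined input is non-empty.
    """
    merged = sorted(array_a + array_b)
    half = len(merged) // 2
    left, right = merged[:half], merged[half:]

    if len(merged) % 2 == 1:
        # Odd total: right holds one extra element, which is the median.
        return float(right[0])

    # Even total: both halves have the same length.
    return (max(left) + min(right)) / 2
```

This is not the binary search solution, only a reference implementation that other approaches can be checked against.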

We can choose a splitting point on ArrayA as i and a splitting point on ArrayB as j. Note that len(ArrayA[:i] + ArrayB[:j]) == len(ArrayA[i:] + ArrayB[j:]).

Now we just need to find the cutting points. We could try all splitting points i, j such that they satisfy the median condition. However, this works out to be O(m*n), where m is the size of ArrayB and n is the size of ArrayA.
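The exhaustive search described above could look roughly like the sketch below. The name median_brute_force is hypothetical. It assumes both inputs are sorted and not both empty, and it puts (n + m) // 2 elements in the left part, so for an odd total the median is the smallest element of the right part.

```python
def median_brute_force(array_a, array_b):
    """Try every pair of split points (i, j) until a valid partition is found."""
    n, m = len(array_a), len(array_b)
    if n + m == 0:
        raise ValueError("at least one array must be non-empty")

    part_size = (n + m) // 2  # number of elements in the left part

    for i in range(n + 1):          # i elements of array_a go left
        for j in range(m + 1):      # j elements of array_b go left
            if i + j != part_size:
                continue

            left = array_a[:i] + array_b[:j]
            right = array_a[i:] + array_b[j:]

            # A valid partition has every left element <= every right element.
            if left and max(left) > min(right):
                continue

            if (n + m) % 2 == 1:
                return float(min(right))
            return (max(left) + min(right)) / 2

    raise ValueError("no valid split found; are both inputs sorted?")
```

Because j is fully determined by i once the left part's size is fixed, the inner loop is redundant, which is the observation that opens the door to searching over i alone.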

I'm not sure how to get from where I am to the binary search solution using my train of thought. If someone could give me pointers, that would be awesome.

Here is the approach that I managed to come up with.

First of all, we know that the resulting array will contain N+M elements, meaning that the left part will contain (N+M)/2 elements, and the right part will contain (N+M)/2 elements as well. Let's denote the resulting array as Ans, and denote the size of one of its parts as PartSize.

Perform a binary search operation on array A. The range of this binary search will be [0, N]. The search determines the number of elements from array A that will form the left part of the resulting array.

Now, suppose we are testing the value i. If i elements from array A are supposed to be included in the left part of the resulting array, this means that j = PartSize - i elements must be included from array B in the left part as well. We have the following possibilities:

  • j > M: this is an invalid state. It means we still need to choose more elements from array A, so our new binary search range becomes [i+1, N].

  • j <= M and A[i+1] < B[j]: this is a tricky case. Think about it. Indices here are 1-based, so A[i] and B[j] are the last elements taken into the left part. If the next element in array A is smaller than element j in array B, this means that element A[i+1] is supposed to be in the left part rather than element B[j]. In this case our new binary search range becomes [i+1, N].

  • j <= M and A[i] > B[j+1]: this is close to the previous case. If the next element in array B is smaller than element i in array A, this means that element B[j+1] is supposed to be in the left part rather than element A[i]. In this case our new binary search range becomes [0, i-1].

  • j <= M and A[i+1] >= B[j] and A[i] <= B[j+1]: this is the optimal case, and you have finally found your answer.
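The four cases above can be sketched in Python as follows. This is one possible reading of the answer, not a definitive implementation: find_split is a hypothetical name, the code uses 0-based indexing (so the answer's A[i+1] becomes array_a[i] and B[j] becomes array_b[j - 1]), and it adds two things the list does not spell out, namely a j < 0 check and bounds guards for when one side of a split is empty.

```python
def find_split(array_a, array_b):
    """Binary search for (i, j): how many elements of each sorted list
    belong to the left part of the merged result."""
    n, m = len(array_a), len(array_b)
    part_size = (n + m) // 2

    low, high = 0, n
    while low <= high:
        i = (low + high) // 2
        j = part_size - i

        if j > m:
            # Case 1: array_b cannot supply j elements; take more from A.
            low = i + 1
        elif j < 0:
            # Assumption beyond the answer's list: too many taken from A.
            high = i - 1
        elif i < n and j > 0 and array_a[i] < array_b[j - 1]:
            # Case 2: the next element of A belongs in the left part.
            low = i + 1
        elif i > 0 and j < m and array_a[i - 1] > array_b[j]:
            # Case 3: the next element of B belongs in the left part.
            high = i - 1
        else:
            # Case 4: everything on the left is <= everything on the right.
            return i, j

    raise ValueError("no valid split found; are both inputs sorted?")
```

For example, find_split([1, 2], [3, 4, 5]) returns (2, 0): both elements of the first list and none of the second form the left part. Each iteration halves the range [low, high], so the search takes O(log N) steps.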

After the binary search operation is finished and you have calculated both i and j, you can easily find the value of the median. You need to handle a few cases here depending on whether N+M is odd or even.
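A minimal sketch of that last step is shown below. The name median_from_split is hypothetical. It assumes i and j already describe a valid split with i + j == (N+M) // 2, that the arrays are not both empty, and it uses 0-based indexing with infinities standing in for an empty side.

```python
def median_from_split(array_a, array_b, i, j):
    """Median given that the first i elements of array_a and the first j
    elements of array_b form the left part (i + j == (n + m) // 2)."""
    n, m = len(array_a), len(array_b)
    inf = float("inf")

    # Smallest element of the right part; a missing side contributes +inf.
    a_right = array_a[i] if i < n else inf
    b_right = array_b[j] if j < m else inf
    min_right = min(a_right, b_right)

    if (n + m) % 2 == 1:
        # Odd total: the right part has one extra element, the median.
        return float(min_right)

    # Largest element of the left part; a missing side contributes -inf.
    a_left = array_a[i - 1] if i > 0 else -inf
    b_left = array_b[j - 1] if j > 0 else -inf
    return (max(a_left, b_left) + min_right) / 2
```

Only the four elements adjacent to the two split points are inspected, so this step is constant time regardless of the array sizes.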

Hope it helps!

