Understanding the algorithm of Median of Two Sorted Arrays
There are two sorted arrays A and B of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log(m + n)).
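For reference, the median being asked for can be pinned down with a plain linear-time walk over both arrays. This is only a baseline sketch for checking results, not the O(log(m + n)) solution; the class name `MedianBaseline` is made up for illustration, and the arrays are assumed to be sorted and to contain at least one element in total.

```java
class MedianBaseline {
    // Walks both sorted arrays in merged order until the middle position.
    // Assumes a.length + b.length >= 1. Runs in O(m + n), not O(log(m + n)).
    static double median(int[] a, int[] b) {
        int total = a.length + b.length;
        int i = 0, j = 0;
        int prev = 0, cur = 0; // the two most recently visited values
        for (int step = 0; step <= total / 2; step++) {
            prev = cur;
            if (j >= b.length || (i < a.length && a[i] <= b[j])) {
                cur = a[i++];
            } else {
                cur = b[j++];
            }
        }
        // Odd total: the middle element. Even total: mean of the two middle elements.
        return total % 2 != 0 ? cur : (prev + (double) cur) / 2;
    }

    public static void main(String[] args) {
        System.out.println(median(new int[] {1, 3}, new int[] {2}));
        System.out.println(median(new int[] {1, 2}, new int[] {3, 4}));
    }
}
```

In zero-based terms this returns element (m + n) / 2 for an odd total, and the mean of elements (m + n) / 2 - 1 and (m + n) / 2 for an even total, which matches the two `findKth` calls in the program discussed here.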
I don't understand the formulas for calculating aMid and bMid. What's the logic behind these formulas?
int aMid = aLen * k / (aLen + bLen); // a's middle count
int bMid = k - aMid - 1; // b's middle count
Here is the link to the program: http://www.programcreek.com/2012/12/leetcode-median-of-two-sorted-arrays-java/
public static double findMedianSortedArrays(int A[], int B[]) {
    int m = A.length;
    int n = B.length;
    if ((m + n) % 2 != 0) // odd
        return (double) findKth(A, B, (m + n) / 2, 0, m - 1, 0, n - 1);
    else { // even
        return (findKth(A, B, (m + n) / 2, 0, m - 1, 0, n - 1)
                + findKth(A, B, (m + n) / 2 - 1, 0, m - 1, 0, n - 1)) * 0.5;
    }
}

public static int findKth(int A[], int B[], int k,
        int aStart, int aEnd, int bStart, int bEnd) {
    int aLen = aEnd - aStart + 1;
    int bLen = bEnd - bStart + 1;
    // Handle special cases
    if (aLen == 0)
        return B[bStart + k];
    if (bLen == 0)
        return A[aStart + k];
    if (k == 0)
        return A[aStart] < B[bStart] ? A[aStart] : B[bStart];
    int aMid = aLen * k / (aLen + bLen); // a's middle count
    // I AM STUCK HERE
    int bMid = k - aMid - 1; // b's middle count
    // make aMid and bMid to be array index
    aMid = aMid + aStart;
    bMid = bMid + bStart;
    if (A[aMid] > B[bMid]) {
        k = k - (bMid - bStart + 1);
        aEnd = aMid;
        bStart = bMid + 1;
    } else {
        k = k - (aMid - aStart + 1);
        bEnd = bMid;
        aStart = aMid + 1;
    }
    return findKth(A, B, k, aStart, aEnd, bStart, bEnd);
}
From the comments accompanying the code I got some idea of how these formulas are calculated, but I still don't understand them well enough to explain to someone why these formulas are used. What's the logic behind them?
For int aMid = aLen * k / (aLen + bLen); // a's middle count
Since aMid = aLen / 2 --(i)
and k = (aLen + bLen) / 2, we have 2 = (aLen + bLen) / k.
Substituting this value of 2 into equation (i) gives
aMid = aLen / ((aLen + bLen) / k) = aLen * k / (aLen + bLen)
And for int bMid = k - aMid - 1; // b's middle count
aMid + bMid + 1 = k must be satisfied for the code to be able to draw the conclusions it does when A[aMid] > B[bMid].
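To make the two formulas concrete, here is a small sketch that evaluates them for a few length combinations. The class name `SplitDemo`, the helper `split` and the sample values are made up for illustration; k is taken to be a zero-based rank with 0 < k < aLen + bLen, which is the situation in which `findKth` reaches these lines.

```java
class SplitDemo {
    // Same two formulas as in findKth, applied to counts only.
    // Assumes aLen > 0, bLen > 0 and 0 < k < aLen + bLen.
    static int[] split(int aLen, int bLen, int k) {
        int aMid = aLen * k / (aLen + bLen); // A's share of k, proportional to its length
        int bMid = k - aMid - 1;             // the remainder, so aMid + bMid + 1 == k
        return new int[] {aMid, bMid};
    }

    public static void main(String[] args) {
        int[][] cases = {{4, 4, 4}, {2, 6, 4}, {1, 9, 5}, {5, 3, 7}};
        for (int[] c : cases) {
            int[] s = split(c[0], c[1], c[2]);
            System.out.printf("aLen=%d bLen=%d k=%d -> aMid=%d bMid=%d%n",
                    c[0], c[1], c[2], s[0], s[1]);
        }
    }
}
```

The longer array receives the larger share of k, and in every case both offsets stay inside their arrays, which is why the proportional formula needs no extra bounds check.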
As for why aMid + bMid + 1 = k is significant: if A[aMid] is greater than B[bMid], you know that no element after A[aMid] in A can be the kth element, since there are too many elements in B lower than it (it would have more than k elements before it). You also know that B[bMid] and any element before B[bMid] in B can't be the kth element, since there are too few elements in A lower than it (there wouldn't be enough elements before B[bMid] for it to be the kth element).
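The discard argument can be checked on a concrete input. The sketch below performs a single discard step on whole arrays and compares the result with a brute-force answer obtained by merging and sorting. The names `DiscardDemo`, `step` and `kthByMerging` are made up for illustration, k is zero-based, and both arrays are assumed to be sorted and non-empty with 0 < k < a.length + b.length.

```java
import java.util.Arrays;

class DiscardDemo {
    // Brute-force reference: merge, sort, take the zero-based k-th element.
    static int kthByMerging(int[] a, int[] b, int k) {
        int[] all = new int[a.length + b.length];
        System.arraycopy(a, 0, all, 0, a.length);
        System.arraycopy(b, 0, all, a.length, b.length);
        Arrays.sort(all);
        return all[k];
    }

    // One discard step; returns {remaining A, remaining B, {new k}}.
    static int[][] step(int[] a, int[] b, int k) {
        int aMid = a.length * k / (a.length + b.length);
        int bMid = k - aMid - 1;
        if (a[aMid] > b[bMid]) {
            // b[0..bMid] rank below k, everything after a[aMid] ranks above k.
            return new int[][] {
                Arrays.copyOfRange(a, 0, aMid + 1),
                Arrays.copyOfRange(b, bMid + 1, b.length),
                {k - (bMid + 1)}};
        }
        // Mirror case: a[0..aMid] and everything after b[bMid] are dropped.
        return new int[][] {
            Arrays.copyOfRange(a, aMid + 1, a.length),
            Arrays.copyOfRange(b, 0, bMid + 1),
            {k - (aMid + 1)}};
    }

    public static void main(String[] args) {
        int[] a = {1, 4, 7, 10};
        int[] b = {2, 3, 5, 8};
        int k = 4;
        int[][] r = step(a, b, k);
        System.out.println("kept A = " + Arrays.toString(r[0])
                + ", kept B = " + Arrays.toString(r[1]) + ", new k = " + r[2][0]);
        System.out.println("before: " + kthByMerging(a, b, k)
                + ", after: " + kthByMerging(r[0], r[1], r[2][0]));
    }
}
```

With these inputs aMid is 2 and bMid is 1, so A[2] = 7 is compared with B[1] = 3. The step keeps {1, 4, 7} and {5, 8} with the new k equal to 2, and the element sought is 5 both before and after the step.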
As you already mentioned, aMid + bMid + 1 = k must be satisfied to be able to conclude that, when A[aMid] > B[bMid], we can throw away everything in B up to and including B[bMid] and everything in A after A[aMid]. This works because the aMid elements before A[aMid] together with the first bMid + 1 elements of B make aMid + bMid + 1 = k elements that are not greater than A[aMid]. Therefore our median lies in the remaining parts of the arrays.
With this in mind, it does not really matter how we set up our two mid values aMid and bMid in the first place. The only thing to take care of is not letting either of them cause an IndexOutOfBoundsException.
int aMid = 0;
int bMid = k - aMid - 1;
if (bMid >= bLen) {
    bMid = bLen - 1;
    aMid = k - bMid - 1;
}
This would do the trick as well, but it would take more than O(log(n + m)) time, because in the worst case each call skips only a single element (A[0]).
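The difference between the two ways of choosing aMid can be observed by counting recursive calls. The sketch below is a variant of `findKth` with a selectable split rule. The class name `CallCountDemo`, the `proportional` flag, the `calls` counter and the cast to `long` (added here only to avoid overflow in the multiplication) are not part of the original program, and the input is chosen deliberately so that every element of A is smaller than every element of B.

```java
class CallCountDemo {
    static int calls; // number of findKth invocations in the current run

    // findKth with a selectable split rule: proportional, or the aMid = 0 variant.
    static int findKth(int[] a, int[] b, int k, int aStart, int aEnd,
                       int bStart, int bEnd, boolean proportional) {
        calls++;
        int aLen = aEnd - aStart + 1;
        int bLen = bEnd - bStart + 1;
        if (aLen == 0) return b[bStart + k];
        if (bLen == 0) return a[aStart + k];
        if (k == 0) return Math.min(a[aStart], b[bStart]);

        int aMid = proportional ? (int) ((long) aLen * k / (aLen + bLen)) : 0;
        int bMid = k - aMid - 1;
        if (bMid >= bLen) { // can only happen with the aMid = 0 rule
            bMid = bLen - 1;
            aMid = k - bMid - 1;
        }
        aMid += aStart; // turn counts into array indices
        bMid += bStart;
        if (a[aMid] > b[bMid]) {
            return findKth(a, b, k - (bMid - bStart + 1),
                    aStart, aMid, bMid + 1, bEnd, proportional);
        }
        return findKth(a, b, k - (aMid - aStart + 1),
                aMid + 1, aEnd, bStart, bMid, proportional);
    }

    static int countCalls(int[] a, int[] b, int k, boolean proportional) {
        calls = 0;
        findKth(a, b, k, 0, a.length - 1, 0, b.length - 1, proportional);
        return calls;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] a = new int[n];
        int[] b = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;     // every element of A is smaller ...
            b[i] = n + i; // ... than every element of B
        }
        int k = n - 1;    // the answer is the last element of A
        System.out.println("aMid = 0 rule:     " + countCalls(a, b, k, false) + " calls");
        System.out.println("proportional rule: " + countCalls(a, b, k, true) + " calls");
    }
}
```

On this input the aMid = 0 rule removes exactly one element of A per call, so the number of calls grows linearly with k, whereas the proportional rule roughly halves k on each call.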
What we want is to always throw away a constant fraction of the aLen + bLen remaining elements, rather than a single element.
In our case this is:
A > B: k = k - (bMid + 1) = k - (k - aMid) = aMid = k * (aLen / (aLen + bLen))
B > A: k = k - (aMid + 1) = k - (k * aLen / (aLen + bLen)) - 1 = k * (bLen / (aLen + bLen)) - 1
Ignoring the -1 and assuming that the probability of A > B is the same as that of B > A, we get:
E(k) = 0.5 * k * (aLen / (aLen + bLen)) + 0.5 * k * (bLen / (aLen + bLen))
     = 0.5 * k * (aLen + bLen) / (aLen + bLen) = 0.5 * k
This means that we get approximately O(log(n + m)) recursive calls until k is 0, at which point the function stops.