
DP: Longest Increasing Subsequence Thought Process & Solution

For the Longest Increasing Subsequence (LIS) problem, I envisioned keeping a DP array that is always in non-decreasing order, with the maximum value at the far end. Something like this:

{1, 1, 2, 3, 3, 4, 5, 6, 6, 6}

The thought process behind my first, incorrect solution was this: start with only the first element of the array, calculate its LIS, then extend the array one value at a time. As we go, each entry of the DP array is computed from the LIS of the previous subarray plus the newly added element. This means index i of the dp array holds the LIS length of the prefix that ends at index i (the first i + 1 elements).

More clearly put:

array => {5, 6, 7, 1, 2, 3, 4}
dp    => {1, 2, 3, 3, 3, 3, 4}

This way the very last entry of the DP array is the LIS of the whole array. That would act as our invariant, so when we reach the end, we can be sure the last value is the only one we need. It then dawned on me that although we're traversing the array in a DP-like way, the next value does not depend on any of the previously tabulated values, so this method is the same as maintaining a maxLIS variable, a pattern I've seen in many O(n) solutions. So my closest-to-correct solution is as follows:

1.) Save a copy of the input array/vector as old

2.) Sort the original input array

3.) Traverse the sorted array, incrementing a variable longest by one every time the next value is larger than the current one and appears after it in the original array.

4.) Return longest

The code would be roughly this:

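#include <algorithm>
#include <vector>
using namespace std;

// Sort-based attempt from the steps above. This is NOT a correct LIS
// algorithm: it only compares neighbours in sorted order.
int lengthOfLIS(vector<int>& seq) {
  if (seq.empty()) return 0;
  vector<int> old = seq;  // original order, used for position lookups

  sort(seq.begin(), seq.end());

  int longest = 1;

  for (size_t i = 1; i < seq.size(); ++i) {
    // Position of each value's first occurrence in the original order.
    auto cur = find(old.begin(), old.end(), seq[i]) - old.begin();
    auto prev = find(old.begin(), old.end(), seq[i-1]) - old.begin();
    if (seq[i] > seq[i-1] && cur > prev) longest++;
  }
  return longest;
}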

Where we call find (a linear operation), we could get constant-time lookups by storing each value's original index alongside the value itself, so that no traversal is needed to locate an element in the original array (old). I believe that would make this an O(nlog(n)) solution; however, it fails on this input array: [1,3,6,7,9,4,10,5,6].
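To make the failure concrete, here is a minimal sketch of the index-lookup variant described above. The function name sortedNeighbourCount and the pair-based layout are assumptions made for illustration, not code from the original attempt. It stores each value with its original position, sorts the pairs, and applies the same neighbour comparison, using the first position of each distinct value just as find would.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: same comparison as the sort-based attempt, but each
// value carries its original position, so no linear find() is needed.
int sortedNeighbourCount(const std::vector<int>& seq) {
  if (seq.empty()) return 0;

  // (value, original index). Sorting orders by value, then by index, so the
  // first pair in each run of equal values holds that value's first position.
  std::vector<std::pair<int, std::size_t>> items;
  items.reserve(seq.size());
  for (std::size_t i = 0; i < seq.size(); ++i) items.emplace_back(seq[i], i);
  std::sort(items.begin(), items.end());

  int longest = 1;
  std::size_t prevFirst = items[0].second;  // first position of the previous distinct value
  for (std::size_t i = 1; i < items.size(); ++i) {
    if (items[i].first == items[i - 1].first) continue;  // duplicate value, skip
    // Count the step when the larger value first appears after the smaller one.
    if (items[i].second > prevFirst) ++longest;
    prevFirst = items[i].second;
  }
  return longest;
}
```

On [1,3,6,7,9,4,10,5,6] this returns 7, while the longest increasing subsequence (1, 3, 6, 7, 9, 10) has length 6. The comparison only looks at neighbours in sorted order, so steps that belong to different subsequences, such as 3 to 4 to 5 and 6 to 7 to 9 to 10, are added into one total.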

Finally I did some research and found that all the solution guides I have read slip in the fact that their DP array is not kept in order. Instead, a value in the DP array represents the length of the longest increasing subsequence whose last value is originalArray[index].

More clearly put:

array => {5, 6, 7, 1, 2, 3}
dp    => {1, 2, 3, 1, 2, 3}

Here, if 5 is the last value of an increasing subsequence, no values come before it, so the subsequence must have length 1. If 6 is the last value of an increasing subsequence, we must look at all values before it to determine how long a subsequence ending with 6 can be. Only 5 can come before it, making the longest increasing subsequence so far 2. This continues, and you return the maximum value in the DP array. The time complexity of this solution is O(n^2); it is the standard naive solution.
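The table above can be produced with a short double loop. This is a minimal sketch of that standard quadratic DP; the function name lisQuadratic is chosen here for illustration, and it assumes a strictly increasing subsequence is wanted.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// dp[i] is the length of the longest strictly increasing subsequence that
// ends exactly at seq[i].
int lisQuadratic(const std::vector<int>& seq) {
  if (seq.empty()) return 0;
  std::vector<int> dp(seq.size(), 1);  // every element alone has length 1
  for (std::size_t i = 1; i < seq.size(); ++i) {
    for (std::size_t j = 0; j < i; ++j) {
      // seq[i] can extend any subsequence that ends in a smaller, earlier value.
      if (seq[j] < seq[i]) dp[i] = std::max(dp[i], dp[j] + 1);
    }
  }
  // The answer may end at any index, so take the maximum entry.
  return *std::max_element(dp.begin(), dp.end());
}
```

For {5, 6, 7, 1, 2, 3} the dp array ends up as {1, 2, 3, 1, 2, 3} and the function returns 3.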

Questions:

I'm curious how I can think about this problem correctly. I want to fine-tune my thought process so that I can come up with an optimal solution from scratch (that's the goal, at least), so I'd like to know:

1.) What property of this problem should have prompted me to use the DP array differently from how I used it? In hindsight, my original approach was simply equivalent to keeping a max variable, but even so I struggle to see a property of this problem that would trigger the thought "Hey, the value of the entry at index i of my DP array should be the length of the increasing subsequence ending with originalArray[i]." I'm struggling to see how I should have come up with that.

2.) Is it possible to get my proposed O(nlog(n)) solution to work? I know an O(nlog(n)) solution exists, but since I can't get mine working, I think I need a nudge in the right direction.

I admit, it is an interesting question. I do not have an exact answer, but I think I can give you a nudge in the right direction. So here goes:

When facing a dilemma like this, I usually go back to basics. In your case, go through the definition of Dynamic Programming. It has two properties:

  1. Overlapping Subproblems
  2. Optimal Substructure

You can easily find these properties reflected in the standard solution, but not in yours. You can read about them in Cormen or just google them in the context of DP.
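As a rough illustration, the standard formulation can be written as a recurrence. The notation is chosen here, not taken from any particular guide: a is the input of length n, and L(i) is the length of the longest increasing subsequence that ends at index i.

```latex
\[
L(i) = 1 + \max\bigl(\{0\} \cup \{\, L(j) : 0 \le j < i,\ a_j < a_i \,\}\bigr),
\qquad
\mathrm{LIS}(a) = \max_{0 \le i < n} L(i)
\]
```

Optimal substructure shows up in L(i) being built from the best L(j) of earlier positions. Overlapping subproblems show up in each L(j) being reused by every later i with a_j < a_i, which is why tabulating the values pays off.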

In my opinion your solution is not DP; you found a pattern and are trying to solve the problem based on that pattern. If you are not getting the right answer, it means that either your pattern is wrong or your solution is overlooking something. In scenarios like this, try to prove mathematically that the pattern you are observing is correct, and that the solution built on it must therefore work.

Give me some more time while I work through your solution; in the meantime you can also try to develop a proof for your solution.
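For reference while working on that proof, here is a minimal sketch of the well-known O(nlog(n)) approach that the question alludes to. It is a different technique from the sort-based attempt, and the names lisFast and tails are chosen here for illustration. It does keep an array that is always in order, but the entries are candidate end values, not lengths.

```cpp
#include <algorithm>
#include <vector>

// tails[k] is the smallest value that can end a strictly increasing
// subsequence of length k + 1 among the elements seen so far.
// tails is always sorted, so binary search applies.
int lisFast(const std::vector<int>& seq) {
  std::vector<int> tails;
  for (int x : seq) {
    // First tail that is >= x: replacing it keeps the same length available
    // but with a smaller (or equal) ending value.
    auto it = std::lower_bound(tails.begin(), tails.end(), x);
    if (it == tails.end()) {
      tails.push_back(x);  // x extends the longest subsequence found so far
    } else {
      *it = x;
    }
  }
  return static_cast<int>(tails.size());
}
```

On [1,3,6,7,9,4,10,5,6] this returns 6. Note that tails is not itself a valid subsequence of the input; only its size is meaningful.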

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please credit this site's URL or the original address.
