
Search for the longest repeating sequence of bytes

Working with bytes, I need to find the longest repeating sequence. Here the longest repeating sequence is 23:

23 56 23 8888

It occurs twice in the byte list, in order, unlike the run of 8s. I decided to approach it this way:

  1. Take the first number and check whether it occurs anywhere else in the list; if not, take the next number.

2 356 2 38888

  2. After that, I check whether the numbers following the two matching ones also coincide. If they do, I put them in a list and continue checking (for example, whether the numbers after both occurrences of 23 also coincide); if they do not, I take another number.

23 56 23 8888

My method:

List<Byte> topList = new ArrayList<>(); // byte list
List<Byte> result = new ArrayList<>(); // result list
int count = 1; // step size, reset for each start position
for (int i = 0; i < topList.size(); i += count) {
    count = 1;
    for (int j = i + 1; j < topList.size(); j += count) {
        if (topList.get(i).equals(topList.get(j)) && !result.contains(topList.get(j))) {
            result.add(topList.get(i));
            result.add(topList.get(j));
            for (int k = j + 1; k < topList.size(); k++) {
                if (topList.get(k).equals(topList.get(i + count))) {
                    result.add(topList.get(k));
                    System.out.println(result);
                    count++; // step to pass already checked numbers
                }
            }
        }
    }
}

But my code does not work correctly.

2238888

This is the sequence I get. Please tell me how I can improve it. I cannot use strings.

I don't see any alternative to an O(n^2) solution where, starting at each position in the input, we generate each forward sequence and check whether we've seen it before, keeping the longest. Fortunately, we don't need to consider sequences shorter than the current longest one, and we don't need to consider sequences longer than n/2, where n is the size of the input, since these can't repeat without overlapping. Also, we don't consider sequences that break up a run of repeating values (such as 8888), since such runs are to be treated as indivisible.
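To make the last point concrete, here is a minimal sketch of which end positions are allowed for a candidate sequence. The class `RunBoundaries` and the method `validEnds` are hypothetical names used only for illustration, and the input is assumed to be a primitive `byte[]` rather than a `List<Byte>`.

```java
import java.util.ArrayList;
import java.util.List;

class RunBoundaries {
    // Hypothetical helper: returns every index j at which a candidate
    // in[i..j) may end, i.e. j is the end of the input or in[j] differs
    // from in[j - 1], so the candidate does not stop inside a run.
    static List<Integer> validEnds(byte[] in) {
        List<Integer> ends = new ArrayList<>();
        for (int j = 1; j <= in.length; j++) {
            if (j == in.length || in[j] != in[j - 1]) {
                ends.add(j);
            }
        }
        return ends;
    }
}
```

For the input 2 3 5 6 2 3 8 8 8 8 this yields the end positions 1, 2, 3, 4, 5, 6 and 10: no candidate may stop inside the run of 8s.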

Here's a simple implementation that uses a Set to keep track of which sequences have been seen before. In practice you'd want a more sophisticated structure that is more compact and exploits the pattern in the elements, but this suffices for now to validate that we're generating the required output.

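import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

static List<Byte> longestRepeatingSeq(List<Byte> in)
{
    int n = in.size();
    // Sub-sequences seen so far (subList views; List equality is by content).
    Set<List<Byte>> seen = new HashSet<>();
    List<Byte> max = Collections.emptyList();
    for (int i = 0; i < n; i++)
    {
        // Skip candidates no longer than the current best or longer than n/2.
        for (int j = i + max.size() + 1; j <= n && j <= i + n / 2; j++)
        {
            // Only end a candidate where a run of equal bytes ends.
            // equals() compares values; != on Byte would compare references.
            if (j == n || !in.get(j).equals(in.get(j - 1)))
            {
                List<Byte> sub = in.subList(i, j);
                if (seen.contains(sub))
                {
                    if (sub.size() > max.size())
                    {
                        max = sub;
                    }
                }
                else
                {
                    seen.add(sub);
                }
            }
        }
    }
    return max;
}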

Test:

public static void main(String[] args)
{
    String[] tests = 
        {
            "123123",
            "235623",
            "2356238888",
            "88388",
            "883883",
            "23235623238888",
        };

    for(String s : tests)
    {
        List<Byte> in = new ArrayList<>();
        for(String ns : s.split("")) in.add(Byte.parseByte(ns));
        System.out.println(s + " " + longestRepeatingSeq(in));
    }
}

Output:

123123 [1, 2, 3]
235623 [2, 3]
2356238888 [2, 3]
88388 [8, 8]
883883 [8, 8, 3]
23235623238888 [2, 3, 2, 3]
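As one possible step towards a more compact structure, the same search can be sketched on a primitive `byte[]` so that no `Byte` boxing or `subList` views are needed. This is only a sketch under stated assumptions: the input is assumed to be available as a `byte[]`, the class name `LongestRepeatBytes` is hypothetical, and `ByteBuffer.wrap(array, offset, length)` is used as the set key because `ByteBuffer` compares and hashes by its remaining content.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class LongestRepeatBytes {
    // Hypothetical variant of longestRepeatingSeq for a primitive array.
    static byte[] longestRepeatingSeq(byte[] in) {
        int n = in.length;
        Set<ByteBuffer> seen = new HashSet<>();
        int bestStart = 0;
        int bestLen = 0;
        for (int i = 0; i < n; i++) {
            // Candidates must be longer than the best so far and at most n/2 long.
            for (int j = i + bestLen + 1; j <= n && j <= i + n / 2; j++) {
                // Only end a candidate where a run of equal bytes ends.
                if (j == n || in[j] != in[j - 1]) {
                    // A view over in[i..j); no bytes are copied.
                    ByteBuffer sub = ByteBuffer.wrap(in, i, j - i);
                    // add() returns false when an equal sequence is already present.
                    if (!seen.add(sub)) {
                        bestStart = i;
                        bestLen = j - i;
                    }
                }
            }
        }
        return Arrays.copyOfRange(in, bestStart, bestStart + bestLen);
    }
}
```

Calling `LongestRepeatBytes.longestRepeatingSeq(new byte[]{2, 3, 5, 6, 2, 3, 8, 8, 8, 8})` returns the bytes 2, 3, matching the third test case above. The set still holds one entry per candidate, so this reduces per-element overhead rather than the number of entries.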
