Using threads to count all occurrences of a substring

Suppose I have t threads. What is the optimal way to count all non-overlapping occurrences of a substring T in a string S?

This is a chunk of code that does the normal, single-threaded count, but I'm not sure how to implement it concurrently. What would happen if t is smaller than the length of the substring?

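public class Substrings {
    // Counts non-overlapping occurrences of T in S, scanning left to right.
    public int countOccurrences(String S, String T) {
        if (T.isEmpty()) {
            return 0; // guard: with an empty T the offset would never advance
        }
        int count = 0, offset = 0, index;
        while ((index = S.indexOf(T, offset)) != -1) {
            offset = index + T.length(); // resume after this match so matches cannot overlap
            count++;
        }
        return count;
    }
}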

I'm not sure why you want to do this, since operations like this are pretty fast unless you have a lot of giant Strings. The optimal solution is not something I want to think about, but there is a simple way to do this that's about twice as slow as the optimal. Split the String up into sections and run countOccurrences on each section in its own thread, putting every index you find into a Set. Then slide the sections forward by half the length of a section and do it again. This second pass will find any occurrences that span two sections, and because a Set discards duplicates, a match found in both passes is only recorded once. Of course, you could limit this second search to the length of the substring on either side of each section boundary, but this would complicate the code. As an exercise, perhaps you can do this with Boyer-Moore.
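
To make the idea concrete, here is a minimal sketch of a variation on the approach above. It is not the half-section slide itself: each worker instead searches its own section plus T.length() - 1 extra characters past the section's end, so matches that span a boundary are found in a single pass. Each worker records every start index, overlapping ones included, and a final sequential pass keeps only non-overlapping matches, which should give the same answer as countOccurrences. The class name ParallelSubstringCount, the method count, and the threads parameter are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSubstringCount {

    // Hypothetical helper: counts non-overlapping occurrences of t in s
    // using a fixed pool of `threads` workers.
    public static int count(String s, String t, int threads) {
        int n = s.length(), m = t.length();
        if (m == 0 || m > n || threads < 1) {
            return 0;
        }
        int chunk = (n + threads - 1) / threads; // section length, rounded up
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<Integer>>> parts = new ArrayList<>();
            for (int start = 0; start < n; start += chunk) {
                final int from = start;
                final int to = Math.min(n, start + chunk);
                parts.add(pool.submit(() -> {
                    // Extend the window m - 1 characters past the section so a
                    // match starting near the end of the section is still seen.
                    String window = s.substring(from, Math.min(n, to + m - 1));
                    List<Integer> hits = new ArrayList<>();
                    // Advance by 1, not m, so overlapping candidates are kept.
                    for (int i = window.indexOf(t); i != -1; i = window.indexOf(t, i + 1)) {
                        hits.add(from + i);
                    }
                    return hits;
                }));
            }
            // Sections are visited in order, so start indexes arrive sorted.
            // Greedily keep each match that begins at or after the end of the
            // previously kept one.
            int count = 0, nextFree = 0;
            for (Future<List<Integer>> part : parts) {
                for (int i : part.get()) {
                    if (i >= nextFree) {
                        count++;
                        nextFree = i + m;
                    }
                }
            }
            return count;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, ParallelSubstringCount.count("aaaa", "aa", 3) returns 2, the same as the sequential loop. Since every window is extended by T.length() - 1 characters, a section may be shorter than T and the count stays correct, but the windows then overlap heavily and most of the parallel speedup is likely lost.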
