Efficient way to test if a string is a substring of any string in a list of strings

I want to know the best way to compare a string against a list of strings. Here is the code I have in mind, but it is clearly poor in terms of time complexity.

for (String large : list1) {
    for (String small : list2) {
        if (large.contains(small)) {
            // DO SOMETHING
        } else {
            // NOT FOR ME
        }
    }

    // FURTHER MANIPULATION OF STRING 
}

Both lists can contain more than a thousand strings, so the worst-case cost is on the order of 1000 × 1000 × length character comparisons, which is unwieldy. Given the scenario above, what is the best way to test a string against a list of strings?

You could just do this:

for (String small : list2) {
    if (set1.contains(small)) {
        // DO SOMETHING
    } else {
        // NOT FOR ME
    }
}

set1 should hold the strings from the larger list. Instead of keeping them in a List<String>, use a Set<String> such as a HashSet<String>. Note that Set.contains tests for an exact match rather than a substring, so this only applies when each small string must equal one of the large strings.

Thanks to the first answer by sandeep. Here is the solution:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

List<String> firstCollection = new ArrayList<>();
Set<String> secondCollection = new HashSet<>();

// Populate both collections here.

for (String string : firstCollection) {
    if (secondCollection.contains(string)) {
        // Yes, the string is in the second collection (exact match).
    } else {
        // No, the string is not in the second collection.
    }
}

This is, unfortunately, a difficult and messy problem. That is because you are checking whether a small string is a substring of any of a bunch of large strings, rather than checking whether it is equal to any of them.
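To make the distinction concrete, here is a minimal sketch contrasting the two checks. The class and method names (EqualityVersusSubstring, equalsAny, substringOfAny) are illustrative assumptions, not part of the question's code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, named here only for illustration.
class EqualityVersusSubstring {

    // True if small is exactly equal to one element of the set (a hash lookup).
    static boolean equalsAny(Set<String> larges, String small) {
        return larges.contains(small);
    }

    // True if small occurs inside at least one large string (scans every string).
    static boolean substringOfAny(List<String> larges, String small) {
        for (String large : larges) {
            if (large.contains(small)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> larges = Arrays.asList("hello world", "substring search");
        System.out.println(equalsAny(new HashSet<>(larges), "world")); // false
        System.out.println(substringOfAny(larges, "world"));           // true
    }
}
```

The hash lookup is fast precisely because it only answers the equality question; the substring question still needs either a scan like the one above or an index built over the large strings.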

The best solution depends on exactly what problem you need to solve, but here is a reasonable first attempt:

In a temporary place, concatenate all the large strings together, ideally separated by a character that does not occur in any of them so that no match can span two strings, then construct a suffix tree on this long concatenated string. With this structure, we should be able to quickly find all the substring matches of any given small among all the large.
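A full suffix tree is lengthy to implement, so the sketch below swaps in a suffix automaton, a closely related index that answers the same "is small a substring?" query in time proportional to the length of small. It is a sketch under assumptions rather than a definitive implementation: the class name SubstringIndex, the method containsSubstring, and the '\u0000' separator (assumed never to appear in any input string) are all choices made for this example. It only reports whether a match exists, not where.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index over all large strings, built once and queried many times.
class SubstringIndex {
    // Assumption: this character never occurs in any large or small string.
    private static final char SEPARATOR = '\u0000';

    private final List<Map<Character, Integer>> next = new ArrayList<>(); // transitions
    private final int[] link; // suffix links
    private final int[] len;  // length of the longest string in each state
    private int last;         // state for the whole text read so far

    SubstringIndex(List<String> larges) {
        int total = 0;
        for (String large : larges) {
            total += large.length() + 1; // +1 for the separator
        }
        // Each appended character creates at most two states.
        link = new int[2 * total + 2];
        len = new int[2 * total + 2];
        last = newState(0, -1, new HashMap<>());
        for (String large : larges) {
            for (int i = 0; i < large.length(); i++) {
                extend(large.charAt(i));
            }
            extend(SEPARATOR); // keeps matches from spanning two strings
        }
    }

    private int newState(int length, int suffixLink, Map<Character, Integer> transitions) {
        int id = next.size();
        next.add(transitions);
        len[id] = length;
        link[id] = suffixLink;
        return id;
    }

    // Standard suffix automaton extension by one character.
    private void extend(char c) {
        int cur = newState(len[last] + 1, -1, new HashMap<>());
        int p = last;
        while (p != -1 && !next.get(p).containsKey(c)) {
            next.get(p).put(c, cur);
            p = link[p];
        }
        if (p == -1) {
            link[cur] = 0;
        } else {
            int q = next.get(p).get(c);
            if (len[p] + 1 == len[q]) {
                link[cur] = q;
            } else {
                // Split q: the clone keeps only the shorter strings.
                int clone = newState(len[p] + 1, link[q], new HashMap<>(next.get(q)));
                while (p != -1 && next.get(p).getOrDefault(c, -1) == q) {
                    next.get(p).put(c, clone);
                    p = link[p];
                }
                link[q] = clone;
                link[cur] = clone;
            }
        }
        last = cur;
    }

    // True if small occurs inside at least one large string.
    // The empty string is always reported as contained.
    boolean containsSubstring(String small) {
        int state = 0;
        for (int i = 0; i < small.length(); i++) {
            Integer target = next.get(state).get(small.charAt(i));
            if (target == null) {
                return false;
            }
            state = target;
        }
        return true;
    }

    public static void main(String[] args) {
        SubstringIndex index =
                new SubstringIndex(Arrays.asList("hello world", "substring search"));
        System.out.println(index.containsSubstring("lo wo"));  // true
        System.out.println(index.containsSubstring("worlds")); // false
        System.out.println(index.containsSubstring("ldsub"));  // false, would span two strings
    }
}
```

Building the index costs time roughly proportional to the total length of the large strings, after which each small string is checked without rescanning them. Whether that beats the nested loop in practice depends on how many small strings there are and how long the large ones are.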


