Check if a string contains an element from a list (of strings)

For the following block of code:

For I = 0 To listOfStrings.Count - 1
    If myString.Contains(listOfStrings.Item(I)) Then
        Return True
    End If
Next
Return False

The output is:

Case 1:

myString: C:\Files\myfile.doc
listOfStrings: C:\Files\, C:\Files2\
Result: True

Case 2:

myString: C:\Files3\myfile.doc
listOfStrings: C:\Files\, C:\Files2\
Result: False

The list (listOfStrings) may contain several items (minimum 20), and it has to be checked against thousands of strings (like myString).

Is there a better (more efficient) way to write this code?

With LINQ, and using C# (I don't know VB much these days):

bool b = listOfStrings.Any(s=>myString.Contains(s));

or (shorter and more efficient, but arguably less clear):

bool b = listOfStrings.Any(myString.Contains);

If you were testing equality, it would be worth looking at HashSet etc., but this won't help with partial matches unless you split it into fragments and add an order of complexity.


Update: if you really mean "StartsWith", then you could sort the list and place it into an array; then use Array.BinarySearch to find each item, and check by lookup to see if it is a full or partial match.
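One way the sorted-array idea could be sketched is shown below. `PrefixSearch`, `Build`, and `StartsWithAny` are hypothetical names, not from the answer, and the sketch assumes ordinal (case-sensitive) comparison and drops empty prefixes. When the binary search finds no exact match, it walks backwards from the insertion point, because any prefix of the text sorts before the text itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the original answer.
public static class PrefixSearch
{
    // Sort the prefixes once with ordinal comparison. Empty entries are dropped.
    public static string[] Build(IEnumerable<string> prefixes)
    {
        string[] sorted = prefixes.Where(p => !string.IsNullOrEmpty(p)).ToArray();
        Array.Sort(sorted, StringComparer.Ordinal);
        return sorted;
    }

    // True when text equals, or starts with, one of the sorted prefixes.
    public static bool StartsWithAny(string[] sorted, string text)
    {
        int index = Array.BinarySearch(sorted, text, StringComparer.Ordinal);
        if (index >= 0)
        {
            return true; // full match
        }

        // ~index is the position of the first element greater than text,
        // so every possible prefix of text sits somewhere before it.
        for (int j = ~index - 1; j >= 0; j--)
        {
            string candidate = sorted[j];
            if (text.StartsWith(candidate, StringComparison.Ordinal))
            {
                return true; // partial (prefix) match
            }
            if (candidate[0] != text[0])
            {
                break; // earlier entries cannot share a prefix with text
            }
        }
        return false;
    }
}
```

Build the array once with `PrefixSearch.Build(listOfStrings)` and reuse it for every `myString`; the sort cost is only worth paying when the same list is checked many times.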

When you construct your strings, it should look like this:

bool inact = new string[] { "SUSPENDARE", "DIZOLVARE" }.Any(s=>stare.Contains(s));

I liked Marc's answer, but needed the Contains matching to be CaSe InSenSiTiVe.

This was the solution:

bool b = listOfStrings.Any(s => myString.IndexOf(s, StringComparison.OrdinalIgnoreCase) >= 0);

There were a number of suggestions from an earlier similar question, "Best way to test for existing string against a large list of comparables".

Regex might be sufficient for your requirement. The expression would be a concatenation of all the candidate substrings, with an OR "|" operator between them. Of course, you'll have to watch out for unescaped characters when building the expression, or a failure to compile it because of complexity or size limitations.
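A minimal sketch of the alternation approach, assuming the candidates are plain literals rather than patterns. `RegexContains` and `BuildMatcher` are hypothetical names; `Regex.Escape` handles the unescaped-character concern mentioned above, which matters here because the sample paths contain backslashes.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class RegexContains
{
    // Builds one regex that matches if any candidate occurs in the input.
    // Note: an empty candidate list yields an empty pattern, which matches everything.
    public static Regex BuildMatcher(IEnumerable<string> candidates)
    {
        // Escape each candidate so characters such as '\', '.' and '|' are literal.
        string pattern = string.Join("|", candidates.Select(s => Regex.Escape(s)));
        return new Regex(pattern, RegexOptions.Compiled | RegexOptions.CultureInvariant);
    }
}
```

Build the matcher once and call `matcher.IsMatch(myString)` for each string to check. Whether this beats the simple `Any` loop depends on the data, so it is worth measuring.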

Another way to do this would be to construct a trie data structure to represent all the candidate substrings (this may somewhat duplicate what the regex matcher is doing). As you step through each character in the test string, you would create a new pointer to the root of the trie, and advance existing pointers to the appropriate child (if any). You get a match when any pointer reaches a leaf (or, if one candidate is a prefix of another, any node where a candidate ends).
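The pointer-stepping description above could look roughly like this. `SubstringTrie` and `ContainsAny` are hypothetical names, and the sketch marks the end of each candidate with a flag instead of relying on leaves, so that a candidate which is a prefix of another is still detected.

```csharp
using System.Collections.Generic;

// Hypothetical class for illustration; not part of the original answer.
public sealed class SubstringTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsEnd; // true when some candidate ends at this node
    }

    private readonly Node root = new Node();

    public SubstringTrie(IEnumerable<string> candidates)
    {
        foreach (string candidate in candidates)
        {
            Node node = root;
            foreach (char c in candidate)
            {
                if (!node.Children.TryGetValue(c, out Node next))
                {
                    next = new Node();
                    node.Children[c] = next;
                }
                node = next;
            }
            node.IsEnd = true;
        }
    }

    // True when any candidate occurs somewhere in text.
    public bool ContainsAny(string text)
    {
        if (root.IsEnd)
        {
            return true; // an empty candidate is contained in every string
        }

        var active = new List<Node>();
        foreach (char c in text)
        {
            active.Add(root); // a match may start at this character

            var advanced = new List<Node>();
            foreach (Node node in active)
            {
                if (node.Children.TryGetValue(c, out Node child))
                {
                    if (child.IsEnd)
                    {
                        return true;
                    }
                    advanced.Add(child);
                }
            }
            active = advanced; // pointers with no matching child are dropped
        }
        return false;
    }
}
```

The list of active pointers can grow up to the length of the longest candidate, so the worst case is still proportional to text length times candidate length.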

Old question, but since VB.NET was the original requirement, here it is using the same values as the accepted answer:

listOfStrings.Any(Function(s) myString.Contains(s))

Based on your patterns, one improvement would be to use StartsWith instead of Contains. StartsWith need only iterate through each string until it finds the first mismatch, instead of having to restart the search at every character position when it finds one.

Also, based on your patterns, it looks like you may be able to extract the first part of the path for myString, then reverse the comparison: look for the starting path of myString in the list of strings rather than the other way around.

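// Take the drive and the first folder, e.g. "C:\Files\" from "C:\Files\myfile.doc".
// Assumes myString contains at least two path components.
string[] pathComponents = myString.Split( Path.DirectorySeparatorChar );
string startPath = pathComponents[0] + Path.DirectorySeparatorChar
                 + pathComponents[1] + Path.DirectorySeparatorChar;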

return listOfStrings.Contains( startPath );

EDIT: This would be even faster using the HashSet idea @Marc Gravell mentions, since HashSet<string>.Contains is an O(1) lookup on average, whereas List<string>.Contains is O(N). You would have to make sure that the paths match exactly. Note that this is not a general solution like @Marc Gravell's; it is tailored to your examples.
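A rough sketch of the HashSet variant, under two assumptions that are not in the original: the list entries are always "drive plus first folder" with a trailing separator, and paths compare case-insensitively (as on Windows). `StartPathLookup` and its members are hypothetical names, and the separator is passed in explicitly so the example does not depend on the platform's `Path.DirectorySeparatorChar`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the original answer.
public static class StartPathLookup
{
    // Assumption: case-insensitive comparison, as is typical for Windows paths.
    public static HashSet<string> Build(IEnumerable<string> startPaths)
    {
        return new HashSet<string>(startPaths, StringComparer.OrdinalIgnoreCase);
    }

    // Returns the text up to and including the second separator,
    // e.g. "C:\Files\" for "C:\Files\myfile.doc", or null if there is none.
    public static string FirstFolder(string path, char separator)
    {
        int first = path.IndexOf(separator);
        if (first < 0)
        {
            return null;
        }
        int second = path.IndexOf(separator, first + 1);
        return second < 0 ? null : path.Substring(0, second + 1);
    }

    // One hash lookup per path instead of a scan over the whole list.
    public static bool Matches(HashSet<string> startPaths, string path, char separator)
    {
        string start = FirstFolder(path, separator);
        return start != null && startPaths.Contains(start);
    }
}
```

This only works because the comparison is reversed into an exact lookup; a list entry that is deeper or shallower than one folder would never match.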

Sorry for the C# example. I haven't had enough coffee to translate to VB.

I'm not sure whether it is more efficient, but you could consider using lambda expressions.

Have you tested the speed?

That is, have you created a sample set of data and profiled it? It may not be as bad as you think.

This might also be something you could spawn off into a separate thread and give the illusion of speed!
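A simple way to act on the profiling suggestion is to time the straightforward version against a sample set before optimizing anything. The sketch below uses `Stopwatch`; `ContainsBenchmark` and `CountMatches` are hypothetical names, and the sample data is whatever you supply.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper for illustration; not part of the original answer.
public static class ContainsBenchmark
{
    // Counts how many inputs contain at least one list entry and
    // reports how long the whole pass took.
    public static int CountMatches(
        IList<string> inputs, IList<string> listOfStrings, out TimeSpan elapsed)
    {
        Stopwatch watch = Stopwatch.StartNew();
        int matches = inputs.Count(s => listOfStrings.Any(p => s.Contains(p)));
        watch.Stop();

        elapsed = watch.Elapsed;
        return matches;
    }
}
```

Run it a few times with a realistic number of inputs (thousands of strings against 20 or more list entries) and discard the first run, which includes JIT compilation.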

myList.Any(myString.Contains);

If speed is critical, you might want to look at the Aho-Corasick algorithm for sets of patterns.

It's a trie with failure links; that is, the complexity is O(n+m+k), where n is the length of the input text, m is the cumulative length of the patterns, and k is the number of matches. You just have to modify the algorithm to terminate after the first match is found.
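A compact sketch of a trie with failure links, stopping at the first match as suggested. `AhoCorasickMatcher` and `ContainsAny` are hypothetical names; this is an illustration under ordinal, case-sensitive matching, not a tuned implementation.

```csharp
using System.Collections.Generic;

// Hypothetical class for illustration; not part of the original answer.
public sealed class AhoCorasickMatcher
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Next = new Dictionary<char, Node>();
        public Node Fail;     // longest proper suffix that is also a trie path
        public bool IsMatch;  // a pattern ends here or at a suffix of this path
    }

    private readonly Node root = new Node();

    public AhoCorasickMatcher(IEnumerable<string> patterns)
    {
        // Phase 1: build the trie of all patterns.
        foreach (string pattern in patterns)
        {
            Node node = root;
            foreach (char c in pattern)
            {
                if (!node.Next.TryGetValue(c, out Node created))
                {
                    created = new Node();
                    node.Next[c] = created;
                }
                node = created;
            }
            node.IsMatch = true;
        }

        // Phase 2: assign failure links breadth-first, so that every link
        // points at a shallower node that has already been processed.
        var queue = new Queue<Node>();
        foreach (Node top in root.Next.Values)
        {
            top.Fail = root;
            queue.Enqueue(top);
        }
        while (queue.Count > 0)
        {
            Node node = queue.Dequeue();
            foreach (KeyValuePair<char, Node> edge in node.Next)
            {
                Node fallback = node.Fail;
                while (fallback != root && !fallback.Next.ContainsKey(edge.Key))
                {
                    fallback = fallback.Fail;
                }
                edge.Value.Fail =
                    fallback.Next.TryGetValue(edge.Key, out Node target) ? target : root;
                // Inherit matches that end at a suffix of this path.
                edge.Value.IsMatch |= edge.Value.Fail.IsMatch;
                queue.Enqueue(edge.Value);
            }
        }
    }

    // Scans text once and returns as soon as any pattern is found.
    public bool ContainsAny(string text)
    {
        if (root.IsMatch)
        {
            return true; // an empty pattern is contained in every string
        }

        Node state = root;
        foreach (char c in text)
        {
            while (state != root && !state.Next.ContainsKey(c))
            {
                state = state.Fail;
            }
            state = state.Next.TryGetValue(c, out Node next) ? next : root;
            if (state.IsMatch)
            {
                return true;
            }
        }
        return false;
    }
}
```

Build the matcher once from listOfStrings and call `ContainsAny(myString)` for each string; the construction cost is amortized over the thousands of checks.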

The drawback of the single-argument Contains method is that it does not let you specify the comparison type, which is often important when comparing strings. It always performs an ordinal comparison, so it is case-sensitive and culture-insensitive. So I think WhoIsRich's answer is valuable; I just want to show a simpler alternative (note that Equals tests for a whole-string match rather than a substring):

listOfStrings.Any(s => s.Equals(myString, StringComparison.OrdinalIgnoreCase))

As I needed to check if there are items from a list in a (long) string, I ended up with this one:

listOfStrings.Any(x => myString.ToUpper().Contains(x.ToUpper()));

Or in VB.NET:

listOfStrings.Any(Function(x) myString.ToUpper().Contains(x.ToUpper()))

A slight variation: I needed to find whether a string contained any of the list items as whole words, ignoring case.

myString.Split(' ', StringSplitOptions.RemoveEmptyEntries).Intersect(listOfStrings).Any()

For case insensitivity, myString and listOfStrings were converted to uppercase beforehand.
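The uppercase conversion can be avoided by letting a comparer handle the case folding. The sketch below is one possible variant, assuming words are separated by single spaces only; `WholeWordMatch` and `ContainsAnyWord` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the original answer.
public static class WholeWordMatch
{
    // True when any space-separated word of text equals a list entry, ignoring case.
    public static bool ContainsAnyWord(string text, IEnumerable<string> words)
    {
        // The comparer makes the set lookups case-insensitive without ToUpper().
        var wanted = new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);

        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                   .Any(word => wanted.Contains(word));
    }
}
```

Punctuation stays attached to words with this split, so "fox." would not match "fox"; add more separators to the array if that matters.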
