
Java: String: equalsIgnoreCase vs switching everything to Upper/Lower Case

It came to my attention that there are several ways to compare strings in Java.

I got into the habit ages ago of using equalsIgnoreCase to avoid problems with case-sensitive strings.

Others, on the other hand, prefer passing everything around in upper or lower case.

From where I stand (even if technically I'm sitting), I don't see a real difference.

Does anybody know if one practice is better than the other? And if so, why?

Use equalsIgnoreCase because it's more readable than converting both Strings to uppercase before a comparison. Readability trumps micro-optimization.

What's more readable?

if (myString.toUpperCase().equals(myOtherString.toUpperCase())) {

or

if (myString.equalsIgnoreCase(myOtherString)) {

I think we can all agree that equalsIgnoreCase is more readable.

equalsIgnoreCase avoids problems with Locale-specific differences (e.g. in the Turkish locale there are two different uppercase "i" letters). On the other hand, hash-based Maps such as HashMap compare keys only with equals() (and hashCode()), so keys that should match regardless of case have to be normalized before they are stored.
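The Turkish-locale pitfall mentioned above can be sketched as follows. This is only an illustration: the class and method names are made up, and the Turkish locale is obtained with Locale.forLanguageTag("tr") rather than relying on the machine's default locale.

```java
import java.util.Locale;

// Hypothetical demo class; the names are illustrative, not from the original post.
class TurkishCaseDemo {

    // Compares two strings by upper-casing both sides with an explicit locale.
    static boolean upperCaseEquals(String a, String b, Locale locale) {
        return a.toUpperCase(locale).equals(b.toUpperCase(locale));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // In Turkish, lowercase 'i' upper-cases to the dotted capital I (U+0130), not 'I'.
        System.out.println(upperCaseEquals("title", "TITLE", turkish)); // false

        // equalsIgnoreCase works character by character and ignores the locale.
        System.out.println("title".equalsIgnoreCase("TITLE")); // true

        // Normalizing with Locale.ROOT gives locale-independent results.
        System.out.println(upperCaseEquals("title", "TITLE", Locale.ROOT)); // true
    }
}
```

If you do normalize case yourself, passing an explicit locale such as Locale.ROOT avoids depending on whatever default locale the JVM happens to run with.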

But the problem with the latter approach, where you assume that the input arrives in either upper or lower case, is that you cannot blindly trust the caller. So you have to include an assert statement at the start of the method to make sure that the input is always in the case you are expecting.

Neither is better; they both have their uses in different scenarios.

Many times when you have to do string comparisons there is the opportunity to massage at least one of the strings to make it easier to compare, and in these cases you will see strings converted to a particular case, trimmed, etc. before being compared.

If, on the other hand, you just want to do an on-the-fly case-insensitive comparison of two strings, then feel free to use equalsIgnoreCase; that's what it's there for, after all. I would caution, however, that if you're seeing a lot of equalsIgnoreCase calls, it could be a code smell.

Performance-wise, both are the same according to this post:

http://www.params.me/2011/03/stringtolowercasestringtouppercase-vs.html

So I would decide based on code readability. In some cases toLowerCase() would be better, for example if I always pass the value to a single method that creates objects; otherwise equalsIgnoreCase() makes more sense.

It depends on the use case.

If you're doing a one-to-one string comparison, equalsIgnoreCase is probably faster, since internally it just uppercases each character as it iterates through the strings (the code below is from java.lang.String), which is slightly faster than uppercasing or lowercasing both strings in full before performing the same comparison:

if (ignoreCase) 
{
    // If characters don't match but case may be ignored,
    // try converting both characters to uppercase.
    // If the results match, then the comparison scan should
    // continue.
    char u1 = Character.toUpperCase(c1);
    char u2 = Character.toUpperCase(c2);
    if (u1 == u2) {
        continue;
    }
    // Unfortunately, conversion to uppercase does not work properly
    // for the Georgian alphabet, which has strange rules about case
    // conversion.  So we need to make one last check before
    // exiting.
    if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
        continue;
    }
}

But when you want to do case-insensitive lookups against a data structure full of strings (especially strings that are all in the US Latin/ASCII space), it will be quicker to trim/lowercase the strings to be checked against and put them in something like a HashSet or HashMap.

This is better than calling equalsIgnoreCase on each element of a List, because the slight performance gain of equalsIgnoreCase() is canceled out by the fact that you're basically doing a modified version of contains() against an array, which is O(n). With pre-normalized strings you can check against the entire collection with a single contains() call that runs in O(1), provided the string you look up is normalized the same way.
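A minimal sketch of the two lookup styles described above. The class, the method names and the sample data are hypothetical, and Locale.ROOT is an assumption made here so that normalization does not depend on the default locale.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class for illustration only.
class CaseInsensitiveLookup {

    // Trim and lowercase with a fixed locale so the result is locale-independent.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    // O(n): scans the list, calling equalsIgnoreCase on each element.
    static boolean containsByScan(List<String> values, String candidate) {
        for (String value : values) {
            if (value.equalsIgnoreCase(candidate)) {
                return true;
            }
        }
        return false;
    }

    // Built once: stores the normalized form of every value.
    static Set<String> buildIndex(List<String> values) {
        Set<String> index = new HashSet<>();
        for (String value : values) {
            index.add(normalize(value));
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "BOB", " carol ");
        Set<String> index = buildIndex(names);

        System.out.println(containsByScan(names, "bob"));       // true
        System.out.println(index.contains(normalize("Bob")));   // true
        System.out.println(index.contains(normalize("CAROL"))); // true
        System.out.println(index.contains(normalize("dave")));  // false
    }
}
```

The set lookup is O(1) on average in the number of stored strings; each lookup still pays for normalizing and hashing the candidate, which is proportional to its length.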

equalsIgnoreCase documentation in JDK 8:

  • Compares this String to another String, ignoring case considerations. Two strings are considered equal ignoring case if they are of the same length and corresponding characters in the two strings are equal ignoring case.

    Two characters c1 and c2 are considered the same ignoring case if at least one of the following is true:

    • The two characters are the same (as compared by the == operator)
    • Applying the method java.lang.Character.toUpperCase(char) to each character produces the same result
    • Applying the method java.lang.Character.toLowerCase(char) to each character produces the same result
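The documented rule can be written out as a short sketch. This is a simplified re-implementation for illustration, not the JDK source, and the class and method names are hypothetical.

```java
// Hypothetical illustration of the documented per-character rule.
class IgnoreCaseRule {

    // True if at least one of the three documented conditions holds.
    static boolean sameIgnoringCase(char c1, char c2) {
        return c1 == c2
                || Character.toUpperCase(c1) == Character.toUpperCase(c2)
                || Character.toLowerCase(c1) == Character.toLowerCase(c2);
    }

    // Same length, and every pair of corresponding characters matches ignoring case.
    static boolean equalsIgnoringCase(String a, String b) {
        if (a == null || b == null || a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            if (!sameIgnoringCase(a.charAt(i), b.charAt(i))) {
                return false; // stop at the first mismatch
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoringCase("Hello", "hELLO")); // true
        System.out.println(equalsIgnoringCase("Hello", "Help"));  // false (different lengths)
        System.out.println(equalsIgnoringCase("Hello", "Hallo")); // false
    }
}
```

Note that the comparison is per char and never consults a Locale, which is why the result does not change with the default locale.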

My thoughts:

So with equalsIgnoreCase we iterate through the Strings (only if their lengths are the same), comparing each char. In the worst case, performance will be O(3cn), i.e. O(n), where n is the length of your strings and each pair of characters costs up to three constant-time checks. We use no extra space.

Using toUpperCase() and then checking whether the strings are equal, you ALWAYS loop through each string one time to convert it to upper case, then compare the results with equals(), which compares the contents character by character rather than by reference. This is Θ(2n + c) for the conversions, plus up to another n for the equals() comparison. But just remember, when you call toUpperCase() you generally have to create two new Strings, because Strings in Java are immutable.
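The immutability point can be checked with a few lines. This is a small sketch with made-up names; Locale.ROOT is passed explicitly as an assumption to keep the result independent of the default locale.

```java
import java.util.Locale;

// Hypothetical demo class for illustration only.
class ImmutableCaseDemo {

    // Returns an upper-cased copy; the argument itself is never modified.
    static String upper(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String original = "hello";
        String converted = upper(original);

        System.out.println(original);              // hello (unchanged)
        System.out.println(converted);             // HELLO
        System.out.println(original == converted); // false: a separate String object
    }
}
```

Comparing two strings this way therefore creates up to two temporary Strings per comparison, which equalsIgnoreCase avoids.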

So I would say that equalsIgnoreCase is both more efficient and easier to read.

Again, I would consider the use case, because that would be what it comes down to for me. The toUpperCase() approach could be valid in certain use cases, but 98% of the time I use equalsIgnoreCase().

When I'm working with English-only text, I always run toUpperCase() or toLowerCase() before I start doing comparisons if I would otherwise be calling .equalsIgnoreCase() more than once, or if I'm using a switch statement. This way the case-change operation is done only once, and so is more efficient.

For example, in a factory pattern:

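// Requires: import java.util.Locale;
public static SuperObject objectFactory(String objectName) {
    // Normalize once; Locale.ROOT keeps the result independent of the default locale.
    switch (objectName.toUpperCase(Locale.ROOT)) {
        case "OBJECT1":
            return new SubObject1();
        case "OBJECT2":
            return new SubObject2();
        case "OBJECT3":
            return new SubObject3();
        default:
            // Unknown name. No break is needed after a return;
            // an unreachable break would not compile.
            return null;
    }
}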

(A switch statement on Strings can be slightly faster than an if..else if..else chain of comparisons, because the compiler turns it into a hashCode()-based jump followed by an equals() check.)
