
Why does LINQ to Entities string.Contains(string.Empty) match anything but string.Contains(string.Empty.ToLower()) match nothing?

I am writing a query for a repository service for an inventory table. FYI: We are using C# 7.0, EF 6, and we are using Moq for testing our queries.

I learned that when string.Contains(...), which is case-sensitive by default, is put into a LINQ query and converted to SQL, the result is case-insensitive (I found other SO posts to help with that and we'll deal with it). I also found that string.Contains(...) seems to have a quirk when the argument is string.Empty and is converted to lower case (I found no SO posts about this).

Attempts to use the case-insensitive string.Contains(...) overloads are beaten back with an exception when LINQ to Entities attempts to convert them to SQL, so I have to manually specify column.ToLower().Contains(argument.ToLower()) in order for both the LINQ to Entities SQL query to operate as intended and the mocked-up unit test for case-insensitivity to pass.

Problem: if the argument is string.Empty, nothing is matched. The culprit is the conversion of the argument to lower case inside the query.

This is not a roadblock (simply moving the argument.ToLower() check outside the query solved the issue, and it'd be a tad more efficient anyway), but I still want to know what's up.

public List<InventoryModel> FindByTrackingNumberSubstring( string substring )
{
    // (bad) matches nothing when argument is string.Empty
    //var query = _modelTable.Where( entity => entity.Tracking_Number.ToLower().Contains( substring.ToLower() ) );

    // (good) matches everything when argument is string.Empty
    string lower = substring.ToLower();
    var query = _modelTable.Where( entity => entity.Tracking_Number.ToLower().Contains( lower ) );

    return query.ToList<InventoryModel>();
}

// SQL for queries 1 and 2, respectively (stripped out SELECT and FROM stuff for brevity)
WHERE ((CASE WHEN (( CAST(CHARINDEX(LOWER(@p__linq__0), LOWER([Extent1].[Tracking Number])) AS int)) > 0) THEN cast(1 as bit) WHEN ( NOT (( CAST(CHARINDEX(LOWER(@p__linq__0), LOWER([Extent1].[Tracking Number])) AS int)) > 0)) THEN cast(0 as bit) END) = 1)
WHERE ((CASE WHEN (LOWER([Extent1].[Tracking Number]) LIKE @p__linq__0 ESCAPE N'~') THEN cast(1 as bit) WHEN ( NOT (LOWER([Extent1].[Tracking Number]) LIKE @p__linq__0 ESCAPE N'~')) THEN cast(0 as bit) END) = 1)

I did some checking and found that, in the LINQ to Entities SQL query, string.Contains(string.Empty) matches anything, and that string.Empty.ToLower() == string.Empty is true, but put these two together and C# and LINQ to Entities diverge. In C#, string.Contains(string.Empty.ToLower()) matches anything (as expected), but in LINQ to Entities it matches nothing.
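For reference, the plain C# (LINQ to Objects) side of that comparison can be checked with a small console program. This is only a sketch; the sample tracking number is a made-up value, not data from the real table.

```csharp
using System;

class ContainsEmptyDemo
{
    static void Main()
    {
        // Hypothetical sample value, used only for illustration.
        string trackingNumber = "ABC-001";

        // The empty string is a substring of every string.
        Console.WriteLine(trackingNumber.Contains(string.Empty)); // True

        // Lower-casing the empty string yields the empty string again.
        Console.WriteLine(string.Empty.ToLower() == string.Empty); // True

        // Combining the two still matches when evaluated in memory.
        Console.WriteLine(trackingNumber.ToLower().Contains(string.Empty.ToLower())); // True
    }
}
```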

Why?

I believe this is a quirk of the SQL Server provider for EF: when you call .ToLower() on both the criterion and the field being compared, it recognizes the request as explicitly case-insensitive and replaces the LIKE query with a CHARINDEX comparison, which does not handle empty strings the same way in SQL Server. CHARINDEX returns 0 when the search string is empty, so the > 0 test fails for every row, whereas a LIKE pattern of '%%' matches any non-null value. Case sensitivity itself depends on the database engine and, in the case of SQL Server, on the collation selected for strings in the database. I'm not sure why LOWER(Tracking_Number) LIKE LOWER('%%') couldn't have been used.
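To make the difference between the two generated predicates concrete, here is a rough in-memory model of them. The CharIndex and LikeContains helpers are hypothetical stand-ins written for this illustration, not EF or SQL Server APIs; they only mimic how SQL Server treats an empty search string, and they ignore collation and wildcard escaping entirely.

```csharp
using System;

static class SqlServerModel
{
    // Hypothetical stand-in for T-SQL CHARINDEX: 1-based position of the
    // match, or 0 when nothing is found. SQL Server also returns 0 when
    // the search expression is the empty string.
    public static int CharIndex(string search, string target)
    {
        if (search.Length == 0) return 0;
        return target.IndexOf(search, StringComparison.Ordinal) + 1;
    }

    // Hypothetical stand-in for "target LIKE '%' + search + '%'",
    // assuming search contains no wildcard characters.
    public static bool LikeContains(string search, string target)
        => target.Contains(search);
}

class PredicateDemo
{
    static void Main()
    {
        string column = "ABC-001";      // made-up column value
        string argument = string.Empty; // the problematic input

        // Shape of query 1: CHARINDEX(LOWER(@p), LOWER(column)) > 0
        bool query1 = SqlServerModel.CharIndex(argument.ToLower(), column.ToLower()) > 0;

        // Shape of query 2: LOWER(column) LIKE @p, with @p lower-cased beforehand
        bool query2 = SqlServerModel.LikeContains(argument.ToLower(), column.ToLower());

        Console.WriteLine(query1); // False
        Console.WriteLine(query2); // True
    }
}
```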

Personally, when composing EF LINQ expressions, my querying code always checks strings with IsNullOrEmpty and does not append .Where() conditions when no actual criterion has been supplied. This way WHERE clauses are only applied for provided criteria.

I.e., if I trust the DB won't be collated case-sensitive:

if(!string.IsNullOrEmpty(substring))
    query = query.Where(entity => entity.Tracking_Number.Contains(substring));

If I am concerned that the database could be collated case-sensitive:

if(!string.IsNullOrEmpty(substring))
    query = query.Where( entity => entity.Tracking_Number.ToLower().Contains(substring.ToLower()));
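Put together, the conditional filter could look something like the sketch below. It runs against an in-memory list via AsQueryable() purely so it is self-contained; the FilterByTrackingNumber helper, the cut-down InventoryModel class, and the sample rows are all assumptions made for the example. Lower-casing the argument before the query also sidesteps the CHARINDEX translation described in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the real entity (assumed shape).
class InventoryModel
{
    public string Tracking_Number { get; set; }
}

static class InventoryQueries
{
    // Hypothetical helper: only appends the Where() when a criterion exists.
    public static IQueryable<InventoryModel> FilterByTrackingNumber(
        IQueryable<InventoryModel> query, string substring)
    {
        if (string.IsNullOrEmpty(substring))
            return query; // no criterion supplied, so no WHERE clause

        // Lower-case the argument outside the expression so the provider
        // receives a plain parameter rather than a function call on one.
        string lower = substring.ToLower();
        return query.Where(entity => entity.Tracking_Number.ToLower().Contains(lower));
    }
}

class FilterDemo
{
    static void Main()
    {
        // Made-up rows standing in for the inventory table.
        var table = new List<InventoryModel>
        {
            new InventoryModel { Tracking_Number = "ABC-001" },
            new InventoryModel { Tracking_Number = "xyz-002" },
        }.AsQueryable();

        Console.WriteLine(InventoryQueries.FilterByTrackingNumber(table, "").Count());    // 2
        Console.WriteLine(InventoryQueries.FilterByTrackingNumber(table, null).Count());  // 2
        Console.WriteLine(InventoryQueries.FilterByTrackingNumber(table, "abc").Count()); // 1
    }
}
```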

Even then, I would prefer to set a standard that Tracking_Number is always stored as a lower-case value if the database serves solely this application. The entity properties would enforce that any value set is lower-cased (removing the need for .Tracking_Number.ToLower() in the queries).
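A minimal sketch of such a property, assuming a backing field on the entity; the field name and the choice of ToLowerInvariant() are illustrative assumptions, and existing rows would still need to be normalized separately.

```csharp
using System;

class InventoryModel
{
    // Hypothetical backing field; the setter normalizes every assigned value.
    private string _trackingNumber;

    public string Tracking_Number
    {
        get => _trackingNumber;
        // Null-safe: a null assignment stays null instead of throwing.
        set => _trackingNumber = value?.ToLowerInvariant();
    }
}

class SetterDemo
{
    static void Main()
    {
        var item = new InventoryModel { Tracking_Number = "ABC-001" };
        Console.WriteLine(item.Tracking_Number); // abc-001
    }
}
```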


 