
Ignoring accents while searching the database using Entity Framework

I have a database table that contains names with accented characters, such as ä.

I need to use EF4 to get all records from a table whose name contains a given substring, regardless of accents.

So the following code:

myEntities.Items.Where(i => i.Name.Contains("a")); 

should return all items whose name contains a, but also all items whose name contains ä, â and so on. Is this possible?

If you set an accent-insensitive collation on the Name column, the query should work as required.

Setting an accent-insensitive collation will fix the problem.

You can change the collation of a column in SQL Server and Azure databases with the following query.

ALTER TABLE TableName
ALTER COLUMN ColumnName NVARCHAR (100)
COLLATE SQL_LATIN1_GENERAL_CP1_CI_AI NOT NULL

SQL_LATIN1_GENERAL_CP1_CI_AI is a collation in which LATIN1_GENERAL is English (United States), CP1 is code page 1252, CI means case-insensitive, and AI means accent-insensitive.
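As a rough illustration of what case-insensitive and accent-insensitive matching means, the sketch below does a comparable check in memory with .NET's CompareInfo. It is only an analogy: SQL Server applies its own collation rules on the server, and the class name, method name and sample data here are made up for the example.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class AccentInsensitiveDemo
{
    // Hypothetical helper: true when 'text' contains 'term',
    // ignoring both case (like CI) and diacritics (like AI).
    public static bool ContainsIgnoringAccents(string text, string term)
    {
        CompareInfo compareInfo = CultureInfo.InvariantCulture.CompareInfo;
        const CompareOptions options =
            CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace;
        return compareInfo.IndexOf(text, term, options) >= 0;
    }

    public static void Main()
    {
        // Sample data, assumed for illustration only.
        string[] names = { "Bär", "Château", "Smith", "ANDRÉ" };

        // Print every name that contains "a" under the relaxed comparison.
        foreach (string name in names.Where(n => ContainsIgnoringAccents(n, "a")))
        {
            Console.WriteLine(name);
        }
    }
}
```

This runs on data already loaded into memory, so it does not replace the column collation for queries that should be filtered by the database.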

I know this is not a very clean solution, but after reading this I tried something like the following:

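// Pass the search term as a parameter (@p0) instead of formatting it into the SQL text, which would allow SQL injection.
var query = this.DataContext.Users.SqlQuery("SELECT * FROM dbo.Users WHERE LastName LIKE '%' + @p0 + '%' COLLATE Latin1_General_CI_AI", parameters.SearchTerm);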

After that, you can still call methods such as Count, OrderBy, Skip etc. on the 'query' object.

You could create a SQL function that removes diacritics by applying the collation SQL_Latin1_General_CP1253_CI_AI to the input string, like so:

CREATE FUNCTION [dbo].[RemoveDiacritics] (
@input varchar(max)
)   RETURNS varchar(max)

AS BEGIN
DECLARE @result VARCHAR(max);

select @result = @input collate SQL_Latin1_General_CP1253_CI_AI

return @result
END

Then add it to the DB context (in this case ApplicationDbContext) by mapping it with the DbFunction attribute:

public class ApplicationDbContext : IdentityDbContext<CustomIdentityUser>
    {
        [DbFunction("RemoveDiacritics", "dbo")]
        public static string RemoveDiacritics(string input)
        {
            throw new NotImplementedException("This method can only be used with LINQ.");
        }

        public ApplicationDbContext(DbContextOptions<ApplicationDbContext> options)
            : base(options)
        {
        }
}

And use it in a LINQ query, for example:

var query = await db.Users.Where(a => ApplicationDbContext.RemoveDiacritics(a.Name).Contains(ApplicationDbContext.RemoveDiacritics(filter))).ToListAsync();

An accent-insensitive collation, as Stuart Dunkeld suggested, is definitely the best solution.

But it may be good to know:

Michael Kaplan once posted about stripping diacritics:

static string RemoveDiacritics(string stIn)
{
    string stFormD = stIn.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();

    for(int ich = 0; ich < stFormD.Length; ich++)
    {
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
        if(uc != UnicodeCategory.NonSpacingMark)
        {
            sb.Append(stFormD[ich]);
        }
    }

    return(sb.ToString().Normalize(NormalizationForm.FormC));
}


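So your code would be as follows. Note that LINQ to Entities cannot translate RemoveDiacritics to SQL, so the items have to be loaded first and filtered in memory:

myEntities.Items.AsEnumerable().Where(i => RemoveDiacritics(i.Name).Contains("a"));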
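For reference, here is a self-contained sketch of this in-memory approach, with the using directives the method needs. It strips diacritics from both the names and the search term, so a search for "ä" also finds "a". The class name, the Search helper and the sample data are assumptions for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public static class DiacriticsDemo
{
    // Decompose to FormD, drop the combining marks, then recompose to FormC.
    public static string RemoveDiacritics(string input)
    {
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // Hypothetical helper: filters an in-memory sequence of names,
    // ignoring case and diacritics on both sides of the comparison.
    public static List<string> Search(IEnumerable<string> names, string term)
    {
        string needle = RemoveDiacritics(term);
        return names
            .Where(n => RemoveDiacritics(n)
                .IndexOf(needle, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }

    public static void Main()
    {
        // Sample data, assumed for illustration only.
        var names = new[] { "Bär", "Château", "Smith" };
        foreach (string match in Search(names, "a"))
        {
            Console.WriteLine(match);
        }
    }
}
```

Because the filtering happens in the application, every row is transferred from the database first, which is why a collation-based filter is usually preferable for large tables.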
