
MySQL German accent-insensitive search in full-text searches

Let's take an example hotels table:

CREATE TABLE `hotels` (
  `HotelNo` varchar(4) character set latin1 NOT NULL default '0000',
  `Hotel` varchar(80) character set latin1 NOT NULL default '',
  `City` varchar(100) character set latin1 default NULL,
  `CityFR` varchar(100) character set latin1 default NULL,
  `Region` varchar(50) character set latin1 default NULL,
  `RegionFR` varchar(100) character set latin1 default NULL,
  `Country` varchar(50) character set latin1 default NULL,
  `CountryFR` varchar(50) character set latin1 default NULL,
  `HotelText` text character set latin1,
  `HotelTextFR` text character set latin1,
  `tagsforsearch` text character set latin1,
  `tagsforsearchFR` text character set latin1,
  PRIMARY KEY  (`HotelNo`),
  FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;

In this table we have, for example, only one hotel with Region name = "Graubünden" (please note the umlaut character ü).

Now I want the search to match both phrases the same way: 'graubunden' and 'graubünden'.

In regular searches this is simple with MySQL's built-in collations:

SELECT *  
FROM `hotels` 
WHERE `Region` LIKE CONVERT(_utf8 '%graubunden%' USING latin1) 
COLLATE latin1_german1_ci

This works fine for both 'graubunden' and 'graubünden', and I get the proper result. The problem appears when we use MySQL full-text search.

What's wrong with this SQL statement?

SELECT 
 *
FROM 
 hotels 
WHERE 
 MATCH (`HotelNo`,`Hotel`,`Address`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST( CONVERT('+graubunden' USING latin1)  COLLATE latin1_german1_ci IN BOOLEAN MODE)            
ORDER BY Country ASC, Region ASC, City ASC

This doesn't return any results. Any idea where the problem lies?

When you define an individual CHARACTER SET for each of your columns, you override the default collation you set at the table level.

Each of your columns has the default latin1 collation (which is latin1_swedish_ci). You can see it by running SHOW CREATE TABLE.
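One way to confirm which collation each column actually uses is to ask the server directly. This is a minimal sketch, assuming the hotels table from the question exists in the currently selected database:

```sql
-- List every column of `hotels` together with its effective collation.
SHOW FULL COLUMNS FROM `hotels`;

-- The same information from information_schema, limited to the current database.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM   information_schema.COLUMNS
WHERE  TABLE_SCHEMA = DATABASE()
  AND  TABLE_NAME   = 'hotels';
```

With the original table definition, the character columns should be reported as latin1_swedish_ci rather than latin1_german1_ci.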

In FULLTEXT queries, indexed columns have a COERCIBILITY of 0, that is, all full-text queries are converted to the collation used in the index, not vice versa.
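The practical difference between the two collations can be checked without the table at all. The following sketch assumes the client sends string literals as UTF-8 (hence the _utf8 introducer, as in the question's LIKE query). The column aliases are only illustrative names.

```sql
-- Compare the two spellings under each collation.
SELECT
  (CONVERT(_utf8 'graubünden' USING latin1) COLLATE latin1_german1_ci)
    = (CONVERT(_utf8 'graubunden' USING latin1) COLLATE latin1_german1_ci) AS german1_match,
  (CONVERT(_utf8 'graubünden' USING latin1) COLLATE latin1_swedish_ci)
    = (CONVERT(_utf8 'graubunden' USING latin1) COLLATE latin1_swedish_ci) AS swedish_match;
```

latin1_german1_ci treats ü as equal to u, while latin1_swedish_ci does not, so german1_match should come back as 1 and swedish_match as 0. That is why a search term compared under the index's latin1_swedish_ci collation fails to find 'graubünden'.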

You need to remove the CHARACTER SET definitions from your columns or explicitly set all columns to latin1_german1_ci:

CREATE TABLE `hotels` (
  `HotelNo` varchar(4) NOT NULL default '0000',
  `Hotel` varchar(80) NOT NULL default '',
  `City` varchar(100) default NULL,
  `CityFR` varchar(100) default NULL,
  `Region` varchar(50) default NULL,
  `RegionFR` varchar(100) default NULL,
  `Country` varchar(50) default NULL,
  `CountryFR` varchar(50) default NULL,
  `HotelText` text,
  `HotelTextFR` text,
  `tagsforsearch` text,
  `tagsforsearchFR` text,
  PRIMARY KEY  (`HotelNo`),
  FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;

INSERT
INTO    hotels (hotelText, HotelTextFR, tagsforsearch, tagsforsearchFR)
VALUES  ('text', 'text', 'graubünden', 'tags');

SELECT  *
FROM    hotels
WHERE   MATCH (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST (CONVERT('+graubunden' USING latin1) COLLATE latin1_german1_ci IN BOOLEAN MODE)
ORDER BY
        Country ASC, Region ASC, City ASC;
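If the table already holds data, recreating it may be inconvenient. An alternative sketch, which is not part of the original answer, is to convert the existing columns in place. It assumes every character column in the table should end up with the same collation, and it is worth trying on a backup copy first:

```sql
-- Convert all character columns of the table to latin1_german1_ci
-- and make that the table default; the table and its indexes are rebuilt.
ALTER TABLE `hotels`
  CONVERT TO CHARACTER SET latin1 COLLATE latin1_german1_ci;

-- Check that the per-column overrides are gone.
SHOW CREATE TABLE `hotels`;
```

After the conversion, the full-text query above should behave the same as it does against the freshly created table.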
