
Strange result searching with to_tsquery under Postgresql

I get a strange result when searching for an expression like pro-physik.de with tsquery.

If I search for pro-physik:* with to_tsquery, I expect to get all entries starting with pro-physik. Unfortunately, the entries containing pro-physik.de are missing.

Here are 2 examples to demonstrate the problem:

Query 1:

select 
    to_tsvector('simple', 'pro-physik.de') @@ 
    to_tsquery('simple', 'pro-physik:*') = true

Result 1: false (should be true)

Query 2:

select 
    to_tsvector('simple', 'pro-physik.de') @@
    to_tsquery('simple', 'pro-p:*') = true

Result 2: true

Does anybody have an idea how I could solve this problem?

The core of the problem is that the parser will parse pro-physik.de as a hostname:

SELECT alias, token FROM ts_debug('simple', 'pro-physik.de');

 alias |     token
-------+---------------
 host  | pro-physik.de
(1 row)

Compare this:

SELECT alias, token FROM ts_debug('simple', 'pro-physik-de');

      alias      |     token
-----------------+---------------
 asciihword      | pro-physik-de
 hword_asciipart | pro
 blank           | -
 hword_asciipart | physik
 blank           | -
 hword_asciipart | de
(6 rows)

pro-physik and pro-p, on the other hand, are not hostnames. They are parsed as hyphenated words and split into their parts, so you get

SELECT to_tsquery('simple', 'pro-physik:*');
              to_tsquery
---------------------------------------
 'pro-physik':* & 'pro':* & 'physik':*
(1 row)

SELECT to_tsquery('simple', 'pro-p:*');
         to_tsquery
-----------------------------
 'pro-p':* & 'pro':* & 'p':*
(1 row)

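The tsvector contains only the single lexeme pro-physik.de, and every term of the tsquery must match it. The first tsquery will not match because physik is not a prefix of pro-physik.de, and the second will match because pro-p, pro and p are all prefixes of it.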
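
To see which term makes the first query fail, each prefix term can be tested on its own against the same tsvector. This is only a sketch: the column aliases are made up for illustration, and the quoted lexeme is cast directly to tsquery so that it is not split into parts again.

```sql
-- Test each term of 'pro-physik':* & 'pro':* & 'physik':* separately.
SELECT
    -- the cast keeps 'pro-physik' as one lexeme instead of re-parsing it
    to_tsvector('simple', 'pro-physik.de') @@ $$'pro-physik':*$$::tsquery        AS whole_word_matches,
    to_tsvector('simple', 'pro-physik.de') @@ to_tsquery('simple', 'pro:*')    AS pro_matches,
    to_tsvector('simple', 'pro-physik.de') @@ to_tsquery('simple', 'physik:*') AS physik_matches;
```

Only the physik term should come back false, and since the terms are combined with &, that is enough to make the whole query fail.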

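As a workaround, replace the dots with spaces in both the document and the search term, so that the parser no longer sees a hostname and splits the text into hyphenated word parts instead: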

select 
   to_tsvector('simple', replace('pro-physik.de', '.', ' ')) @@ 
   to_tsquery('simple', replace('pro-physik:*', '.', ' '))
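
Applied to stored data, the same idea might look like the following sketch. The table entries and its column name are hypothetical and not taken from the question; the search term is written as a literal here because it contains no dots.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TEMP TABLE entries (name text);
INSERT INTO entries VALUES ('pro-physik.de'), ('example.org');

-- Strip the dots before building the tsvector so that the host
-- token type does not swallow the hyphenated word parts.
SELECT name
FROM entries
WHERE to_tsvector('simple', replace(name, '.', ' '))
      @@ to_tsquery('simple', 'pro-physik:*');
```

If the search should use an index, the index would presumably have to be built on the same expression, to_tsvector('simple', replace(name, '.', ' ')), for the planner to consider it.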
