
Postgres query to return records based on a fuzzy match with values in a set

As seen below, I have two tables, one containing information about people and the other containing information about sports.

I want to run a query on the person table and return only the records whose description contains a sport listed in the cost table. If description contained only the sport and no other text, I could easily do this as an inner join after converting everything to lowercase. However, because of the additional text in description, I think I might need a subquery and/or a regular expression.

 name  | age |               description                
-------+-----+------------------------------------------
 bill  |  15 | I like to play soccer
 bob   |  20 | In my free time, I like to play BASEBALL
 jim   |  25 | I play video games everyday!!
 tony  |  30 | Im a really big fan of Hockey!!
 sandy |  35 | I could play soccer and hockey everyday


  sport   | cost 
----------+------
 soccer   |  100
 baseball |  150
 hockey   |  200

Ultimately, this query would return the following table, which does not include jim, as none of the words in his description appear in the sport column of the cost table. Sometimes a sport might be one word, other times it might be multiple words. If a sport contains multiple words, I want all of those words to be present together in the description for the record to be returned.

 name  | age |               description                
-------+-----+------------------------------------------
 bill  |  15 | I like to play soccer
 bob   |  20 | In my free time, I like to play BASEBALL
 tony  |  30 | Im a really big fan of Hockey!!
 sandy |  35 | I could play soccer and hockey everyday

I know that I could do this individually for each sport, but I'm hoping there is a better way.

SELECT *
FROM person
WHERE lower(description) LIKE '%hockey%';

 name  | age |               description               
-------+-----+-----------------------------------------
 tony  |  30 | Im a really big fan of Hockey!!
 sandy |  35 | I could play soccer and hockey everyday

Code to create the tables:


CREATE TABLE person (name VARCHAR(10), age INT, description VARCHAR(100));
-- String literals need single quotes in Postgres; double quotes denote identifiers.
INSERT INTO person (name, age, description) VALUES ('bill', 15, 'I like to play soccer');
INSERT INTO person (name, age, description) VALUES ('bob', 20, 'In my free time, I like to play BASEBALL');
INSERT INTO person (name, age, description) VALUES ('jim', 25, 'I play video games everyday!!');
INSERT INTO person (name, age, description) VALUES ('tony', 30, 'Im a really big fan of Hockey!!');
INSERT INTO person (name, age, description) VALUES ('sandy', 35, 'I could play soccer and hockey everyday');

CREATE TABLE cost (sport VARCHAR(10), cost INT);
INSERT INTO cost (sport, cost) VALUES ('soccer', 100);
INSERT INTO cost (sport, cost) VALUES ('baseball', 150);
INSERT INTO cost (sport, cost) VALUES ('hockey', 200);

You can use a join:

SELECT DISTINCT p.name, p.age, p.description
FROM person p
  -- ILIKE matches case-insensitively, so 'BASEBALL' and 'Hockey' are found too
  JOIN cost c ON p.description ILIKE '%' || c.sport || '%';

DISTINCT is necessary to avoid getting two rows for sandy, whose description matches two sports.
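To see why the duplicates arise, it can help to run the join without DISTINCT and include the matched sport. This is only a sketch against the person and cost tables from the question, using ILIKE for a case-insensitive match:

```sql
-- One output row per (person, matching sport) pair.
-- sandy's description mentions both soccer and hockey,
-- so sandy shows up twice here.
SELECT p.name, c.sport, c.cost
FROM person p
  JOIN cost c ON p.description ILIKE '%' || c.sport || '%'
ORDER BY p.name, c.sport;
```

If the matched sport and its cost are needed in the result, this form is arguably the more useful one. DISTINCT on the person columns only matters when each person should appear once.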

Alternatively, you can use EXISTS and a subquery:

SELECT p.name, p.age, p.description
FROM person p
WHERE EXISTS (
  SELECT 1
  FROM cost c
  -- ILIKE matches case-insensitively, so 'BASEBALL' and 'Hockey' are found too
  WHERE p.description ILIKE '%' || c.sport || '%');

EXISTS only checks whether the subquery returns at least one row, so what the subquery selects is irrelevant. A constant such as 1 is as good as anything.
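A pattern of the form '%sport%' is a plain substring match, so a hypothetical sport such as 'golf' would also match a description containing 'golfer'. If whole-word matching is wanted, including for multi-word sports that must appear together, one possible variant uses a case-insensitive regular expression with word-boundary anchors. This is a sketch that assumes the sport names contain no regular-expression metacharacters; names that do would need escaping first.

```sql
SELECT p.name, p.age, p.description
FROM person p
WHERE EXISTS (
  SELECT 1
  FROM cost c
  -- ~* is the case-insensitive regex match operator;
  -- \m anchors at the start of a word and \M at the end of a word.
  -- Assumption: c.sport contains no regex metacharacters.
  WHERE p.description ~* ('\m' || c.sport || '\M'));
```

Note that a regex match like this generally cannot use an ordinary B-tree index, so on large tables it will likely scan every person and cost combination.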
