Count how many different search words appear in each row

I have a MySQL table like this:

id         content
-----      ------
1          Big green tree
2          Small green tree
3          Green tree 
4          Small yellow tree
5          Big green lake
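
For anyone who wants to reproduce this, a minimal setup might look like the following. The column types are assumptions, since only the column names and sample rows are given; the table name `elements` is taken from the query further down.

```sql
-- Assumed schema: the question shows only the column names, not their types.
CREATE TABLE `elements` (
  `id`      INT PRIMARY KEY,
  `content` VARCHAR(255) NOT NULL
);

-- Sample rows exactly as listed in the question.
INSERT INTO `elements` (`id`, `content`) VALUES
  (1, 'Big green tree'),
  (2, 'Small green tree'),
  (3, 'Green tree'),
  (4, 'Small yellow tree'),
  (5, 'Big green lake');
```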

For each row, I want to count how many of the search words appear in it.

Example: if I search for Big, green and tree, it should return a result like this:

id         count
-----      ------
1          3
2          2
3          2 
4          1
5          2

I tried something like:

SELECT `content`
     , COUNT(*) as count 
  FROM `elements` 
 WHERE `content` LIKE '%Big%' 
    OR `content` LIKE '%green%' 
    OR `content` LIKE '%tree%' 
GROUP 
    BY `id` 
 ORDER BY count DESC;

It does not work: grouping by id leaves one row per matching id, so every count is 1:

id         count
-----      ------
1          1
2          1
3          1 
4          1
5          1

You can use REGEXP with word boundaries. The match is case-insensitive; if a case-sensitive match is needed, use REGEXP BINARY.

SELECT `content`, 
CASE WHEN `content` REGEXP '[[:<:]]big[[:>:]]' THEN 1 ELSE 0 END +
CASE WHEN `content` REGEXP '[[:<:]]green[[:>:]]' THEN 1 ELSE 0 END +
CASE WHEN `content` REGEXP '[[:<:]]tree[[:>:]]' THEN 1 ELSE 0 END
       as num_matches        
FROM `elements`
ORDER BY id
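
Note that the `[[:<:]]` and `[[:>:]]` markers belong to the regex library MySQL used before version 8.0.4. From 8.0.4 on, REGEXP is implemented with ICU, which does not accept them and uses `\b` for a word boundary instead. A sketch of the same query for newer servers, assuming the default SQL mode in which a backslash inside a string literal must be doubled (selecting `id` as well is an addition for readability):

```sql
-- Assumes MySQL 8.0.4 or later (ICU regular expressions).
-- '\\b' in the string literal reaches the regex engine as \b (word boundary);
-- this relies on NO_BACKSLASH_ESCAPES not being enabled.
SELECT `id`,
       `content`,
       CASE WHEN `content` REGEXP '\\bbig\\b'   THEN 1 ELSE 0 END +
       CASE WHEN `content` REGEXP '\\bgreen\\b' THEN 1 ELSE 0 END +
       CASE WHEN `content` REGEXP '\\btree\\b'  THEN 1 ELSE 0 END
         AS num_matches
  FROM `elements`
 ORDER BY `id`;
```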

Edit: Based on the OP's comment, to return only rows where num_matches > 0:

SELECT * FROM (
SELECT `content`, 
CASE WHEN `content` REGEXP '[[:<:]]big[[:>:]]' THEN 1 ELSE 0 END +
CASE WHEN `content` REGEXP '[[:<:]]green[[:>:]]' THEN 1 ELSE 0 END +
CASE WHEN `content` REGEXP '[[:<:]]tree[[:>:]]' THEN 1 ELSE 0 END
       as num_matches        
FROM `elements`) t
WHERE num_matches > 0

If you don't care about duplicate words within content:

SELECT `content`, 
       ((CASE WHEN `content` LIKE '%Big%' THEN 1 ELSE 0 END) +
        (CASE WHEN `content` LIKE '%green%' THEN 1 ELSE 0 END) +
        (CASE WHEN `content` LIKE '%tree%' THEN 1 ELSE 0 END)
       ) as matches        
FROM `elements`
WHERE `content` LIKE '%Big%' OR
      `content` LIKE '%green%' OR
      `content` LIKE '%tree%'
ORDER BY matches DESC;
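
In MySQL a comparison such as LIKE evaluates to 1 or 0, so the CASE wrappers are optional. A shorter sketch of the same idea, relying on that MySQL-specific behaviour and on MySQL's extension that lets HAVING refer to a select alias without GROUP BY:

```sql
-- MySQL-specific: each LIKE yields 1 (match) or 0 (no match), so the
-- three results can be added directly.
SELECT `id`,
       `content`,
       (`content` LIKE '%Big%')
     + (`content` LIKE '%green%')
     + (`content` LIKE '%tree%') AS matches
  FROM `elements`
-- HAVING on the alias replaces the repeated OR conditions in WHERE.
HAVING matches > 0
 ORDER BY matches DESC;
```

Like the CASE version, this matches substrings, so '%tree%' would also count a row containing "street".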

If you don't want to use CASE, you can count the words like this:

SELECT id, COUNT(*) as count 
  FROM (
     select id from elements WHERE content LIKE '%Big%'
     union all 
     select id from elements WHERE content LIKE '%green%'
     union all 
     select id from elements WHERE content LIKE '%tree%'
  ) as t
GROUP BY id
ORDER BY count DESC;
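
The same counting can be sketched with the search words held in a small derived table, so that adding a word means adding one line rather than another subquery. The aliases `e`, `w` and `word` are illustrative names, not part of the original answer:

```sql
-- Each (row, word) pair that matches produces one joined row;
-- COUNT(*) per id is then the number of matching search words.
SELECT e.`id`, COUNT(*) AS `count`
  FROM `elements` AS e
  JOIN (
        SELECT 'Big' AS word
        UNION ALL SELECT 'green'
        UNION ALL SELECT 'tree'
       ) AS w
    ON e.`content` LIKE CONCAT('%', w.word, '%')
 GROUP BY e.`id`
 ORDER BY `count` DESC;
```

As with the UNION ALL version, rows that match none of the words do not appear in the result.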
