
Find the most common word in a sql column (with multiple words)

For example, I have this column in sqlite3:

hello world
hello you two
hello world
hello hello

I want to extract the most popular word and its number of occurrences. However, so far it seems only possible to count the occurrences of a whole cell, like this:

SELECT titles, COUNT(titles) 
FROM standart_results
GROUP BY titles
ORDER BY count(*) DESC

It will return ("hello world", 2) . But I want ("hello", 5) .

I cannot use LIKE either, since I do not know in advance which word has the most occurrences.

Do I need to transfer the data into a variable and use regex on it, or can I do it with SQL?

SQLite has limited string-processing functions and no built-in function for splitting a string into rows. However, it does support recursive CTEs, and you can use one to break each title into words (the table is called t and the column title here):

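with recursive cte as (
      -- anchor: one row per title, with a trailing space so the last word is also terminated
      select null as word, title || ' ' as rest, 0 as lev
      from t
      union all
      -- recursive step: peel the first word off rest and keep the remainder
      select substr(rest, 1, instr(rest, ' ') - 1) as word, 
             substr(rest, instr(rest, ' ') + 1) rest,
             lev + 1
      from cte
      -- lev < 5 caps the split at five words per title; raise it for longer titles
      where lev < 5 and rest like '% %'
     )
select word, count(*)
from cte
where word is not null   -- skip the anchor rows, which carry no word
group by word;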

Here is a db<>fiddle.
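For anyone who wants to try this locally, here is a minimal sketch that runs the same query against an in-memory database using Python's standard sqlite3 module. The helper name word_counts and the sample list are only for illustration, and it assumes an SQLite build recent enough to support recursive CTEs and instr().

```python
import sqlite3

# Same recursive CTE as in the answer; table `t` and column `title`
# follow the answer's naming.
WORD_COUNT_SQL = """
with recursive cte as (
      select null as word, title || ' ' as rest, 0 as lev
      from t
      union all
      select substr(rest, 1, instr(rest, ' ') - 1) as word,
             substr(rest, instr(rest, ' ') + 1) as rest,
             lev + 1
      from cte
      where lev < 5 and rest like '% %'
     )
select word, count(*)
from cte
where word is not null
group by word
"""


def word_counts(titles):
    """Load titles into a throwaway in-memory table and return {word: count}."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (title text)")
        conn.executemany("insert into t (title) values (?)",
                         [(s,) for s in titles])
        return dict(conn.execute(WORD_COUNT_SQL).fetchall())
    finally:
        conn.close()


# The four rows from the question.
sample = ["hello world", "hello you two", "hello world", "hello hello"]
counts = word_counts(sample)
print(counts)
```

With the question's four rows, "hello" should come out with a count of 5 and "world" with 2.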

To get only the top word, keep the same cte definition and replace the final select with:

select word, count(*)
from cte
where word is not null
group by word
order by count(*) desc
limit 1;
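Put together as one complete statement, it might look like the sketch below, again run through Python's sqlite3 module. This variant is an adaptation rather than the exact query above: it uses the question's table and column names (standart_results, titles), drops the lev counter (the recursion still stops because rest gets shorter on every step), and filters out empty words produced by repeated spaces. The helper name top_word is hypothetical. A plain-Python count with collections.Counter is included as a cross-check, since the question also asked about pulling the data into a variable.

```python
import sqlite3
from collections import Counter

# Adapted from the answer's CTE: question's table/column names, no depth cap,
# and empty "words" (from consecutive spaces) are ignored.
TOP_WORD_SQL = """
with recursive cte as (
      select null as word, titles || ' ' as rest
      from standart_results
      union all
      select substr(rest, 1, instr(rest, ' ') - 1),
             substr(rest, instr(rest, ' ') + 1)
      from cte
      where rest like '% %'
     )
select word, count(*) as occurrences
from cte
where word is not null and word <> ''
group by word
order by occurrences desc
limit 1
"""


def top_word(titles):
    """Return (word, occurrences) for the most frequent word, or None if empty."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table standart_results (titles text)")
        conn.executemany("insert into standart_results (titles) values (?)",
                         [(s,) for s in titles])
        return conn.execute(TOP_WORD_SQL).fetchone()
    finally:
        conn.close()


sample = ["hello world", "hello you two", "hello world", "hello hello"]
sql_result = top_word(sample)

# Cross-check: the same count done in Python after fetching the rows.
py_result = Counter(w for s in sample for w in s.split()).most_common(1)[0]

print(sql_result, py_result)
```

If several words tie for first place, limit 1 returns just one of them, and which one is not guaranteed unless a second sort key is added.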

You can try this.

SELECT titles, COUNT(titles) as Appearances
FROM standart_results
GROUP BY titles
ORDER BY Appearances DESC LIMIT 1

