简体   繁体   中英

Sphinx Search / MySQL find most common words

I have a sphinx search index and would like to find what are the most common words in my index. Ideally have a list of words ordered by frequency.

If it is not possible to do it using Sphinx, is there a way to query a mysql table's text fields to get the same stat?

Yes. This is quite simple. Build them with indexer using the --buildstops and --buildfreqs flags.

indexer --config /path/to/sphinx.conf  indexName --buildfreqs --buildstops freq_wordlist.txt 100000

This example gives you the first 100000 word in your sphinx index ordered by its frequency

Build them with indexer using the --buildstops and --buildfreqs flags.

Just be aware this does not build from the existing index, but runs against the data source as if indexing, and builds the word frequencies. It does not affect the index itself.

If you use delta indexes where you save the id of the last indexed document, this will read the last saved id and work from there.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM