
PostgreSQL full text search weight/priority on search terms

I am using Full Text Search in PostgreSQL through Django.

I want to associate weights with search terms. I know it is possible to assign different weights to different fields, but I want different weights on the individual search terms.

Example:

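from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector

from core.models import SkillName

# Search only the "name" field.
vector = SearchVector(
    "name",
)
# Match rows containing either term; both terms count equally here.
search = SearchQuery("Java") | SearchQuery("Spring")
search_result = (
    SkillName.objects.all()
        .annotate(search=vector)
        .filter(search=search)
        .annotate(rank=SearchRank(vector, search))
        .order_by("-rank")
)
for s in search_result.distinct():
    print(f"{s} rank: {s.rank}")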

Now I want "Java" to be more important than "Spring" and the ranking to reflect that. I guess I could run two separate searches and multiply the ranks by factors, but is there a better way?

Is it really that unusual to want to give search terms different priorities?

Here is the generated SQL for reference. I honestly don't think this is possible in Django right now anyway, so we might need the help of a PostgreSQL guru.

SELECT DISTINCT "core_skillname"."id",
                "core_skillname"."name",
                to_tsvector(COALESCE("core_skillname"."name", '')) AS "search",
                ts_rank(to_tsvector(COALESCE("core_skillname"."name", '')), (plainto_tsquery('Java') || plainto_tsquery('Spring'))) AS "rank"
FROM "core_skillname"
WHERE to_tsvector(COALESCE("core_skillname"."name", '')) @@ (plainto_tsquery('Java') || plainto_tsquery('Spring'))
ORDER BY "rank" DESC;

Applying the ranks with weights doesn't require two queries, just two sub-expressions in the same query.

SELECT DISTINCT "core_skillname"."id",
                "core_skillname"."name",
                to_tsvector(COALESCE("core_skillname"."name", '')) AS "search",
                ts_rank(to_tsvector(COALESCE("core_skillname"."name", '')), plainto_tsquery('Spring')) +
                ts_rank(to_tsvector(COALESCE("core_skillname"."name", '')), plainto_tsquery('Java')) * 1.5 AS "rank"
FROM "core_skillname"
WHERE to_tsvector(COALESCE("core_skillname"."name", '')) @@ (plainto_tsquery('Java') || plainto_tsquery('Spring'))
ORDER BY "rank" DESC;

Since it is this easy to solve the problem yourself, there is little reason to invent a separate mechanism for it. When the weights belong to the table data rather than to the query (as with per-field weights), you cannot do it this way, so a dedicated mechanism makes more sense there.
