Google App Engine Search API

When querying a search index with the Python version of the GAE Search API, what is the best practice for returning documents whose title matches the search words first, followed by documents whose body matches?

For example, given:

from google.appengine.api import search

body = """This is the body of the document,
with a set of words"""

my_document = search.Document(
  fields=[
    search.TextField(name='title', value='A Set Of Words'),
    search.TextField(name='body', value=body),
  ])

If it is possible, how might one search an index of Documents of the above form so that results come back in the following priority, where the phrase being searched for is in the variable qs:

  1. Documents whose title matches qs; then
  2. Documents whose body matches the words in qs.

It seems like the correct solution is to use a MatchScorer, but I may be off the mark, as I have not used this search functionality before. It is not clear from the documentation how to use the MatchScorer. I presume one subclasses it and overrides some method, but since this is not documented and I have not delved into the code, I cannot say for sure.

Is there something here that I am missing, or is this the correct strategy? Did I miss where this sort of thing is documented?


Just for clarity, here is a more elaborate example of the desired outcome:

documents = [
  dict(title="Alpha", body="A"),          # "Alpha"
  dict(title="Beta", body="B Two"),       # "Beta"
  dict(title="Alpha Two", body="A"),      # "Alpha2"
]

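for doc in documents:
  # Build a Document from each dict; dict values are read by key, not attribute.
  document = search.Document(
    fields=[
       search.TextField(name="title", value=doc["title"]),
       search.TextField(name="body", value=doc["body"]),
    ]
  )
  index.put(document)  # for some search.Index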

# Then when we search, we search the Title and Body.
index.search("Alpha")
# returns [Alpha, Alpha2]

# Results where the search is found in the Title are given higher weight.
index.search("Two")
# returns [Alpha2, Beta]  -- note Alpha2 has 'Two' in the title.

Custom scoring is one of our top-priority feature requests. We're hoping to have a good way to do this sort of thing as soon as possible.

In your particular case, you could of course achieve the desired result by doing two separate queries: the first one with a field restriction on "title", and the second restricted to "body".
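
The two-query approach could be sketched roughly as follows. This is a minimal sketch, not an official recipe: the helper name search_title_then_body is hypothetical, it assumes the index object behaves like a search.Index (a search() method that takes a query string and returns an iterable of results carrying a doc_id), and it assumes the field:"phrase" query syntax for restricting a match to one field. A document matched by both queries is kept only once, in its title position.

```python
def search_title_then_body(index, qs):
    """Return matching documents, title matches first, then body-only matches.

    `index` is assumed to behave like a search.Index; `qs` is the phrase
    to look for. Both the helper and its name are illustrative only.
    """
    # Drop embedded double quotes so the phrase can be quoted safely.
    phrase = qs.replace('"', ' ').strip()

    ordered = []
    seen_ids = set()
    # Query the title field first so its matches come first in the output.
    for field in ("title", "body"):
        query_string = '%s:"%s"' % (field, phrase)
        for doc in index.search(query_string):
            # Skip documents already returned by the title query.
            if doc.doc_id not in seen_ids:
                seen_ids.add(doc.doc_id)
                ordered.append(doc)
    return ordered
```

Within each of the two groups, the order is whatever the index returns for that query; only the title-before-body grouping is imposed here.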
