Why does Vespa so readily warn me "This may lead to recall and ranking issues."?

Question

I am using parent-child mode, but when I deploy, Vespa warns about many of my fields: "This may lead to recall and ranking issues."

Eventually I reduced the problem to a single sd file:

search vp001 {
    document vp001 {
        field saleno type string {
            indexing: attribute | summary
            attribute: fast-search
        }
        
        field salename type string {
            indexing: attribute | summary
            attribute: fast-search
        }
    }
    
    fieldset default {
        fields: saleno, salename
    }
}

If I change the type of saleno from string to int, Vespa warns me again:

Uploading application package ... done

Success: Deployed myproject
WARNING For schema 'vp001', field 'saleno': The matching settings for the fields in fieldset 'default' are inconsistent (explicitly or because of field type). This may lead to recall and ranking issues.
WARNING For schema 'vp001', field 'saleno': The normalization settings for the fields in fieldset 'default' are inconsistent (explicitly or because of field type). This may lead to recall and ranking issues.
WARNING For schema 'vp001', field 'saleno': The stemming settings for the fields in the fieldset 'default' are inconsistent (explicitly or because of field type). This may lead to recall and ranking issues.

Why? Do I need to make sure all fields in a fieldset are of the same type? I also found that I cannot mix "index" and "attribute" fields in a fieldset; if I do, the warning appears too.

If the fields are not the same type and the warning appears, what ranking issues will actually happen?

Answer 1

Yes, you see this warning whenever you put fields with different kinds of tokenization in the same fieldset. This is because a given piece of text searching one fieldset is tokenized just once, so there's no right choice of tokenization in this case.

Attributes and text indexes are tokenized differently (exact match vs. tokenized match), so you'll see the warning when you mix them in one fieldset.
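To make the mismatch concrete, here is a toy sketch in Python. It is not Vespa's implementation: tokenize_text and exact_terms are hypothetical stand-ins for "tokenized match" and "exact match", and real Vespa processing (normalization, stemming) does more. It only illustrates that a query processed one way cannot match fields processed both ways.

```python
import re


def tokenize_text(value: str) -> list[str]:
    """Toy stand-in for text-index processing: lowercase, split into word tokens."""
    return re.findall(r"[a-z0-9]+", value.lower())


def exact_terms(value: str) -> list[str]:
    """Toy stand-in for exact matching: the whole value is a single term."""
    return [value.lower()]


def matches(query_terms: list[str], field_terms: list[str]) -> bool:
    """A document matches when every query term is present among the field terms."""
    return all(term in field_terms for term in query_terms)


value = "Blue Widget"

# The query text is processed once, so pick one of the two ways and
# compare it against a field processed each way.
for query_fn in (tokenize_text, exact_terms):
    for field_fn in (tokenize_text, exact_terms):
        hit = matches(query_fn(value), field_fn(value))
        print(f"query={query_fn.__name__:13} field={field_fn.__name__:13} match={hit}")
```

Whichever processing is picked for the query, the field that was processed the other way is missed, even though both fields hold the same value. That is the recall problem the warning refers to.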

In most cases you know whether a given text should match an unstructured text field (a string index field), or some structured data, so doing this is just an error. Otherwise, you need to use query expansion instead of a fieldset: Expand the query to search these fields separately, either on the client side, or in a Searcher component.
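A minimal client-side sketch of that expansion, in Python, assuming the vp001 schema from the question with saleno changed to int. The helper names build_yql and yql_string are hypothetical, and the choice to search saleno only when the input is numeric is an assumption of this sketch, not something the answer prescribes.

```python
import re


def yql_string(value: str) -> str:
    """Quote a value as a YQL string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_yql(user_text: str) -> str:
    """Expand one user input into per-field conditions instead of using a fieldset."""
    text = user_text.strip()
    if not text:
        raise ValueError("empty query text")

    # salename is a string field, so it is searched with the text as given.
    clauses = [f"salename contains {yql_string(text)}"]

    # Assumption: saleno is an int attribute, so it only takes numeric terms.
    if re.fullmatch(r"-?[0-9]+", text):
        clauses.append(f"saleno = {int(text)}")

    return "select * from vp001 where " + " or ".join(clauses)


print(build_yql("1001"))
print(build_yql("north region"))
```

The resulting string would be sent as the yql parameter of the query request. Each field then gets a term in the form its type expects, so the fieldset and its warning are no longer involved.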

Solution 1
2 2022-06-28 10:24:28
