
Indexing in Vespa is slow

Indexing documents into a local Vespa instance is slow.

My configuration:

<container id="default" version="1.0">
    <search />
    <document-api />
    <nodes>
        <node hostalias="node1" />
    </nodes>
</container>

<content id="bo" version="1.0">
    <redundancy>1</redundancy>
    <documents>
        <document type="psearch" mode="index" />
    </documents>
    <nodes>
        <node hostalias="node1" distribution-key="0" />
    </nodes>
</content>

and schema:

schema psearch {
    document psearch {
        field Id type int {
            indexing: summary | attribute
            attribute: fast-search
        }
        field Name type string {
            indexing: summary | index | attribute
            index: enable-bm25
        }
        field AdId type string {
            indexing: summary | index | attribute
            index: enable-bm25
        }
        field Country type string {
            indexing: summary | index | attribute
            index: enable-bm25
        }
        field Avatar type string {
            indexing: summary | index | attribute
            index: enable-bm25
        }
        field Value type long {
            indexing: summary | attribute
            attribute: fast-search
        }
        field Numbers type int {
            indexing: summary | attribute
            attribute: fast-search
        }
        field BotLastTime type long {
            indexing: summary | attribute
            attribute: fast-search
        }
        field BotDailyCount type int {
            indexing: summary | attribute
            attribute: fast-search
        }
        field Platform type string {
            indexing: summary | index | attribute
            index: enable-bm25
        }
    }

    fieldset default {
        fields: Id, Name, AdId, Country, Avatar, Numbers, BotLastTime, BotDailyCount, Platform
    }

    rank-profile default {
        first-phase {
            expression: nativeRank(Id, Name, AdId, Country, Avatar, Numbers, BotLastTime, BotDailyCount, Platform)
        }
    }
}

I use the /document/v1 API to push documents into Vespa (a POST to put a given document by ID): https://docs.vespa.ai/en/reference/document-v1-api-reference.html
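For readers unfamiliar with the request shape, here is a minimal Python sketch of how one such put could be built. The path layout (`/document/v1/<namespace>/<document-type>/docid/<id>`) and the `{"fields": ...}` body follow the /document/v1 reference linked above. The namespace `mynamespace`, the endpoint `http://localhost:8080`, and the helper name `build_put_request` are assumptions for illustration, not taken from the question. The request is only constructed here, not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_put_request(endpoint, namespace, doc_type, doc_id, fields):
    """Build (but do not send) a /document/v1 put for a single document."""
    # Path layout: /document/v1/<namespace>/<document-type>/docid/<id>
    path = "/document/v1/{}/{}/docid/{}".format(
        quote(namespace, safe=""),
        quote(doc_type, safe=""),
        quote(str(doc_id), safe=""),
    )
    # The document's fields go under a top-level "fields" object.
    body = json.dumps({"fields": fields}).encode("utf-8")
    return Request(
        endpoint.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and namespace; "psearch" is the document type from the schema.
req = build_put_request(
    "http://localhost:8080", "mynamespace", "psearch", 1, {"Id": 1, "Name": "example"}
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Doing that once per document in a loop keeps only one operation in flight at a time.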

In a test on local Vespa where I push 100k documents, it takes around 2.3 milliseconds to push one document.

I did the same test with Elasticsearch and the average time is around 1.7 milliseconds. I am trying to find a way of getting at least the same performance as in Elasticsearch.
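To put the two figures side by side, the short sketch below converts a per-document time into documents per second and into total time for the 100k-document test. It assumes the documents were pushed one at a time, so the per-document time equals the wall-clock time divided by the document count; the question does not say whether that was the case. The helper names are made up for illustration.

```python
def docs_per_second(ms_per_doc):
    """Throughput implied by a per-document time, if documents are sent one by one."""
    return 1000.0 / ms_per_doc


def total_seconds(ms_per_doc, doc_count):
    """Wall-clock time for doc_count documents at the given per-document time."""
    return ms_per_doc * doc_count / 1000.0


# Per-document times reported in the question.
for name, ms in (("Vespa", 2.3), ("Elasticsearch", 1.7)):
    print(
        f"{name}: {docs_per_second(ms):.0f} docs/s, "
        f"{total_seconds(ms, 100_000):.0f} s for 100k documents"
    )
```

Under that assumption the numbers work out to roughly 435 versus 588 documents per second, or about 230 versus 170 seconds for the whole run.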

Any idea how I can improve the time for each document push?

Did you try using https://docs.vespa.ai/en/vespa-feed-client.html? It is optimized for throughput and is normally the best client for pushing indexing load. This question was also asked at https://github.com/vespa-engine/vespa/issues/25715, where more answers can be found.
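The general idea behind a throughput-oriented feeder is to keep many operations in flight at once instead of waiting for each response before sending the next document. The Python sketch below illustrates only that idea using a thread pool. It is not vespa-feed-client and does not use its API; `feed_concurrently`, `max_in_flight`, and the stand-in sender `fake_send` are hypothetical names. A real `send` would issue one /document/v1 put per document.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def feed_concurrently(documents, send, max_in_flight=32):
    """Call send(doc) for every document, with up to max_in_flight calls running at once.

    Returns a (succeeded, failed) tuple of counts.
    """

    def attempt(doc):
        # A failed send is counted, not raised, so one bad document does not stop the feed.
        try:
            send(doc)
            return True
        except Exception:
            return False

    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        outcomes = list(pool.map(attempt, documents))
    succeeded = sum(outcomes)
    return succeeded, len(outcomes) - succeeded


def fake_send(doc):
    """Stand-in for an HTTP put: just waits about 2 ms (assumption for the demo)."""
    time.sleep(0.002)


start = time.perf_counter()
ok, failed = feed_concurrently(range(200), fake_send, max_in_flight=32)
elapsed = time.perf_counter() - start
print(f"{ok} ok, {failed} failed in {elapsed:.3f} s")
```

With a sender that is dominated by waiting on the network, raising `max_in_flight` shortens the total run even though each individual call takes just as long. For actual feeding, the client linked in the answer is the better choice.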
