
Elasticsearch with Node.js: how to make a field unique when inserting a document into an index

I'm building a search engine for my project, using Elasticsearch with Node.js on the server.

Every night a parser scrapes data from a website and inserts it into the database. At the moment it re-inserts data I already have, so I end up with duplicates.

Can I mark a field as unique when inserting a document, for example title : {unique : true}, so that a document whose title already exists is not inserted?

Here is my code:

async function insertManual(manual) {
  // client.index() already returns a promise, so no extra Promise wrapper is needed
  const result = await client.index({
    index: 'completeindexthree',
    body: {
      brand: manual.brand,
      category: manual.category,
      url: manual.url,
      title: manual.title, // example {unique : true}
      parsingData: new Date().toString()
    }
  });
  // make the new document visible to search right away
  await client.indices.refresh({ index: 'completeindexthree' });
  return result;
}

My second question: how can I delete the duplicates (by title) that are already in that index, from Node.js rather than from Logstash?

TL;DR

Yes, it is possible, though not with a unique keyword. According to the documentation, if you index a document with an _id that already exists, the existing document is overwritten:

If the target is an index and the document already exists, the request updates the document and increments its version.

Further down, the same documentation says:

Using _create guarantees that the document is only indexed if it does not already exist.

To fix

Set an _id on each document and use the create API instead of index.

Your code may look like the following:

async function insertManual(manual) {
  // create() rejects with a 409 conflict if a document with this id already exists
  const result = await client.create({
    index: 'completeindexthree',
    id: manual.id, // <- Here is your unique id.
    body: {
      brand: manual.brand,
      category: manual.category,
      url: manual.url,
      title: manual.title,
      parsingData: new Date().toString()
    }
  });
  // make the new document visible to search right away
  await client.indices.refresh({ index: 'completeindexthree' });
  return result;
}
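Since the question wants uniqueness by title, one option is to derive the _id from the title itself and treat the 409 conflict as "already there, skip it". The following is only a sketch under a few assumptions: titleToId and insertIfNew are hypothetical helper names, the trim/lowercase normalisation is a guess at what should count as the same title, and client is an already-configured Elasticsearch client passed in by the caller.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a stable document id from the title.
// Trimming and lowercasing is an assumption about what "same title" means.
function titleToId(title) {
  return crypto
    .createHash('sha1')
    .update(title.trim().toLowerCase())
    .digest('hex');
}

// Hypothetical helper: returns true if the document was created,
// false if a document with the same title-derived id already existed.
async function insertIfNew(client, manual) {
  try {
    await client.create({
      index: 'completeindexthree',
      id: titleToId(manual.title),
      body: {
        brand: manual.brand,
        category: manual.category,
        url: manual.url,
        title: manual.title,
        parsingData: new Date().toString()
      }
    });
    return true;
  } catch (err) {
    // The status code location is checked in two places because it can
    // differ between client versions.
    const status = err.statusCode || (err.meta && err.meta.statusCode);
    if (status === 409) {
      return false; // duplicate title, nothing inserted
    }
    throw err; // any other failure is a real error
  }
}
```

With this, the nightly parser can call insertIfNew(client, manual) for every scraped item and simply ignore the ones that return false.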

If you don't supply an id, Elasticsearch generates a unique one for you; if you do, the document is stored under the id you gave.

The payload should look like this:

{
  id: "your_unique_id",
  body: { foo: "bar" }
}
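For the second part of the question (cleaning up duplicates that are already in the index), one possible approach from Node.js is to group documents by title with a terms aggregation, keep one document per group and delete the rest. This is a rough sketch, not a tested recipe: duplicateIds and removeDuplicateTitles are hypothetical names, and it assumes title has a title.keyword sub-field (what default dynamic mapping creates for strings, which by default ignores values longer than 256 characters). It processes at most 1000 duplicated titles and 100 copies of each per call, so it may need to be run repeatedly until it returns 0.

```javascript
// Pure helper: given terms-aggregation buckets, each carrying a top_hits
// sub-aggregation named `docs`, keep the first hit of every bucket and
// return the ids of all the others.
function duplicateIds(buckets) {
  const ids = [];
  for (const bucket of buckets) {
    const hits = bucket.docs.hits.hits;
    for (const hit of hits.slice(1)) {
      ids.push(hit._id);
    }
  }
  return ids;
}

// `client` is an already-configured Elasticsearch client.
// Returns the number of documents deleted in this pass.
async function removeDuplicateTitles(client, index) {
  const response = await client.search({
    index,
    body: {
      size: 0, // only the aggregation is needed, not the search hits
      aggs: {
        by_title: {
          // assumption: `title.keyword` exists in the mapping
          terms: { field: 'title.keyword', min_doc_count: 2, size: 1000 },
          aggs: {
            docs: { top_hits: { size: 100, _source: false } }
          }
        }
      }
    }
  });
  // Some client versions wrap the result in `body`, others return it directly.
  const result = response.body || response;
  const ids = duplicateIds(result.aggregations.by_title.buckets);
  for (const id of ids) {
    await client.delete({ index, id });
  }
  return ids.length;
}
```

Which copy survives is arbitrary here (whichever hit top_hits returns first); add a sort inside top_hits if, say, the newest document should be kept.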
