Inserting an element into an Array in an embedded document in MongoDB

Question

I am quite new to MongoDB and am using it in my Java project.

I have the following document structure in my collection:

{
  "_id": "ProcessX",
  "tasks": [
    {
      "taskName": "TaskX",
      "taskTime": "2018-08-09T13:38:58.317Z",
      "crawledList": [
        "http://dbpedia.org/ontology/birthYear"
      ]
    },
    {
      "taskName": "TaskX",
      "taskTime": "2018-08-10T06:19:32.006Z",
      "crawledList": [
        "http://dbpedia.org/ontology/birthYear",
        "http://dbpedia.org/page/Mo_Chua_of_Balla"
      ]
    },
    {
      "taskName": "TaskY",
      "taskTime": "2018-08-10T06:21:58.737Z",
      "crawledList": [
        "http://dbpedia.org/page/Mo_Chua_of_Balla"
      ]
    }
  ]
}

I want to add a "newURI" to a task's crawledList if it does not already exist there. Here is the process:

1. Find the process document with _id = "someProcessName"
2. Find the task document, in the tasks array, with taskName = "someTaskName" and taskTime = "someTaskTime"
3. Check whether the "newURI" exists in the crawledList of that task document
4. If it does not exist, insert the newURI into the crawledList of the task document

I don't want to retrieve documents into memory and work with plain Java types (Lists etc.). Can you help me write the most efficient code using the MongoDB Java Driver?

I don't have any indexes defined because I don't know which indexes I should define. I can also change the document structure if there is a better way to represent them and do this operation faster.

Thank you in advance.

Answer 1

So in the end, by reading the Java Driver documentation and searching the web, I managed to achieve it with the following two functions:

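import static com.mongodb.client.model.Filters.and;
import static com.mongodb.client.model.Filters.elemMatch;
import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Filters.in;
import static com.mongodb.client.model.Updates.push;

// Returns true if the current task's crawledList already contains the IRI.
public boolean crawledBefore(IRI iri) {
    return collection.countDocuments(
            and(eq("_id", CrawlProcess.getProcessName()),
                elemMatch("tasks", and(eq("taskName", CrawlProcess.getTaskName()),
                                       eq("taskTime", CrawlProcess.getCreationTime()),
                                       in("crawledList", iri.toString()))))) != 0;
}

// Appends the IRI to the matching task's crawledList unless it is already there.
public void addToStore(IRI iri) {
    if (!crawledBefore(iri)) {
        collection.updateOne(
                and(eq("_id", CrawlProcess.getProcessName()),
                    elemMatch("tasks", and(eq("taskName", CrawlProcess.getTaskName()),
                                           eq("taskTime", CrawlProcess.getCreationTime())))),
                // "$" refers to the first tasks element matched by the filter above
                push("tasks.$.crawledList", iri.toString()));
    }
}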

Here is how it works:

The crawledBefore() function takes an IRI and checks whether any document exists that has that IRI in the crawledList array of a task document embedded in the process document. A process document with the given process name, task name and time always exists in my collection, so what I'm checking here is only whether the IRI is already in that task's crawledList.

If it is not, the second function, addToStore(), adds the new IRI to the crawledList of that specific task document inside the process document.
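Note that the check and the update above are two separate round trips, so another writer could add the same IRI in between them. One possible alternative, which is not part of the original answer, is to let the server do the check with $addToSet, which appends a value only when the array does not already contain it. The sketch below assumes that taskTime is stored as a String (as in the sample document in the question); the class name CrawlStore and the helper names taskFilter and addIri are hypothetical and only there for illustration.

```java
import static com.mongodb.client.model.Filters.and;
import static com.mongodb.client.model.Filters.elemMatch;
import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Updates.addToSet;

import com.mongodb.client.MongoCollection;
import com.mongodb.client.result.UpdateResult;
import org.bson.Document;
import org.bson.conversions.Bson;

// Hypothetical helper class; names and the String taskTime are assumptions.
public class CrawlStore {

    // Selects the process document and, via elemMatch, the task whose
    // taskName and taskTime both match. That task is what "$" refers to.
    static Bson taskFilter(String processName, String taskName, String taskTime) {
        return and(eq("_id", processName),
                   elemMatch("tasks", and(eq("taskName", taskName),
                                          eq("taskTime", taskTime))));
    }

    // $addToSet appends the value only if the array does not already contain it.
    static Bson addIri(String iri) {
        return addToSet("tasks.$.crawledList", iri);
    }

    // Returns true if the IRI was newly added, false if it was already present
    // or if no process/task matched the filter.
    static boolean addToStore(MongoCollection<Document> collection, String processName,
                              String taskName, String taskTime, String iri) {
        UpdateResult result = collection.updateOne(
                taskFilter(processName, taskName, taskTime), addIri(iri));
        return result.getModifiedCount() > 0;
    }
}
```

With this shape the lookup still goes through _id, which MongoDB indexes by default, and the "already crawled" test and the insert happen in a single updateOne call.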

Cheers.

1 answers

solution1
0 2018-08-23 12:02:02