
Creating Index in Casbah Issue

My program will do the following (using Casbah):

load2000DocsIntoMongo()
// Check whether an index named MY_INDEX_NAME is already present on the collection
val myIndexExists = collection.getIndexInfo().exists(x =>
  x.getAs[String]("name").getOrElse("") == MY_INDEX_NAME)
if (myIndexExists) println("index exists")
else {
  val start = System.nanoTime()
  collection.ensureIndex(MY_INDEX)
  // Divide by 1e9 so fractional seconds are not truncated to 0
  println((System.nanoTime() - start) / 1e9 + " seconds to index")
}

When I start mongod from scratch and then run my test, the index is created. After running the test, I check db.collection.getIndexes() in the shell to confirm that it exists.

However, after running my test once and then running db.collection.drop(), I re-ran the test. The test inserts the documents correctly, but it incorrectly reports that the index was created. I say this because, even though "X seconds to index" was printed, the Mongo shell's db.collection.getIndexes() shows that the index was not created.

Why isn't collection.ensureIndex(MY_INDEX) always creating the index if it doesn't exist?

EDIT

When an index is added via collection.ensureIndex(MY_INDEX), Casbah calls the Java library's method to create the index. That method records the index in a private map variable, _createdIndexes.

Because I had modified Mongo's indexes outside of the Java library (from the shell), the library did not know to update the _createdIndexes variable. As a result, when I tried to create the same index again, the library found it already present in _createdIndexes, i.e. its cache, and simply returned without creating anything.

To work around this issue, I call collection.dropIndexes(), which clears the _createdIndexes variable.
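
The behaviour described above can be reproduced with a small stand-alone model. This is only a sketch of the mechanism, not the driver's actual code: FakeServer, CachedCollection and StaleCacheDemo are hypothetical names, and the server is reduced to a set of index names.

```scala
import scala.collection.mutable

// Stand-in for the server: just the set of index names that really exist.
final class FakeServer {
  val indexes = mutable.Set.empty[String]
}

// Simplified model of a driver-side collection with an index cache.
// `createdIndexes` plays the role of the private _createdIndexes field.
final class CachedCollection(server: FakeServer) {
  private val createdIndexes = mutable.Set.empty[String]

  // Skips the server entirely when the cache already holds the name.
  def ensureIndex(name: String): Unit =
    if (!createdIndexes.contains(name)) {
      server.indexes += name
      createdIndexes += name
    }

  // Dropping through the driver clears the cache as well.
  def dropIndexes(): Unit = {
    server.indexes.clear()
    createdIndexes.clear()
  }
}

object StaleCacheDemo {
  // Returns whether the index exists on the server after each step.
  def run(): List[Boolean] = {
    val server = new FakeServer
    val coll = new CachedCollection(server)

    coll.ensureIndex("my_index")
    val afterFirst = server.indexes.contains("my_index")

    server.indexes.clear()       // simulates db.collection.drop() in the shell
    coll.ensureIndex("my_index") // cache hit: nothing reaches the server
    val afterShellDrop = server.indexes.contains("my_index")

    coll.dropIndexes()           // workaround: clears the cache too
    coll.ensureIndex("my_index")
    val afterWorkaround = server.indexes.contains("my_index")

    List(afterFirst, afterShellDrop, afterWorkaround)
  }
}
```

In this model StaleCacheDemo.run() yields List(true, false, true): the second ensureIndex is silently skipped because the cache is stale, and only after dropIndexes() does the call reach the server again.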

Casbah source - https://github.com/mongodb/casbah/blob/master/casbah-core/src/main/scala/MongoCollection.scala

Java source - https://github.com/mongodb/mongo-java-driver/blob/master/src/main/com/mongodb/DBCollection.java

Please see Ross's detailed answer for the full story.

It's not a bug per se; however, I agree this highlights an issue if you use the Casbah driver and the shell, or another driver, at the same time.

The cache in the underlying Java code doesn't know what you are doing in the shell, and it expects to be the only source of truth (other drivers also follow this pattern). The cache exists to aid performance, so that ensureIndex can be called repeatedly with little performance impact.

So the question is what is the best course of action in this scenario?

  1. Only use the Casbah driver to create and manage indexes - this is what ensureIndex relies on
  2. Only use the shell to create and manage indexes - the shell doesn't cache
  3. Don't trust the cache in Casbah code

You could call createIndex and bypass the cache altogether. There is a JIRA ticket on this, JAVA-667, and it looks like the cache is being removed in the next major release (3.0).
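
One way to act on option 3 is to verify the result of ensureIndex against the server's own index list and fall back to the uncached call only when needed. The helper below is a sketch under assumptions: IndexGuard and ensureIndexVerified are hypothetical names, and the three operations are passed in as functions so the snippet does not depend on a running mongod.

```scala
object IndexGuard {
  // Ensures an index, then confirms it against the server's own index list.
  // Returns true if the cache-bypassing fallback had to be used.
  def ensureIndexVerified(
      indexName: String,
      listIndexNames: () => Seq[String], // asks the server which indexes exist
      ensure: () => Unit,                // cached call, e.g. ensureIndex
      create: () => Unit                 // uncached call, e.g. createIndex
  ): Boolean = {
    ensure()
    if (listIndexNames().contains(indexName)) false
    else {
      // The cache claimed the index existed but the server disagrees.
      create()
      true
    }
  }
}
```

With Casbah, the wiring might look like IndexGuard.ensureIndexVerified(MY_INDEX_NAME, () => collection.getIndexInfo.flatMap(_.getAs[String]("name")), () => collection.ensureIndex(MY_INDEX), () => collection.createIndex(MY_INDEX)). This costs one extra round trip per call, so it trades away part of the performance benefit the cache was meant to provide.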
