How to fetch data from MongoDB in Scala

Question

I wrote the following code to fetch data from MongoDB:

import com.typesafe.config.ConfigFactory
import org.mongodb.scala.{ Document, MongoClient, MongoCollection, MongoDatabase }

import scala.concurrent.ExecutionContext

object MongoService extends Service {
  val conf = ConfigFactory.load()
  implicit val mongoService: MongoClient = MongoClient(conf.getString("mongo.url"))
  implicit val mongoDB: MongoDatabase = mongoService.getDatabase(conf.getString("mongo.db"))
  implicit val ec: ExecutionContext = ExecutionContext.global

  def getAllDocumentsFromCollection(collection: String) = {
    mongoDB.getCollection(collection).find()
  }
}

But when I call getAllDocumentsFromCollection, I don't get the individual documents for further manipulation. Instead I get:

FindObservable(com.mongodb.async.client.FindIterableImpl@23555cf5)

UPDATED:

object MongoService {
  // My settings (see available connection options)
  val mongoUri = "mongodb://localhost:27017/smsto?authMode=scram-sha1"

  import ExecutionContext.Implicits.global // use any appropriate context

  // Connect to the database: Must be done only once per application
  val driver = MongoDriver()
  val parsedUri = MongoConnection.parseURI(mongoUri)
  val connection = parsedUri.map(driver.connection(_))

  // Database and collections: Get references
  val futureConnection = Future.fromTry(connection)
  def db1: Future[DefaultDB] = futureConnection.flatMap(_.database("smsto"))
  def personCollection = db1.map(_.collection("person"))

  // Write Documents: insert or update

  implicit def personWriter: BSONDocumentWriter[Person] = Macros.writer[Person]
  // or provide a custom one

  def createPerson(person: Person): Future[Unit] =
        personCollection.flatMap(_.insert(person).map(_ => {})) // use personWriter
  def getAll(collection: String) =
    db1.map(_.collection(collection))

  // Custom persistent types
  case class Person(firstName: String, lastName: String, age: Int)
}

I also tried ReactiveMongo with the code above, but I couldn't make it work for getAll, and I get the following error in createPerson. Please suggest how I can get all the data from a collection.

Answer 1

This is likely too late for the OP, but hopefully the following methods of retrieving and iterating over collections using the mongo-scala driver prove useful to others.

The Asynchronous Way - Iterating over documents asynchronously means you don't have to hold an entire collection in memory, which can become unreasonable for large collections. However, the documents are only available inside the subscribe callbacks, so you can't reuse them outside that block. I'd recommend doing things asynchronously if you can, since this is how the mongo-scala driver is intended to be used.

db.getCollection(collectionName).find().subscribe(
    (doc: org.mongodb.scala.bson.Document) => {
        // operate on an individual document here
    },
    (e: Throwable) => {
        // do something with errors here, if desired
    },
    () => {
        // this signifies that you've reached the end of your collection
    }
)

The "Synchronous" Way - This is a pattern I use when my use case calls for a synchronous solution and I'm working with smaller collections or result sets. It still uses the asynchronous mongo-scala driver, but it returns a list of documents and blocks downstream code execution until all documents have been returned. How you handle errors and timeouts may depend on your use case.

import org.mongodb.scala._
import org.mongodb.scala.bson.Document
import org.mongodb.scala.model.Filters
import scala.collection.mutable.ListBuffer
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicReference

/* This function optionally takes filters if you do not wish to return the entire collection.
 * You could extend it to take other optional query params, such as org.mongodb.scala.model.{Sorts, Projections, Aggregates}
 */
def getDocsSync(db: MongoDatabase, collectionName: String, filters: Option[conversions.Bson]): ListBuffer[Document] = {
    val docs = ListBuffer[Document]()
    val done = new CountDownLatch(1) // released on completion or on error
    val failure = new AtomicReference[Option[Throwable]](None)
    val query = if (filters.isDefined) {
        db.getCollection(collectionName).find(filters.get)
    } else {
        db.getCollection(collectionName).find()
    }
    query.subscribe(
        (doc: Document) => docs.append(doc), // add doc to mutable list
        // remember the error; throwing inside the callback would not reach the caller
        (e: Throwable) => { failure.set(Some(e)); done.countDown() },
        () => done.countDown()
    )
    done.await() // wait here until all docs have been returned or the query failed
    failure.get().foreach(e => throw e) // rethrow on the calling thread
    docs
}

// sample usage of 'synchronous' method
val client: MongoClient = MongoClient(uriString)
val db: MongoDatabase = client.getDatabase(dbName)
val allDocs = getDocsSync(db, "myCollection", Option.empty)
val someDocs = getDocsSync(db, "myCollection", Option(Filters.eq("fieldName", "foo")))
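
The blocking pattern above can also be written without a polling loop by completing a Promise from the callbacks and waiting on its Future with a timeout. The following is only a sketch under stated assumptions: collectBlocking is a hypothetical helper (it is not part of the driver), it uses the Scala standard library only, and the data source in the sample is an in-memory stand-in rather than a real MongoDB query.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

// Hypothetical helper (not part of any driver): collects every item delivered
// through onNext / onError / onComplete callbacks and blocks until the source
// finishes, fails, or the timeout expires.
def collectBlocking[T](timeout: FiniteDuration)(
    subscribe: (T => Unit, Throwable => Unit, () => Unit) => Unit
): List[T] = {
  val buffer = ListBuffer.empty[T]
  val done = Promise[List[T]]()
  subscribe(
    (item: T) => buffer.synchronized { buffer += item; () },
    (e: Throwable) => { done.tryFailure(e); () },                           // surface errors to the caller
    () => { done.trySuccess(buffer.synchronized(buffer.toList)); () }       // end of results
  )
  // Rethrows a failure from the source, or throws a TimeoutException
  // if the source never completes within the timeout.
  Await.result(done.future, timeout)
}

// In-memory stand-in for a query, used only to show the calling convention.
val sample = collectBlocking[Int](1.second) { (onNext, onError, onComplete) =>
  Seq(1, 2, 3).foreach(onNext)
  onComplete()
}
// sample == List(1, 2, 3)

// Assumption, untested here: with the driver, the query from getDocsSync
// could be plugged in as
//   collectBlocking[Document](30.seconds)((n, e, c) => query.subscribe(n, e, c))
```

Compared with the sleep loop, this variant propagates query errors to the caller and cannot wait forever. The driver versions I am aware of also offer a toFuture() helper on observables that serves the same purpose, so it is worth checking the driver documentation before hand-rolling this.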

Answer 1
0 2019-10-18 15:26:12
