
How to share a database created by MongoDB?

Our current Python pipeline scrapes data from the web and stores it in MongoDB. After that we load the data into an analysis algorithm. This works well on a local computer, since mongod can locate the database there, but I want to upload the database to a sharing platform like Google Drive so that other users can use the data without having to run the scraper again.

I know that MongoDB stores data at /data/db by default, so could I upload the entire /data/db directory to Google Drive?

Another option seems to be exporting the MongoDB data to JSON or CSV, but our current implementation of the analysis algorithm already loads directly from MongoDB.

Yes, you can upload the /data directory; that is one way to back up the database (stop mongod first so the files are copied in a consistent state). You can also use mongodump with --gzip, or mongoexport as you pointed out yourself.

If you want to back up regularly, you can cp/rsync the /data directory on a regular basis. You can also write a bash script around mongodump/mongorestore or mongoexport/mongoimport to back up the database regularly, or use MongoLab as recommended by other answers.
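Since the pipeline in the question is written in Python, the scripted backup could also be sketched in Python rather than bash. This is only a minimal sketch: the function names, the `backups` directory and the timestamp format are assumptions for illustration, and the helpers only build the command and pick old dumps to delete; they do not run anything themselves.

```python
from datetime import datetime
from pathlib import Path


def dump_command(backup_root, now):
    """Build a mongodump argv that writes into a timestamped sub-directory."""
    # Hypothetical layout: <backup_root>/YYYYmmdd-HHMMSS
    target = Path(backup_root) / now.strftime("%Y%m%d-%H%M%S")
    return ["mongodump", "--gzip", "--out", str(target)]


def stale_backups(backup_dirs, keep):
    """Return the oldest directory names, leaving the newest `keep` untouched.

    Relies on the timestamp naming above, which sorts chronologically.
    """
    ordered = sorted(backup_dirs)
    return ordered[:-keep] if keep > 0 else ordered


# Typical use from a cron job (not executed here):
#   import subprocess
#   subprocess.run(dump_command("backups", datetime.now()), check=True)
```

Keeping the command construction separate from the `subprocess.run` call makes the script easy to check without a running mongod.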

So you have three options:

  1. Using mongodump and mongorestore
  2. Copying the underlying content of the /data directory (cp/rsync this directory if you want regular backups)
  3. Using mongoexport and mongoimport (not recommended for full production backups; see below)

Using mongodump and mongorestore

In version 3.x you simply run the following, which dumps the default MongoDB instance on the default port:

mongodump

In earlier versions you need to specify --dbpath.

The mongodump command above creates a dump directory, inside which it creates one sub-directory for each database in the MongoDB instance. If you want to dump a specific collection (here named "collection"), something like the following is useful:

mongodump  --db test --collection collection

You can also use --gzip and other similar command-line options. For more details and the full list of options, see the mongodump documentation.
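For sharing through something like Google Drive, a single compressed file is often easier to upload than a dump directory. A minimal sketch, assuming MongoDB tools 3.2 or later (where mongodump accepts --archive together with --gzip); the function name and the file name `scrape.archive.gz` are hypothetical:

```python
def archive_dump_command(archive_path, db=None, collection=None):
    """Build a mongodump argv that writes one gzip-compressed archive file."""
    cmd = ["mongodump", "--gzip", f"--archive={archive_path}"]
    if collection and not db:
        # mongodump needs a database name whenever a collection is given.
        raise ValueError("collection requires db")
    if db:
        cmd += ["--db", db]
    if collection:
        cmd += ["--collection", collection]
    return cmd


# Example (not executed here):
#   import subprocess
#   subprocess.run(archive_dump_command("scrape.archive.gz", db="test"), check=True)
```

The resulting file can be uploaded as is; whoever downloads it restores it with mongorestore using the same --gzip and --archive options.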

You can restore a dumped database with mongorestore:

mongorestore --dir <path> 

Just as with mongodump, you can specify a hostname, a port number (if different from the default), a database name and so on; see the mongorestore documentation for more information.
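As an illustration of those options, here is a small sketch of how a collaborator might assemble the restore command in Python. The helper name and its defaults are assumptions; --host, --port, --gzip, --drop and --dir are standard mongorestore options. Note that a dump created with --gzip also has to be restored with --gzip.

```python
def restore_command(path, host="localhost", port=27017, gzip=False, drop=False):
    """Build a mongorestore argv for a dump directory at `path`."""
    cmd = ["mongorestore", "--host", host, "--port", str(port)]
    if gzip:
        cmd.append("--gzip")   # needed when the dump was written with --gzip
    if drop:
        cmd.append("--drop")   # drop each collection before restoring it
    cmd += ["--dir", str(path)]
    return cmd


# Example (not executed here):
#   import subprocess
#   subprocess.run(restore_command("dump", gzip=True), check=True)
```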

Using mongoexport and mongoimport

These tools export and import collections in JSON or CSV format. They are not recommended for full backups of a production database, because these formats do not preserve all BSON data types. To export, run a command like the one below with the database and collection you want to back up. The output defaults to JSON; to export to CSV instead, add --type=csv (which also requires listing the fields with --fields).

mongoexport --db threads --collection messages --out messages.json

You can import a backed-up collection into MongoDB with mongoimport as follows:

mongoimport --db threads --collection messages --file messages.json

See the mongoexport documentation for more options, especially if you want to export the result of a query.
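Exporting the result of a query uses mongoexport's --query option, which takes the filter as a JSON document. A minimal sketch that builds such a command from Python; the helper name and the `author` field in the example are hypothetical, while --query, --type=csv and --fields are standard mongoexport options:

```python
import json


def export_command(db, collection, out, query=None, csv_fields=None):
    """Build a mongoexport argv, optionally filtered and/or in CSV format."""
    cmd = ["mongoexport", "--db", db, "--collection", collection, "--out", out]
    if query is not None:
        # json.dumps quotes field names, which mongoexport expects.
        cmd += ["--query", json.dumps(query)]
    if csv_fields:
        # CSV export needs an explicit list of fields.
        cmd += ["--type=csv", "--fields", ",".join(csv_fields)]
    return cmd


# Example (not executed here):
#   import subprocess
#   subprocess.run(export_command("threads", "messages", "alice.json",
#                                 query={"author": "alice"}), check=True)
```

Passing the command as a list to `subprocess.run` avoids shell-quoting problems with the JSON filter.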

You can create a small REST API for your database with unique keys, so that everyone on your team can use it.

If you only need to export once, just export it to JSON and you are done.
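To give a rough idea of what such an API could look like, here is a sketch using only Python's standard library. Everything specific in it is an assumption for illustration: the `X-API-Key` header, the key values, the URL layout, and `fetch_documents`, which is a stand-in returning sample data where a real service would query MongoDB.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

API_KEYS = {"team-key-1"}  # hypothetical keys, e.g. one per teammate


def fetch_documents(collection):
    """Stand-in for a MongoDB query; a real service would look this up in the database."""
    sample = {"messages": [{"_id": 1, "text": "hello"}]}
    return sample.get(collection)


def handle_get(path, api_key):
    """Return (status, payload) for GET /<collection>."""
    if api_key not in API_KEYS:
        return 401, {"error": "invalid API key"}
    docs = fetch_documents(path.strip("/"))
    if docs is None:
        return 404, {"error": "unknown collection"}
    return 200, docs


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, payload = handle_get(self.path, self.headers.get("X-API-Key"))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("", port), Handler).serve_forever()
```

Calling `serve()` starts the server, and a teammate would then request `/messages` with their key in the `X-API-Key` header. The request logic lives in `handle_get` so it can be checked without opening a socket.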

You could run a MongoDB instance in the cloud. For example, you could use MongoLab ( https://mongolab.com/ ), or install your own instance on a VM with one of the cloud providers such as Microsoft Azure, Amazon AWS or Google Compute Engine. Alternatively, you could create a REST API as proposed by JRazor; however, this will require more development work.
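With a shared cloud instance, the analysis code can keep loading directly from MongoDB; only the connection string changes. A minimal sketch, assuming the pipeline uses PyMongo and that the address is passed through an environment variable (the name `MONGO_URI` and the helper names are hypothetical):

```python
import os

DEFAULT_URI = "mongodb://localhost:27017"  # local mongod on the default port


def mongo_uri(env=os.environ):
    """Pick the connection string: MONGO_URI if set, else the local default."""
    return env.get("MONGO_URI", DEFAULT_URI)


def open_database(name, env=os.environ):
    """Connect with PyMongo and return the named database."""
    from pymongo import MongoClient  # third-party: pip install pymongo
    return MongoClient(mongo_uri(env))[name]
```

Each collaborator then sets `MONGO_URI` to the shared instance's address, while the scraper's author can keep running against the local mongod without changing any code.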
