
Is it a good idea to use the date as _id when storing time-series data?

I'm new to MongoDB. I'm writing a Python script to scrape and update stock quote data. The script will run once to scrape and build the database up to the latest date, and then run every day to update it.

After some research, I think MongoDB fits the bill. Currently, I'm using the date as '_id' because I want to ensure uniqueness (the update also scrapes from a page that contains data from previous days).

Is this a potentially disastrous idea? If so, what should I do instead? Thanks.

No, it's not a good idea, because by default MongoDB already stores a creation timestamp in the auto-generated _id.

You can retrieve the date from the _id using this code:

date = new Date( parseInt( _id.toString().substring(0,8), 16 ) * 1000 )
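Since the question is about a Python script, the same decoding can be sketched in plain Python. This is a minimal illustration, assuming the _id is available as its 24-character hex string; the helper name objectid_timestamp is hypothetical and not part of any MongoDB driver. Note that the embedded value is the time the ObjectId was generated, not the trading date of the quote.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Return the creation time embedded in a 24-character ObjectId hex string.

    Hypothetical helper for illustration; it mirrors the JavaScript one-liner above.
    """
    if len(oid_hex) != 24:
        raise ValueError("expected a 24-character hex string")
    # The first 4 bytes (8 hex digits) hold seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Example with an arbitrary ObjectId hex string.
print(objectid_timestamp("507f1f77bcf86cd799439011"))
```

The result is a timezone-aware datetime in UTC, so it can be compared directly with other aware datetimes.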

I'd use the auto-generated MongoDB _id.

EDIT (brought over from the comments): If you are using PyMongo, the ObjectId Python object has the attribute generation_time, which gives you the corresponding datetime.datetime instance (see the PyMongo API docs).

>>> from bson.objectid import ObjectId
>>> ObjectId().generation_time

