
Storing IoT data in MongoDB

I am currently streaming IoT data into MongoDB, which runs in a Docker container hosted on AWS. I receive a couple of thousand data points per day.

I will use this data for some intensive data analysis and ML, which will run on a day-to-day basis.

So is this how big data is normally stored? What are the industry standards and best practices?

It depends on a lot of factors, for example the type of data you are analyzing, how much data you have, and how quickly you need it.

  • For applications such as user behavior analysis, a relational database works best.
  • If the data fits into a spreadsheet, it is better suited to a SQL-type database such as Postgres or BigQuery, because relational databases are good at analyzing data in rows and columns.
  • For semi-structured data (think social media, text, or geographical data that requires a lot of text mining or image processing), a NoSQL database such as MongoDB or CouchDB works best.
  • Relational databases also have the advantage that you can query them with SQL. SQL is well known among data analysts and engineers and is easier to learn than most programming languages.
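To make the "rows and columns" point concrete, here is a minimal sketch of a SQL aggregation over sensor readings. It uses Python's built-in `sqlite3` module with an in-memory database purely so it runs without a server; the table name `readings`, its columns, and the sample values are made up for illustration and are not from the question.

```python
import sqlite3

# In-memory database so the sketch runs without any server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (device_id TEXT, day TEXT, temperature REAL)"
)

# Hypothetical sample data: one row per reading.
rows = [
    ("sensor-a", "2024-01-01", 20.0),
    ("sensor-a", "2024-01-01", 22.0),
    ("sensor-a", "2024-01-02", 19.0),
    ("sensor-b", "2024-01-01", 30.0),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)


def daily_averages(connection):
    """Return (device_id, day, average temperature) per device per day."""
    query = """
        SELECT device_id, day, AVG(temperature)
        FROM readings
        GROUP BY device_id, day
        ORDER BY device_id, day
    """
    return connection.execute(query).fetchall()


result = daily_averages(conn)
for device_id, day, avg_temp in result:
    print(device_id, day, avg_temp)
```

The same `GROUP BY` query would carry over to Postgres or BigQuery with little or no change, which is part of the appeal of SQL for day-to-day analysis.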

Databases commonly used in the industry to store big data are:

  • Relational database management systems: The storage engine typically uses a B-tree structure to organize both the index and the data, so reads and writes take logarithmic time.
  • MongoDB: Use this if you need to denormalize tables. It is a good fit when you want each document to hold all of its related nested structures, which helps keep the data consistent.
  • Cassandra: This database is well suited to fast writes and to queries that are known up front. Read performance is somewhat lower, so it suits write-heavy workloads such as time-series data. Cassandra's storage engine uses a log-structured merge-tree (LSM tree).
  • Apache HBase: Its storage format is similar to Cassandra's, and its performance characteristics are similar as well.
  • OpenTSDB: A good fit for IoT use cases where thousands of data points arrive within seconds and the collected data feeds dashboards.
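The merge-tree storage format mentioned for Cassandra above (a log-structured merge-tree) is what makes writes cheap and reads a bit slower. The toy below is only a sketch of that idea, not Cassandra's actual implementation: the class `ToyLSM` and its `memtable_limit` parameter are invented for illustration, and it leaves out the commit log, compaction, and Bloom filters that real engines rely on.

```python
import bisect


class ToyLSM:
    """Toy log-structured merge store: an in-memory buffer plus sorted runs."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit
        self.memtable = {}  # recent writes, held in memory
        self.runs = []      # flushed runs, oldest first; each is a sorted list

    def put(self, key, value):
        # A write only touches the in-memory buffer, which keeps it cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the buffer into an immutable sorted run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # A read may have to check the buffer and then every run,
        # newest first, which is why reads cost more than writes.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


store = ToyLSM(memtable_limit=2)
store.put("t1", 20.0)
store.put("t2", 21.0)  # buffer is full, so it is flushed as run 1
store.put("t1", 25.0)  # a newer value for t1
store.put("t3", 22.0)  # flushed as run 2
store.put("t4", 23.0)  # stays in the buffer

print(store.get("t1"))  # newest value wins
print(store.get("t2"))  # found only in the older run
```

With time-series data the workload is mostly appends, so the cheap write path matters more than the extra work on reads.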

Hope it helps.

