What is the best way to store and export clickstream data in near real time?

Let us say I have a website that gets a lot of hits. I need to store the click data in a database so that it can be used for reporting and monitoring. The click data will contain information such as who is referring users to the site, where users are coming from, and what time they arrive.

Is there a way to store and then analyze this data in, say, 10-minute intervals, so that I can get an overview of how the site is performing every 10 minutes? What type of database is best suited for this purpose, and what kind of analysis tools can quickly generate meaningful information from this data? One option I am considering for analysis is some variation of map-reduce to run the queries on this data.
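To make the 10-minute rollup concrete, here is a minimal in-memory sketch in Python. It is independent of any particular database; the event fields `time` and `referrer` and the function names are assumptions made up for illustration. Each click is assigned to the start of its 10-minute window and counted per referrer, which is essentially the map and reduce steps mentioned above.

```python
from collections import Counter
from datetime import datetime, timezone

WINDOW_SECONDS = 600  # 10-minute reporting interval


def window_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 10-minute window (UTC)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % WINDOW_SECONDS, tz=timezone.utc)


def rollup(clicks):
    """Count clicks per (window start, referrer) pair."""
    counts = Counter()
    for click in clicks:
        # "Map": derive the grouping key; "reduce": sum the occurrences.
        counts[(window_start(click["time"]), click["referrer"])] += 1
    return counts


# Hypothetical sample events; a real system would read these from a log or queue.
clicks = [
    {"time": datetime(2024, 1, 1, 12, 3, tzinfo=timezone.utc), "referrer": "example.com"},
    {"time": datetime(2024, 1, 1, 12, 9, 59, tzinfo=timezone.utc), "referrer": "example.com"},
    {"time": datetime(2024, 1, 1, 12, 10, tzinfo=timezone.utc), "referrer": "example.org"},
]

for (start, referrer), n in sorted(rollup(clicks).items()):
    print(start.isoformat(), referrer, n)
```

The first two events fall into the 12:00 window and the third into the 12:10 window. The same grouping key (window start plus whichever dimensions matter) carries over whether the counting is done in a batch job or in a store that aggregates on write.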

Although I haven't tried it yet, OpenTSDB looks promising.

Quote:

OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on top of HBase. OpenTSDB was written to address a common need: store, index and serve metrics collected from computer systems (network gear, operating systems, applications) at a large scale, and make this data easily accessible and graphable.
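As a rough idea of what click counts might look like as OpenTSDB data points, the sketch below formats a line in OpenTSDB's telnet-style `put` format (`put <metric> <timestamp> <value> <tagk=tagv> ...`). The metric name `site.clicks`, the `referrer` tag, and the helper function are hypothetical choices for illustration, not something I have run against a live cluster.

```python
def opentsdb_put_line(metric, timestamp, value, tags):
    """Format one data point as an OpenTSDB telnet-style `put` line.

    `timestamp` is Unix time in seconds; `tags` is a dict of tag name to value.
    """
    if not tags:
        # OpenTSDB expects every data point to carry at least one tag.
        raise ValueError("at least one tag is required")
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {int(timestamp)} {value} {tag_str}"


# Hypothetical example: 2 clicks from example.com in the window starting
# at 2024-01-01 12:00:00 UTC (Unix time 1704110400).
line = opentsdb_put_line("site.clicks", 1704110400, 2, {"referrer": "example.com"})
print(line)
```

Writing one pre-aggregated point per 10-minute window and referrer keeps the number of points small; whether to send raw events or rollups is a design choice that depends on how much detail the reports need.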
