
hadoop hive count concurrency

How can I implement this in Hadoop?

In Hive, I have a table with many columns, two of which are begin_time and end_time.

I need to count how many visitors are present at each point in time.

A sample of the table looks like this:

begin_time                  end_time
2011.04.26 10:19:06^A2011.04.26 10:20:22
2011.04.26 10:19:08^A2011.04.26 10:21:49
2011.04.26 10:19:08^A2011.04.26 11:18:46
2011.04.26 10:19:09^A2011.04.26 12:08:36
2011.04.26 10:19:09^A2011.04.26 11:00:16
2011.04.26 10:19:11^A2011.04.26 10:19:17
2011.04.26 10:19:12^A2011.04.26 10:46:21
2011.04.26 10:19:13^A2011.04.26 10:55:43
2011.04.26 10:19:17^A2011.04.26 10:19:41
2011.04.26 10:19:18^A2011.04.26 10:34:41

The result I want is how many people are present at a specific time.

For example, at 2011.04.26 10:19:08 there are 3 visitors: one who arrived at 10:19:06 and two who arrived at 10:19:08.

At 2011.04.26 10:19:18 the count is 9: ten visitors have arrived by then, but one left at 2011.04.26 10:19:17.

The desired result for this sample is:

2011.04.26 10:19:06 1
2011.04.26 10:19:08 3
2011.04.26 10:19:09 5
2011.04.26 10:19:11 6
2011.04.26 10:19:12 7
2011.04.26 10:19:13 8
2011.04.26 10:19:17 9
2011.04.26 10:19:18 9
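
To make the counting rule explicit, here is a minimal Python sketch of the definition the example above seems to imply: a row counts at time t when begin_time <= t <= end_time, with both ends inclusive. That inclusive reading is an assumption inferred from the 10:19:17 line, and the names ROWS, active_at and concurrency_by_begin_time are made up for illustration. It is a brute-force check on the sample, not something meant to run on a full Hive table.

```python
# Sample rows from the question: (begin_time, end_time).
# The fixed-width "YYYY.MM.DD HH:MM:SS" format sorts correctly as plain strings.
ROWS = [
    ("2011.04.26 10:19:06", "2011.04.26 10:20:22"),
    ("2011.04.26 10:19:08", "2011.04.26 10:21:49"),
    ("2011.04.26 10:19:08", "2011.04.26 11:18:46"),
    ("2011.04.26 10:19:09", "2011.04.26 12:08:36"),
    ("2011.04.26 10:19:09", "2011.04.26 11:00:16"),
    ("2011.04.26 10:19:11", "2011.04.26 10:19:17"),
    ("2011.04.26 10:19:12", "2011.04.26 10:46:21"),
    ("2011.04.26 10:19:13", "2011.04.26 10:55:43"),
    ("2011.04.26 10:19:17", "2011.04.26 10:19:41"),
    ("2011.04.26 10:19:18", "2011.04.26 10:34:41"),
]


def active_at(rows, t):
    """Count rows whose [begin_time, end_time] interval contains t (ends inclusive)."""
    return sum(1 for begin, end in rows if begin <= t <= end)


def concurrency_by_begin_time(rows):
    """Return (time, count) pairs for every distinct begin_time, in time order."""
    times = sorted({begin for begin, _ in rows})
    return [(t, active_at(rows, t)) for t in times]


if __name__ == "__main__":
    for t, n in concurrency_by_begin_time(ROWS):
        print(t, n)
```

Run on the ten sample rows, this reproduces the eight lines of the desired result listed above.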

Any help is much appreciated.

You can try this in Hive (assuming the table is named test_log):

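select /*+ MAPJOIN(driven) */ driven.time, count(*)
from
    -- driven: every distinct timestamp that appears as a begin_time or end_time
    (select time
     from
     (select begin_time time from test_log union all
      select end_time time from test_log) u
     group by time) driven
-- cross join: pair every candidate timestamp with every row
join test_log l on true
where
    -- keep only the rows that are active at that timestamp (both ends inclusive)
    driven.time between l.begin_time and l.end_time
group by driven.time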

It is probably not the best solution, but at least it works. You can add a filter to the driven subquery to reduce the data set.
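
The cross join compares every candidate timestamp with every row. As a rough illustration of the same counting rule with less work per timestamp, the Python sketch below sorts the begin and end times once and uses binary search. This is a different technique from the query, shown only to make the logic concrete; it assumes the data fits in memory and that timestamps are fixed-width strings that sort correctly as text. The function name concurrency and the three-row sample (taken from the question's data) are chosen for illustration.

```python
from bisect import bisect_left, bisect_right


def concurrency(rows):
    """For each distinct begin_time or end_time t, count rows with begin <= t <= end."""
    begins = sorted(begin for begin, _ in rows)
    ends = sorted(end for _, end in rows)
    # Same candidate set as the "driven" subquery: union of begin and end times.
    times = sorted({t for row in rows for t in row})
    result = []
    for t in times:
        started = bisect_right(begins, t)   # rows with begin_time <= t
        finished = bisect_left(ends, t)     # rows with end_time < t
        result.append((t, started - finished))
    return result


if __name__ == "__main__":
    sample = [
        ("2011.04.26 10:19:06", "2011.04.26 10:20:22"),
        ("2011.04.26 10:19:08", "2011.04.26 10:21:49"),
        ("2011.04.26 10:19:11", "2011.04.26 10:19:17"),
    ]
    for t, n in concurrency(sample):
        print(t, n)
```

Because the candidate timestamps include end_time values, the output has a line for each distinct end time as well as each begin time; the query above behaves the same way, so filter on begin_time if only arrival times are wanted.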
