Hive - calculating string type timestamp differences in minutes

I'm new to SQL (in Hive) and am trying to calculate, in minutes, the time each anonymousid spent between its first and last event. The source table stores the timestamp as a string, for example "2020-12-24T09:47:17.775Z". I've tried two approaches:

1- Cast the timestamp column to bigint and computed the difference directly on the main table.

select anonymousid, max(from_unixtime(cast('timestamp' as bigint)) - min(from_unixtime(cast('timestamp' as bigint)) from db1.formevent group by anonymousid

This returned NULLs.

2- Created a new table from the main one, filtered it with a 'where' clause, and tried to convert 'timestamp' to a date format without any min/max calculation.

create table db1.successtime as select anonymousid, pagepath,buttontype, itemname, 'location', cast(to_date(from_unixtime(unix_timestamp('timestamp', "yyyy-MM-dd'T'HH:mm:ss.SSS"),'HH:mm:ss') as date) from db1.formevent where pagepath = "/account/sign-up/" and itemname = "Success" and 'location' = "Standard"

I got NULLs again and gave up.

Is there a way to reformat the 'timestamp' values, calculate the time difference in minutes between the first and last event, and take the average grouped by 'location'?

From your description, this should work:

-- unix_timestamp() returns whole seconds, so the span is divided by 60 to get minutes.
-- The pattern is double-quoted so that the literals 'T' and 'Z' can be single-quoted inside it.
select anonymousid,
       (max(unix_timestamp(timestamp, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")) - 
        min(unix_timestamp(timestamp, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")) 
       ) / 60 as minutes_spent
from db1.formevent
group by anonymousid;

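Note that the column name is not in single quotes. In Hive, 'timestamp' in single quotes is a string literal rather than a reference to the column, so both of your attempts were converting the literal text and returned NULL.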
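The question also asks for the average grouped by location. One way to get there, as a sketch rather than a tested solution, is to compute the per-user span in a subquery and average it in the outer query. The aliases `minutes_spent`, `avg_minutes_spent` and `per_user` are names introduced here for illustration. The sketch assumes `location` is a column of db1.formevent; if one anonymousid appears under several locations, each (anonymousid, location) pair is treated as its own span. Backticks are used because `timestamp` and `location` are keywords in Hive; whether they are strictly required depends on the Hive version and settings.

```sql
-- Sketch only: average minutes between first and last event, per location.
-- Assumes `timestamp` holds strings like 2020-12-24T09:47:17.775Z.
select `location`,
       avg(minutes_spent) as avg_minutes_spent
from (
    -- Inner query: one row per (anonymousid, location) with its span in minutes.
    select anonymousid,
           `location`,
           (max(unix_timestamp(`timestamp`, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")) -
            min(unix_timestamp(`timestamp`, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"))
           ) / 60 as minutes_spent
    from db1.formevent
    group by anonymousid, `location`
) per_user
group by `location`;
```

Because unix_timestamp() returns whole seconds, the milliseconds in the source strings do not contribute to the result. If the output is still NULL, parsing a single sample value with the same pattern is a quick way to check whether the pattern matches the data.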
