Extract last N elements of an array in SQL (Hive)

I have a column containing arrays, and I want to extract the last X elements of each array.

Example, extracting the last two elements:

     Column A
     ['a', 'b', 'c']
     ['d', 'e']
     ['f', 'g', 'h', 'i']

Expected output:

     Column A
     ['b', 'c']
     ['d', 'e']
     ['h', 'i']

Ideally, this would be done without using a UDF.

One method is to reverse the array, explode it, filter by position, and re-assemble the array:

with your_table as (
select stack (4,
0, array(), --empty array to check that it works with no elements or fewer than n
1, array('a', 'b', 'c'),
2, array('d', 'e'),
3, array('f', 'g', 'h', 'i')
) as (id, col_A)
)

select s.id, collect_list(s.value) as col_A
from
(select s.id,
        reverse(a.value) as value, --undo the character-level reverse applied to the joined string
        a.pos
  from your_table s
       --join the array into one string, reverse it, split again: pos 0 is now the last element
       --(assumes the elements themselves contain no commas)
       lateral view outer posexplode(split(reverse(concat_ws(',',s.col_A)),',')) a as pos, value
where a.pos between 0 and 1 --last two (use n-1 instead of 1 if you want last n)
distribute by s.id sort by a.pos desc --keep original order
)s
group by s.id

Result:

s.id    col_a
0   []
1   ["b","c"]
2   ["d","e"]
3   ["h","i"]
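
The query above goes through a comma-joined string, so it relies on the elements not containing commas. A possible variant, offered as a sketch that has not been run against a specific Hive version, is to apply posexplode to the array directly and keep only the positions within the last n of size(col_A). The alias t and the explicit null check for empty arrays are my own additions; the sample data is the same as above.

```sql
with your_table as (
select stack (4,
0, array(), --empty array, same test data as above
1, array('a', 'b', 'c'),
2, array('d', 'e'),
3, array('f', 'g', 'h', 'i')
) as (id, col_A)
)

select s.id, collect_list(s.value) as col_A
from
(select t.id, a.value, a.pos
  from your_table t
       --explode the array itself; pos is the 0-based index of each element
       lateral view outer posexplode(t.col_A) a as pos, value
 where a.pos is null                --row emitted by OUTER for an empty array
    or a.pos >= size(t.col_A) - 2   --last two (use n instead of 2 if you want last n)
distribute by t.id sort by a.pos    --keep original order
) s
group by s.id
```

On the sample data this should give the same four rows as the result above. Because the elements never pass through a string, there is nothing to reverse back, and multi-character values or values containing commas should come through unchanged.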

A more elegant way is to use the brickhouse numeric_range UDF, as shown in this answer.
