
Selecting records from a Hive table where there are duplicates with a given criterion

Original post: 2019-01-13 14:10:26 3 2 hadoop/ select/ hive/ duplicates

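Table 1 has duplicate entries in column A with the same frequency values. I need to select one random record from each set of duplicates. If a duplicate entry has "unknown" as its column B value (as in record "d"), select one of the other rows instead. I need a select statement that satisfies these conditions. Thanks.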

2 Answers

These conditions can be prioritized using a case-style expression in the order by clause of a window function such as row_number.

select A,B,frequency,timekey
from (select t.*
            ,row_number() over(partition by A order by cast((B = 'unknown') as int), B) as rnum
      from tbl t
     ) t 
where rnum = 1

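Here, within each group of rows sharing the same A, rows where B is not 'unknown' sort first (the comparison B = 'unknown' casts to 0 for them and 1 otherwise), and rows are then ordered by their B value. Keeping rnum = 1 therefore returns an 'unknown' row only when the group has no other row.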
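The ordering described above can be checked locally without a Hive cluster. The following is a minimal sketch that runs the same query through Python's built-in `sqlite3` module as a stand-in for Hive. It assumes the bundled SQLite is version 3.25 or later (needed for window functions). The sample rows are hypothetical, since the original Table 1 is not reproduced here. The trailing `order by A` is added only to make the printed output stable.

```python
import sqlite3

# Hypothetical sample rows (A, B, frequency, timekey); not the asker's real data.
rows = [
    ("a", "x", 10, 1),
    ("d", "unknown", 5, 2),   # duplicate of "d" whose B is 'unknown'
    ("d", "y", 5, 3),         # duplicate of "d" with a real B value
    ("e", "unknown", 7, 4),   # only row for "e", so it must be kept
]

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (A text, B text, frequency int, timekey int)")
conn.executemany("insert into tbl values (?, ?, ?, ?)", rows)

# Same query as in the answer; SQLite stands in for Hive here.
query = """
select A, B, frequency, timekey
from (select t.*
            ,row_number() over(partition by A order by cast((B = 'unknown') as int), B) as rnum
      from tbl t
     ) t
where rnum = 1
order by A
"""
result = conn.execute(query).fetchall()
conn.close()

for row in result:
    print(row)
```

With this data, group "d" yields its non-'unknown' row, while group "e" keeps its single 'unknown' row because there is no alternative. Note that the choice within a group is deterministic (lowest B value), not random.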

Use the row_number analytic function. If you want to prefer records whose B is not 'unknown', use the following query:

select  A, B, Frequency, timekey
from
(select 
       A, B, Frequency, timekey,
       row_number() over(partition by A,Frequency order by case when B='unknown' then 1 else 0 end) rn
 from tbl  -- the subquery needs a source table; the name tbl is assumed, as in the first answer
)s where rn=1

If you want to select the 'unknown' record whenever one exists, use the following row_number in the query above instead:

row_number() over(partition by A,Frequency order by case when B='unknown' then 0 else 1 end) rn
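To see how swapping the two case values changes which row wins, here is a small sketch that runs both variants through Python's built-in `sqlite3` module as a stand-in for Hive (assuming SQLite 3.25 or later for window functions). The helper `pick_one_per_group`, the sample rows, and the table name `tbl` (borrowed from the first answer) are illustrative assumptions, not part of the original answer.

```python
import sqlite3

def pick_one_per_group(unknown_rank, other_rank):
    """Return one row per (A, Frequency) group.

    Rows with B = 'unknown' get sort key unknown_rank, all others other_rank;
    the row with the smallest key in each group is kept.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table tbl (A text, B text, Frequency int, timekey int)")
    # Hypothetical sample rows; each group has at most one row per sort key,
    # so the result does not depend on how ties are broken.
    conn.executemany(
        "insert into tbl values (?, ?, ?, ?)",
        [("a", "x", 10, 1), ("d", "unknown", 5, 2), ("d", "y", 5, 3)],
    )
    query = f"""
    select A, B, Frequency, timekey
    from
    (select
           A, B, Frequency, timekey,
           row_number() over(partition by A, Frequency
                             order by case when B='unknown' then {unknown_rank} else {other_rank} end) rn
     from tbl
    ) s where rn = 1
    order by A
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

prefer_known = pick_one_per_group(1, 0)    # 'unknown' sorts last
prefer_unknown = pick_one_per_group(0, 1)  # 'unknown' sorts first

print(prefer_known)
print(prefer_unknown)
```

When several rows in a group share the same sort key, row_number numbers them in an unspecified order, which is arbitrary rather than truly random. If a reproducible choice matters, adding a tie-breaking column to the order by (for example timekey) is one option.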

