
How to sum 2 log files in Pig

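I have a problem summing 2 log files.

Example files:

  1. File 1

    id user view

    1 AAA 2

    2 BBB 5

    3 CCC 9

  2. File 2

    id user view address

    1 AAA 5 XXX

    2 BBB 2 YYY

    6 FFF 4 ZZZ

I want to join the two files by id and sum the view values. This is the output I expect:

Output: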

id user view address
1  AAA  7    XXX
2  BBB  7    YYY

I tried the code below, which joins the two files, but it does not sum the views:

My code:

inputdata = LOAD '/user/hdfs/tes/part-1' AS (
    id:chararray, 
    user:chararray, 
    view:int
);


inputdata2 = LOAD '/user/hdfs/tes/part-2' AS (
    id:chararray, 
    user:chararray, 
    view:int,
    address:chararray
);


joined = JOIN inputdata BY id LEFT OUTER, inputdata2 by id;

outputlist = FOREACH joined {

        GENERATE
        inputdata::id, 
        inputdata::user, 
        --sum(inputdata2::view), 
        inputdata2::address;


}

dump outputlist;

My question is how to sum the view values across the two log files.

Thanks.

Join the two relations, then add the two view values together in a FOREACH over the join result.

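-- Load both files as space-delimited records.
A = LOAD 'file1.dat' USING PigStorage(' ') AS (a:chararray, b:chararray, c:int);
B = LOAD 'file2.dat' USING PigStorage(' ') AS (a:chararray, b:chararray, c:int, d:chararray);
-- Inner join on id: only ids present in both files are kept.
C = JOIN A BY a, B BY a;
-- Add the view counts from both sides of the join.
D = FOREACH C GENERATE A::a AS id, A::b AS user, A::c + B::c AS view, B::d AS address;
DUMP D;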

Output:

(1,AAA,7,XXX)
(2,BBB,7,YYY)
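
If the LEFT OUTER join from the question is kept instead of the inner join, rows from file 1 with no match in file 2 (id 3 in the sample data) get null for every file 2 field, and in Pig an addition with a null operand yields null. The following is a minimal sketch of one way to handle that, assuming the unmatched rows should simply keep their file 1 view count. That requirement is an assumption; the expected output in the question only lists ids present in both files.

```pig
-- Same loads as in the answer above.
A = LOAD 'file1.dat' USING PigStorage(' ') AS (a:chararray, b:chararray, c:int);
B = LOAD 'file2.dat' USING PigStorage(' ') AS (a:chararray, b:chararray, c:int, d:chararray);

-- Keep every row of A, matched or not.
C = JOIN A BY a LEFT OUTER, B BY a;

-- B::c is null when A has no partner row in B, so fall back to A::c alone
-- (assumption: unmatched rows should keep their file 1 count).
D = FOREACH C GENERATE
        A::a AS id,
        A::b AS user,
        (B::c IS NULL ? A::c : A::c + B::c) AS view,
        B::d AS address;

DUMP D;
```

With the sample files, this should return the two summed rows plus the row for id 3 with its original view count and an empty address. If only ids present in both files are wanted, the plain inner join is the simpler choice.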
