How can I compute the combination of two fields using a Pig script?
I have input like:
(1, (a, b, c))
(2, (e, f, g))
The output I expect is:
(1, a)
(1, b)
(1, c)
(2, e)
(2, f)
(2, g)
Maybe this will help:
A = LOAD 'data' AS (a:int, t1:tuple(t1a:chararray, t1b:chararray, t1c:chararray));
B = FOREACH A GENERATE a,t1.$0,t1.$1,t1.$2;
C = group B by a;
X = COGROUP C BY a, C BY $0;
DUMP X;
Can you try this?
A = LOAD 'input.txt' USING PigStorage() AS (f1:int,T:tuple(f2:chararray,f3:chararray,f4:chararray));
B = FOREACH A GENERATE f1,FLATTEN(TOBAG(T.f2,T.f3,T.f4));
DUMP B;
Step 1: Load the input file
1 a,b,c
2 e,f,g
as
crude_input = load '' USING PigStorage() AS (id:int, ip_tuple:tuple(val1:chararray, val2:chararray, val3:chararray));
dump crude_input;
(1,(a,b,c))
(2,(e,f,g))
Step 2: Flatten the tuple into top-level fields
crude_flatened = foreach crude_input GENERATE id, FLATTEN($1);
This will generate
(1,a,b,c)
(2,e,f,g)
Step 3: Put the three values in a bag and flatten it, which emits one row per value
output_data = foreach crude_flatened generate id, FLATTEN(TOBAG(ip_tuple::val1,ip_tuple::val2,ip_tuple::val3));
(1,a)
(1,b)
(1,c)
(2,e)
(2,f)
(2,g)
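To see why the steps above produce one row per value, here is a rough Python model of what FLATTEN does to a tuple versus a bag. It is only an illustration, not Pig code: the helper names `flatten_tuple` and `flatten_bag` are made up for this sketch, and the in-memory list `rows` stands in for the loaded relation. Pig bags are unordered, so the list order here is just a modeling convenience.

```python
# Hypothetical in-memory stand-in for the relation loaded in Step 1.
rows = [(1, ("a", "b", "c")), (2, ("e", "f", "g"))]


def flatten_tuple(row):
    """Model of FLATTEN on a tuple: its fields become top-level columns."""
    key, inner = row
    return (key, *inner)


def flatten_bag(key, bag):
    """Model of FLATTEN on a bag: one output row per element of the bag."""
    return [(key, item) for item in bag]


# Step 2: (1, (a, b, c)) becomes (1, a, b, c)
flattened = [flatten_tuple(r) for r in rows]

# Step 3: TOBAG wraps the three columns in a bag, and FLATTEN on that bag
# pairs the id with each value in turn.
output_data = [
    pair
    for key, *vals in flattened
    for pair in flatten_bag(key, vals)
]

for row in output_data:
    print(row)
```

The difference between the two helpers is the point: flattening a tuple widens a row, while flattening a bag multiplies rows, which is why the values are wrapped with TOBAG before the final FLATTEN.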