SQL-HIVE-PIG-MapReduce
Each line has 5 columns, separated by commas:
1st column is name
2nd column is date_of_purchase
3rd column is product
4th column is mode of payment
5th column is total_amount
I hope it is clear what the data contains. Here is a sample:
surender,2014-03-09,TV,OFFLINE,20000
surender,2014-01-01,Mobile,ONLINE,18000
Raja,2014-09-21,Laptop,ONLINE,30000
Surender,2014-10-12,Laptop,ONLINE,40000
Raja,2014-FEB-11,MusicSystem,ONLINE,2000
Kumar,2014-07-09,Ipod,OFFLINE,4000
Kumar,2014-06-08,TV,ONLINE,20000
Raja,2014-11-07,SPeakers,OFFLINE,8000
Kumar,2014-10-18,Laptop,ONLINE,30000
What I need is to see how much each person has spent in online mode and in offline mode.
Basically, the reducer output should look like this:
surender OFFLINE 20000
surender ONLINE 58000
Raja OFFLINE 8000
Raja ONLINE 32000
Kumar OFFLINE 4000
Kumar ONLINE 50000
And the final output should look like this (name, OFFLINE total, ONLINE total):
surender 20000 58000
Raja 8000 32000
Kumar 4000 50000
You can give me a Hive or Pig query, or a MapReduce program.
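A = LOAD 'file_name' USING PigStorage(',') AS (name:chararray, date:chararray, product:chararray, mode:chararray, total:long);
-- one group per (name, mode) pair
B = GROUP A BY (name, mode);
-- SUM needs the bag column A.total, not the bare field name
C = FOREACH B GENERATE group.name AS name, group.mode AS mode, SUM(A.total) AS total;
D = GROUP C BY name;
-- C.total is a bag holding each mode's total for the name (bag order is not guaranteed)
E = FOREACH D GENERATE group AS name, C.total;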
If your data has inconsistent spellings, as in the sample you provided (surender vs. Surender), convert the names to uppercase before grouping.
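Since the question also accepts a MapReduce program, here is a minimal sketch of the same logic in plain Python, run locally rather than on Hadoop. The function names `map_line`, `reduce_totals` and `pivot` are made up for illustration, and the sketch assumes every line has exactly five comma-separated fields and that the only payment modes are OFFLINE and ONLINE. Names are upper-cased before grouping, as suggested above, so the printed names are in uppercase rather than in the original spelling.

```python
from collections import defaultdict

# Sample rows copied from the question.
SAMPLE = """\
surender,2014-03-09,TV,OFFLINE,20000
surender,2014-01-01,Mobile,ONLINE,18000
Raja,2014-09-21,Laptop,ONLINE,30000
Surender,2014-10-12,Laptop,ONLINE,40000
Raja,2014-FEB-11,MusicSystem,ONLINE,2000
Kumar,2014-07-09,Ipod,OFFLINE,4000
Kumar,2014-06-08,TV,ONLINE,20000
Raja,2014-11-07,SPeakers,OFFLINE,8000
Kumar,2014-10-18,Laptop,ONLINE,30000
"""


def map_line(line):
    """Map step: turn one CSV line into ((name, mode), amount)."""
    name, _date, _product, mode, amount = line.strip().split(",")
    # Upper-case the name so "surender" and "Surender" share one key.
    return (name.upper(), mode), int(amount)


def reduce_totals(pairs):
    """Reduce step: sum the amounts for each (name, mode) key."""
    totals = defaultdict(int)
    for key, amount in pairs:
        totals[key] += amount
    return dict(totals)


def pivot(totals):
    """Final step: one row per name with an OFFLINE and an ONLINE total."""
    table = {}
    for (name, mode), amount in totals.items():
        # Assumed default: a mode with no purchases counts as 0.
        row = table.setdefault(name, {"OFFLINE": 0, "ONLINE": 0})
        row[mode] = amount
    return table


totals = reduce_totals(map_line(line) for line in SAMPLE.splitlines())
final = pivot(totals)
for name, row in final.items():
    print(name, row["OFFLINE"], row["ONLINE"])
# SURENDER 20000 58000
# RAJA 8000 32000
# KUMAR 4000 50000
```

The intermediate `totals` dictionary corresponds to the reducer output in the question, and `final` corresponds to the one-row-per-name output. On a real cluster the map and reduce steps would run as separate tasks, with the framework doing the grouping by key.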