MapReduce (Java): calculating the average for an array list
I have an assignment for MapReduce and I am very new to MapReduce programming. For each year and a specific city, I want to calculate the average, min, and max values. Here is my sample input:
Calgary,AB,2009-01-07,604680,12694,2.5207754,0.065721168,0.025668362,0.972051954,0.037000279,0.022319018,,,0.003641149,,,0.002936745,,,0.016723641 Calgary,AB,2009-12-30,604620,12694,2.051769654,0.060114973,0.034026918,1.503277516,0.054219005,0.023258217,,,0.00354166,,,0.003361414,,,0.122375131 Calgary,AB,2010-01-06,604680,12266,4.015745522,0.097792741,0.032738892,0.368454554,0.019228992,0.032882053,,,0.004778065,,,0.003190444,,,0.064203865 Calgary,AB,2010-01-13,604680,12551,3.006492921,0.09051656,0.041508534,0.215395047,0.012081755,0.023706119,,,0.004231772,,,0.003083003,,,0.155212503
I know how to find the city and year; I am using this code:
String line = value.toString();
String[] tokens = line.split(",");
String[] date = tokens[2].split("-");
String year = date[0];
String location = tokens[0];
Now I want to find these two numbers in each line (e.g. 2.5207754 and 0.065721168; not exactly these values, but all the numbers after the third and fourth comma) and compute an average, a min, and a max.
The output should look like this:
Calgary 2009 average: "" , min: "" , max: ""
Calgary 2010 average: "" , min: "" , max: ""
I tried to use this code to find the values in each line, but because the data is not laid out the same way in every line I got an error (in some places there is no data, or the value is longer than this length):
float number = 0;
float number2 = 0 ;
char a;
char c;
a = line.charAt(34);
c = line.charAt(44);
if (a == ',')
{
number = Float.parseFloat(line.substring(35, 44));
}
else
{
number = Float.parseFloat(line.substring(35, 46));
}
if (c == ',')
{
number2 = Float.parseFloat(line.substring(45, 56));
} else
{
number2 = Float.parseFloat(line.substring(47, 58));
}
Text numbers = new Text(number + " " + number2 + " ");
Then I tried this code, and like the one above it doesn't work:
String number = tokens[4];
String number2 = tokens[5];
So can you help me with this project?
Looking at your input, it seems your records are separated by spaces. You can first split on " " and then get the individual values and use them for the calculation:
String[] arr = line.split(" ");
for(String val : arr){
String[] dataArr = val.split(",");
String city = dataArr[0];
String date = dataArr[2];
String v1 = dataArr[5];
String v2 = dataArr[6];
System.out.println("city: "+city +" date: "+ date +" v1: "+ v1+"v2: "+ v2);
}
city: Calgary date: 2009-01-07 v1: 2.5207754v2: 0.065721168
city: Calgary date: 2009-12-30 v1: 2.051769654v2: 0.060114973
city: Calgary date: 2010-01-06 v1: 4.015745522v2: 0.097792741
city: Calgary date: 2010-01-13 v1: 3.006492921v2: 0.09051656
city: Calgary date: 2009-01-07 v1: 2.5207754v2: 0.065721168
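Once the fields are split out, the average, min, and max per city and year can be computed by grouping on a "city year" key. The following is only a minimal plain-Java sketch of that grouping step, not a Hadoop job: the class name `CityYearStats` and the helpers `accumulate` and `summarize` are made up for illustration, and it assumes field 0 is the city, field 2 is a `yyyy-MM-dd` date, and the caller picks which column to aggregate. Records whose value is empty are skipped, which is one possible way to handle the missing fields in the sample data. In a real job the key would roughly correspond to what the mapper emits, and the folding to what the reducer does.

```java
import java.util.DoubleSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;

class CityYearStats {

    // Hypothetical helper: folds one CSV record into the per-key statistics.
    // Assumes field 0 is the city, field 2 is a yyyy-MM-dd date, and
    // valueIndex is the numeric column to aggregate.
    static void accumulate(Map<String, DoubleSummaryStatistics> stats,
                           String record, int valueIndex) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        String[] tokens = record.split(",", -1);
        if (tokens.length <= Math.max(2, valueIndex)
                || tokens[valueIndex].trim().isEmpty()) {
            return; // skip records that are too short or have no value
        }
        String year = tokens[2].split("-")[0];
        String key = tokens[0] + " " + year;
        double value = Double.parseDouble(tokens[valueIndex].trim());
        stats.computeIfAbsent(key, k -> new DoubleSummaryStatistics()).accept(value);
    }

    // Groups all records by "city year" and collects count, sum, min, and max.
    static Map<String, DoubleSummaryStatistics> summarize(String[] records,
                                                          int valueIndex) {
        Map<String, DoubleSummaryStatistics> stats = new TreeMap<>();
        for (String record : records) {
            accumulate(stats, record, valueIndex);
        }
        return stats;
    }

    public static void main(String[] args) {
        // Shortened versions of the sample records from the question.
        String[] records = {
            "Calgary,AB,2009-01-07,604680,12694,2.5207754,0.065721168",
            "Calgary,AB,2009-12-30,604620,12694,2.051769654,0.060114973",
            "Calgary,AB,2010-01-06,604680,12266,4.015745522,0.097792741",
            "Calgary,AB,2010-01-13,604680,12551,3.006492921,0.09051656"
        };
        // Column 5 holds the first of the two values used in the answer above.
        summarize(records, 5).forEach((key, s) ->
            System.out.println(key + " average: " + s.getAverage()
                + ", min: " + s.getMin() + ", max: " + s.getMax()));
    }
}
```

Calling `summarize(records, 6)` would aggregate the second value instead. `DoubleSummaryStatistics` is used here so that min, max, and average come from a single pass without keeping every value in a list.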