简体   繁体   English

Hadoop:无法运行 MapReduce 作业(python)

[英]Hadoop: Unable to run MapReduce job (python)

I'm trying to run a simple Map-Reduce job using Hadoop, but I can't seem to get it to work.我正在尝试使用 Hadoop 运行一个简单的 Map-Reduce 作业,但我似乎无法让它工作。

Mapper code:映射器代码:

#!/usr/bin/python
import json
import sys

#can't really do anything in the reducer most of the time, there's no computations the reducer need to process
jsonString = ''

for line in sys.stdin:
    line = line.strip()
    try:
        convertedData = json.loads(line)

        #Sometimes we pass in an invalid title, so need to check if medianPay is empty
        if convertedData['success'] == True:
            lowPay = str(convertedData['response']['payLow'])
            #medianPay = str(convertedData['response']['payMedian'])
            highPay = str(convertedData['response']['payHigh'])

            #adding national to identify this data as national data
            if lowPay:
                print convertedData['response']['jobTitle'] + '\t' 'lowPay:'+ lowPay
                #print convertedData['response']['jobTitle'] + '\t' 'medianPay:'+ medianPay
                print convertedData['response']['jobTitle'] + '\t' 'highPay:'+ highPay
    except ValueError:
        pass

Reducer Code: #!/usr/bin/python from future import division import sys减速器代码:#!/usr/bin/python from future import Division import sys

curJobTitle = ''
lowSalary = 0;
highSalary = 0;

for line in sys.stdin:      
    line = line.strip()
    items = line.split('\t')        
    key, value = line.split('\t')   
    if key != curJobTitle:      
        if (curJobTitle != ''):
            percent_diff = ((highSalary - lowSalary)/((lowSalary + highSalary) / 2)) * 100
            print curJobTitle + ':' + str(abs(percent_diff))

        curJobTitle = key
        type, salary = value.split(':')     
        if type == 'lowPay':
            lowSalary = float(salary)
        elif type == 'highPay':
            highSalary = float(salary)
    else:
        type, salary = value.split(':')     
        if type == 'lowPay':
            lowSalary = float(salary)
        elif type == 'highPay':
            highSalary = float(salary)

if lowSalary != 0 and highSalary != 0:
    percent_diff = ((highSalary - lowSalary)/((lowSalary + highSalary) / 2)) * 100
    print curJobTitle + ':' + str(abs(percent_diff))

Terminal output:端子 output:

16/05/04 17:30:30 INFO mapreduce.Job:  map 0% reduce 0%
16/05/04 17:30:36 INFO mapreduce.Job: Task Id : attempt_1461978900532_0179_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "/hadoop/yarn/local/usercache/tsahn001/appcache/application_1461978900532_0179/container_e17_1461978900532_0179_01_000003/./NationalSalaryMapper.py": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
        at java.lang.ProcessImpl.start(ProcessImpl.java:130)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
        ... 24 more

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

16/05/04 17:30:36 INFO mapreduce.Job: Task Id : attempt_1461978900532_0179_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "/hadoop/yarn/local/usercache/tsahn001/appcache/application_1461978900532_0179/container_e17_1461978900532_0179_01_000002/./NationalSalaryMapper.py": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
        at java.lang.ProcessImpl.start(ProcessImpl.java:130)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
        ... 24 more

16/05/04 17:30:40 INFO mapreduce.Job: Task Id : attempt_1461978900532_0179_m_000001_1, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "/hadoop/yarn/local/usercache/tsahn001/appcache/application_1461978900532_0179/container_e17_1461978900532_0179_01_000004/./NationalSalaryMapper.py": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
        at java.lang.ProcessImpl.start(ProcessImpl.java:130)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
        ... 24 more

16/05/04 17:30:41 INFO mapreduce.Job: Task Id : attempt_1461978900532_0179_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "/hadoop/yarn/local/usercache/tsahn001/appcache/application_1461978900532_0179/container_e17_1461978900532_0179_01_000005/./NationalSalaryMapper.py": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
        at java.lang.ProcessImpl.start(ProcessImpl.java:130)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
        ... 24 more

16/05/04 17:30:44 INFO mapreduce.Job: Task Id : attempt_1461978900532_0179_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "/hadoop/yarn/local/usercache/tsahn001/appcache/application_1461978900532_0179/container_e17_1461978900532_0179_01_000006/./NationalSalaryMapper.py": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
        at java.lang.ProcessImpl.start(ProcessImpl.java:130)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
        ... 24 more

16/05/04 17:30:45 INFO mapreduce.Job: Task Id : attempt_1461978900532_0179_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222)
        at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
        ... 22 more
Caused by: java.io.IOException: Cannot run program "/hadoop/yarn/local/usercache/tsahn001/appcache/application_1461978900532_0179/container_e17_1461978900532_0179_01_000007/./NationalSalaryMapper.py": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
        ... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
        at java.lang.ProcessImpl.start(ProcessImpl.java:130)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
        ... 24 more

I have tested my MapReduction code locally using cat |我已经使用 cat | 在本地测试了我的 MapReduction 代码。 |, and it seems to be functioning correctly. |,并且它似乎运行正常。 However, whenever I try to run the MapReduce job on Hadoop, I always get an error (above is an example. I'm running my job with:但是,每当我尝试在 Hadoop 上运行 MapReduce 作业时,我总是会收到错误消息(上面是一个示例。我正在运行我的作业:

hadoop jar /usr/hdp/2.3.4.0-3485/hadoop-mapreduce/hadoop-streaming.jar \ -file /home/tsahn001/MapReduceCode/NationalSalaryMapper.py -mapper 'NationalSalaryMapper.py' \ -file /home/tsahn001/MapReduceCode/NationalSalaryReducer.py -reducer 'NationalSalaryReducer.py' \ -input data/glassdoor_national_data -output test_national

Is my map reduce not working because of how I'm trying to execute it?我的 map reduce 是否因为我试图执行它而无法工作? If so, how do I fix this?如果是这样,我该如何解决这个问题?

Instead of using print statements, your mapper should be emitting key value pairs to the reducer via sys.stdout. 您的映射器应该使用sys.stdout将键值对发送给reducer,而不是使用print语句。 For example, instead of: 例如,代替:

print convertedData['response']['jobTitle'] + '\t' 'lowPay:'+ lowPay

your mapper should emit to the reducer like this: 您的映射器应该像这样向reducer发出:

emit_string = convertedData['response']['jobTitle'] + '|' 'lowPay:'+ lowPay
sys.stdout.write("{0}\t1\n".format(emit_string)

Notice the tab (\\t) and the value of 1 in the emit. 注意选项卡(\\ t)和发射中的值1。 I also replaced the tab in your print statement with a pipe symbol, since the tab is what will be used by the reducer to separate the key (string) and value (count). 我还用管道符号替换了打印语句中的选项卡,因为减速器将使用该选项卡来分隔键(字符串)和值(计数)。

according to the error you output , it indicates that you have specified the wrong path for your python script , Caused by: java.io.IOException: error=2, No such file or directory check the path , and try again . 根据输出的错误,表明您为python脚本指定了错误的路径,原因:java.io.IOException:error = 2,没有此类文件或目录检查路径,然后重试。

good luck ! 祝好运 !

I am having the similar issue.我有类似的问题。 Anyone able to solve it?有谁能解决吗?

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM