
Hadoop Streaming Job - python stuck at map 0% reduce 0% in CDH4.5

I am running a Hadoop streaming job on the Cloudera distribution (CDH 4.5), but it does not advance beyond the map 0% stage. I am also not sure where to find the logs I should check. Pardon my naivety with Hadoop.

[amgen@sa-dpoc10 code]$ hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.5.0.jar -mapper /home/amgen/Amgen_UC1/code/mapper.py -file /home/amgen/Amgen_UC1/code/mapper.py -reducer /home/amgen/Amgen_UC1/code/reducer.py -file /home/amgen/Amgen_UC1/code/reducer.py -input /user/amgen/Amgen_UC1/input/Corpus_VoiceBase.txt -output /user/amgen/Amgen_UC1/output_t1
packageJobJar: [/home/amgen/Amgen_UC1/code/mapper.py,/home/amgen/Amgen_UC1/code/reducer.py, /tmp/hadoop-amgen/hadoop-unjar665443284079561966/] [] /tmp/streamjob722830427268220086.jar tmpDir=null
14/02/02 07:16:52 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/02/02 07:16:53 INFO mapred.FileInputFormat: Total input paths to process : 1
14/02/02 07:16:53 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop amgen/mapred/local]
14/02/02 07:16:53 INFO streaming.StreamJob: Running job: job_201401231022_0068
14/02/02 07:16:53 INFO streaming.StreamJob: To kill this job, run:
14/02/02 07:16:53 INFO streaming.StreamJob: UNDEF/bin/hadoop job  -Dmapred.job.tracker=sa-dpoc16.zs.local:8021 -kill job_201401231022_0068
14/02/02 07:16:53 INFO streaming.StreamJob: Tracking URL: http://sa-dpoc16.zs.local:50030/jobdetails.jsp?jobid=job_201401231022_0068
14/02/02 07:16:54 INFO streaming.StreamJob:  map 0%  reduce 0%

Please let me know if you need any configuration files.

You can check the NameNode logs through the NameNode web UI:

http://yourdomain.com:50070/dfshealth.jsp

There you will find a hyperlink to the NameNode logs, which opens a list of log and XML files. The job logs are usually under the userlogs folder.

You can also track jobs using the JobTracker UI:

http://yourdomain.com:50030/jobtracker.jsp

The job output above includes a link to the job details page (the "Tracking URL" line).
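If the client output has been saved to a file or captured by a wrapper script, the link can be pulled out automatically. The following is only a minimal sketch: `find_tracking_url` is a hypothetical helper name, and the sample text is copied from the log lines in the question.

```python
import re

# Matches the "Tracking URL: <url>" line printed by the streaming client.
TRACKING_RE = re.compile(r"Tracking URL:\s*(\S+)")


def find_tracking_url(client_output):
    """Return the first tracking URL found in the client output, or None."""
    match = TRACKING_RE.search(client_output)
    return match.group(1) if match else None


# Sample lines taken from the job output in the question.
sample = (
    "14/02/02 07:16:53 INFO streaming.StreamJob: Tracking URL: "
    "http://sa-dpoc16.zs.local:50030/jobdetails.jsp?jobid=job_201401231022_0068\n"
    "14/02/02 07:16:54 INFO streaming.StreamJob:  map 0%  reduce 0%\n"
)
print(find_tracking_url(sample))
```

Opening the printed URL in a browser leads to the same job details page as clicking through from the JobTracker UI.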

There you can see whether the mappers are failing, and view each mapper's stdout and stderr to check for Python exceptions.
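Anything a streaming script writes to stderr ends up in that task's stderr log, so it can help to have the mapper report failures there explicitly. The sketch below is an illustration only, not the mapper from the question: `map_line` is a hypothetical word-count style function, and the in-memory streams stand in for the real stdin, stdout and stderr so the logic can be tried locally before submitting a job.

```python
import io
import traceback


def map_line(line):
    """Hypothetical mapper logic: emit one tab-separated (word, 1) record per word."""
    return ["%s\t1" % word for word in line.split()]


def run(stdin, stdout, stderr):
    """Read input lines, write records to stdout, and report failures on stderr."""
    for number, line in enumerate(stdin, 1):
        try:
            for record in map_line(line):
                stdout.write(record + "\n")
        except Exception:
            # This text would appear in the task's stderr log.
            stderr.write("mapper failed on input line %d\n" % number)
            traceback.print_exc(file=stderr)
            raise  # re-raise so the task is still marked as failed


# Local dry run with in-memory streams. In a real mapper script the call
# would instead be run(sys.stdin, sys.stdout, sys.stderr).
out, err = io.StringIO(), io.StringIO()
run(io.StringIO("hello world\n"), out, err)
print(out.getvalue(), end="")
```

Running the script locally against a few lines of the input file is a quick way to catch Python exceptions before they show up only in the task logs.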
