
Hadoop wordcount example in R

I installed hadoop-3.0.0-alpha2 and I'm trying to run a MapReduce wordcount example. I created the mapper.R and reducer.R scripts, but when I try to run the job with

hadoop jar /home/rania/Downloads/hadoop-streaming-0.20.204.0.jar \
-file  /home/rania/Downloads/mapper.R  -mapper /home/rania/Downloads/mapper.R \
-file /home/rania/Downloads/reducer.R  -reducer /home/rania/Downloads/reducer.R \
-input /readme -output /RCount

I get the following output:

2017-06-04 08:12:42,252 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-06-04 08:12:43,119 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
packageJobJar: [/home/rania/Downloads/mapper.R, /home/rania/Downloads/reducer.R] [] /tmp/streamjob5589642909909116910.jar tmpDir=null
2017-06-04 08:12:43,303 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2017-06-04 08:12:43,603 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2017-06-04 08:12:43,734 ERROR streaming.StreamJob: Error launching job , Output path already exists : Output directory hdfs://localhost:9000/RCount already exists
Streaming Job Failed!

What could be wrong? Thanks!

Run the job with an output directory that does not already exist on HDFS. Hadoop creates the directory itself, under whatever name you pass to -output, and refuses to start if that path is already there. That is what the "Output directory hdfs://localhost:9000/RCount already exists" error in your log means. If you want to reuse /RCount, delete the directory and its contents first, then run the job again.
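
As a rough sketch of that cleanup-then-rerun sequence, something like the script below should work. The jar, script, and input paths are copied from the question. The run and rerun_job helpers and the DRY_RUN and OUTPUT_DIR variables are names made up for this example, not part of Hadoop. Because the first step deletes data, the sketch only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: clear the old output directory, then resubmit the streaming job.
# OUTPUT_DIR, DRY_RUN, run and rerun_job are illustrative names, not Hadoop's.
OUTPUT_DIR="${OUTPUT_DIR:-/RCount}"

# Print the command when DRY_RUN is 1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rerun_job() {
    out_dir="$1"

    # -r deletes recursively; -f suppresses the error if the path is absent.
    run hdfs dfs -rm -r -f "$out_dir" || return 1

    # Same streaming command as in the question, with the output path as a variable.
    run hadoop jar /home/rania/Downloads/hadoop-streaming-0.20.204.0.jar \
        -file /home/rania/Downloads/mapper.R -mapper /home/rania/Downloads/mapper.R \
        -file /home/rania/Downloads/reducer.R -reducer /home/rania/Downloads/reducer.R \
        -input /readme -output "$out_dir"
}

rerun_job "$OUTPUT_DIR"
```

Alternatively, skip the delete and pass a fresh name to -output on each run (for example /RCount2), which keeps the earlier results around for comparison.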
