I just started learning MapReduce with the Octo module (octo.py), using the word count example. I am trying to count the words in the files in the hw3data directory (the path is given in the code below). My PC acts as both the server and the client.
I opened two Windows cmd terminals:
server: octo.py server wordcount.py
The server seems to start without problems.
client: octo.py client localhost
The client only reports "no work, sleeping", so it seems that Python can't find the .txt files I stored in the hw3data directory. Can anyone help?
The wordcount.py code is below
import glob

text_files=glob.glob('C:/Python27/octopy-0.1/hw3data/*.txt')

def file_contents(file_name):
    f=open(file_name)
    try:
        return f.read()
    finally:
        f.close()

source=dict((file_name,file_contents(file_name)) for file_name in text_files)

f=open('outfile','w')

def final(key,value):
    print key,value
    f.write(str((key,value)))

def mapfn(key,value):
    for line in value.splitlines():
        for word in line.split():
            yield word.lower(),1

def reducefn(key,value):
    return key,len(value)
Check whether your data files actually have ".txt" in their names. I'm currently working on the same problem for my homework #3. Good luck!
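One quick way to run that check is to compare what the directory actually contains with what the glob pattern matches. This is only a minimal sketch: the helper name `report_matches` is made up for illustration, and the path is the one from the question.

```python
import glob
import os


def report_matches(directory, pattern="*.txt"):
    """Return (all_entries, matched) for a directory and a glob pattern."""
    # Every name in the directory, regardless of extension.
    all_entries = sorted(os.listdir(directory))
    # Only the names that the glob pattern picks up.
    matched = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(directory, pattern))
    )
    return all_entries, matched


if __name__ == "__main__":
    data_dir = "C:/Python27/octopy-0.1/hw3data"  # path taken from the question
    if os.path.isdir(data_dir):
        entries, matched = report_matches(data_dir)
        print("files in directory: %d" % len(entries))
        print("files matching *.txt: %d" % len(matched))
    else:
        print("directory not found: %s" % data_dir)
```

If the first count is non-zero and the second is zero, the files are there but do not end in ".txt", so `text_files` in wordcount.py would be an empty list.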
Change the following code
text_files=glob.glob('C:/Python27/octopy-0.1/hw3data/*.txt')
to
text_files=glob.glob('C:/Python27/octopy-0.1/hw3data/*')
and try again. My guess is that the files in the folder do not have extensions.
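If you are not sure which case applies, the two patterns can be combined. The sketch below is an assumption about how you might structure it, not part of octo.py: `find_input_files` is a hypothetical helper that tries "*.txt" first and falls back to "*" only when nothing matches.

```python
import glob
import os


def find_input_files(directory):
    """Prefer *.txt files; fall back to every regular file if none match."""
    files = glob.glob(os.path.join(directory, "*.txt"))
    if not files:
        # No .txt files found, so take whatever is in the directory.
        files = glob.glob(os.path.join(directory, "*"))
    # Skip subdirectories so that open() is never called on them.
    return sorted(f for f in files if os.path.isfile(f))
```

In wordcount.py this would replace the glob line, for example text_files=find_input_files('C:/Python27/octopy-0.1/hw3data'). If the result is still an empty list, the path itself is probably wrong.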