
MapReduce word count example using the Octo module

I have just started learning MapReduce with the Octo module, using the word count example. I am trying to count the words in the files in the hw3data directory (as specified below). My PC acts as both the server and the client.

I opened two terminals in Windows cmd.

server: octo.py server wordcount.py (the server side seems to start without problems)

client: octo.py client localhost (Python apparently cannot find the .txt files I stored in the hw3data directory, so the client says "no work, sleeping")

Can anyone help?

The wordcount.py code is below:

server

import glob

text_files=glob.glob('C:/Python27/octopy-0.1/hw3data/*.txt')

def file_contents(file_name):
    f=open(file_name)
    try:
        return f.read()
    finally:
        f.close()

source=dict((file_name,file_contents(file_name)) for file_name in text_files)

# Write one (key, value) pair per line so the results do not run together.
f=open('outfile','w')
def final(key,value):
    print key,value
    f.write(str((key,value))+'\n')

client

def mapfn(key,value):
    for line in value.splitlines():
        for word in line.split():
            yield word.lower(),1

def reducefn(key,value):
    return key,len(value)
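To rule out a problem in mapfn and reducefn themselves, they can be exercised without any server or client. The following is only a rough sketch: run_locally is a hypothetical helper, not part of octo.py, and the grouping step is an assumption about what a MapReduce framework does between the map and reduce phases. It is written to run on both Python 2 and Python 3.

```python
from collections import defaultdict

def mapfn(key, value):
    # Emit (word, 1) for every whitespace-separated word in the text.
    for line in value.splitlines():
        for word in line.split():
            yield word.lower(), 1

def reducefn(key, value):
    # value is the list of 1s emitted for this word.
    return key, len(value)

def run_locally(source):
    """Hypothetical stand-in for the framework: map, group by key, reduce."""
    grouped = defaultdict(list)
    for name, text in source.items():
        for word, count in mapfn(name, text):
            grouped[word].append(count)
    # reducefn returns (word, count) pairs, so dict() builds word -> count.
    return dict(reducefn(word, counts) for word, counts in grouped.items())

if __name__ == '__main__':
    sample = {'a.txt': 'Hello world\nhello', 'b.txt': 'World'}
    print(run_locally(sample))
```

If this gives sensible counts on a small in-memory dict, the functions are fine and the remaining suspect is an empty source dict on the server side.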

Check whether your data files actually have ".txt" in their names. I am currently working on this same problem for my homework #3. Good luck!

Change the following line

text_files=glob.glob('C:/Python27/octopy-0.1/hw3data/*.txt')

to

text_files=glob.glob('C:/Python27/octopy-0.1/hw3data/*')

and try again. My guess is that the files in the folder do not have a .txt extension, so the original pattern matches nothing and the server has no work to hand out.
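Before changing the pattern, it may help to see what glob actually matches compared with what is in the directory. This is a small diagnostic sketch, not part of octo.py; describe_inputs is a hypothetical helper name, and the path is the one from the question.

```python
import glob
import os

def describe_inputs(data_dir, pattern='*.txt'):
    """Return (paths matching pattern, all entry names) for data_dir."""
    matched = sorted(glob.glob(os.path.join(data_dir, pattern)))
    # os.listdir raises on a missing directory, so guard with isdir.
    all_entries = sorted(os.listdir(data_dir)) if os.path.isdir(data_dir) else []
    return matched, all_entries

if __name__ == '__main__':
    data_dir = 'C:/Python27/octopy-0.1/hw3data'  # path taken from the question
    matched, all_entries = describe_inputs(data_dir)
    print('directory exists: %s' % os.path.isdir(data_dir))
    print('entries in directory: %s' % all_entries)
    print('entries matching *.txt: %s' % matched)
```

An empty match list alongside a non-empty directory listing points to the extension problem described above. If the directory itself is reported as missing, the path is wrong.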
