
MapReduce wordcount example using the Octo module

I just started learning MapReduce with the Octo module, using the word count example. I am trying to count the words in the files in the directory hw3data (as specified below). My PC acts as both the server and the client.

I started from the Windows cmd with 2 terminals:

server: octo.py server wordcount.py. The server side seems to start without any problem.

client: octo.py client localhost. It seems that Python can't find the txt files I stored in the hw3data dir, so the client says "no work, sleeping". Can anyone help?

The wordcount.py code is below.

wordcount.py

server

import glob

text_files=glob.glob('C:/Python27/octopy-0.1/hw3data/*.txt')

def file_contents(file_name):
    f=open(file_name)
    try:
        return f.read()
    finally:
        f.close()

source=dict((file_name,file_contents(file_name)) for file_name in text_files)

f=open('outfile','w')
def final(key,value):
    print key,value
    f.write(str((key,value)))

client

def mapfn(key,value):
    for line in value.splitlines():
        for word in line.split():
            yield word.lower(),1

def reducefn(key,value):
    return key,len(value)
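One way to confirm that the map and reduce functions themselves are fine is to run them in a single process, without Octo. The sketch below is only an illustration: `run_locally` is a hypothetical helper, not part of Octo, and it merely imitates the map, group-by-key, and reduce steps on an in-memory dict. It also shows that an empty `source` dict yields nothing to process, which would be consistent with the client reporting no work.

```python
from collections import defaultdict

def mapfn(key, value):
    # Emit (word, 1) for every word in the file contents.
    for line in value.splitlines():
        for word in line.split():
            yield word.lower(), 1

def reducefn(key, value):
    # value is the list of 1s collected for this word.
    return key, len(value)

def run_locally(source):
    """Hypothetical in-process stand-in for the map/group/reduce steps."""
    grouped = defaultdict(list)
    for key, value in source.items():
        for out_key, out_value in mapfn(key, value):
            grouped[out_key].append(out_value)
    return dict((k, reducefn(k, v)) for k, v in grouped.items())

result = run_locally({'a.txt': 'The cat\nthe dog'})
empty_result = run_locally({})  # no input files means no map tasks at all
```

If `result` looks right here but the real run still does nothing, the input side (the `source` dict built by the server part) is the more likely culprit.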

Check whether your data files actually have ".txt" in their names. I'm currently working on this same problem for my homework #3. Good luck!
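A quick way to check this is to compare what the glob pattern matches with what is actually in the directory. This is a minimal sketch, assuming the path from the question; `report_matches` is a hypothetical helper name, and the code is written to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function
import glob
import os

def report_matches(data_dir, pattern='*.txt'):
    """Return (paths matching pattern, all directory entries) for data_dir."""
    matched = sorted(glob.glob(os.path.join(data_dir, pattern)))
    # listdir raises on a missing directory, so guard it and return [] instead.
    all_entries = sorted(os.listdir(data_dir)) if os.path.isdir(data_dir) else []
    return matched, all_entries

if __name__ == '__main__':
    # Path taken from the question; adjust it to your own setup.
    matched, all_entries = report_matches('C:/Python27/octopy-0.1/hw3data')
    print('files matching *.txt:', len(matched))
    print('everything in the directory:', all_entries)
```

If the first number is 0 while the second list is not empty, the files exist but their names do not end in ".txt". If both are empty, the directory path itself is probably wrong.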

Change the following code

text_files=glob.glob('C:/Python27/octopy-0.1/hw3data/*.txt')

to

text_files=glob.glob('C:/Python27/octopy-0.1/hw3data/*')

and try again. I guess the files in the folder do not have a ".txt" extension.
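One caveat with a bare `*` pattern is that it also matches subdirectories, and calling `open()` on a directory fails. A slightly more defensive way to build the `source` dict might look like the sketch below; `load_sources` is a hypothetical helper name, not something Octo provides.

```python
import glob
import os

def load_sources(data_dir):
    """Map each regular file in data_dir to its text, whatever its extension."""
    source = {}
    for path in glob.glob(os.path.join(data_dir, '*')):
        if os.path.isfile(path):  # '*' also matches subdirectories; skip them
            with open(path) as f:
                source[path] = f.read()
    return source
```

In wordcount.py this would replace the `text_files` and `source` lines, for example `source = load_sources('C:/Python27/octopy-0.1/hw3data')`.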
