Pydoop mapreduce “AttributeError: module 'wordcount_minimal' has no attribute '__main__'”
I installed Pydoop and am trying to run MapReduce jobs. As a dry run, I tried executing the word count examples wordcount_minimal.py and wordcount_full.py. Both of them hang at the map phase. At the end of stderr, I find one of these messages, depending on which script I run:
module 'wordcount_minimal' has no attribute '__main__'
or
module 'wordcount_full' has no attribute '__main__'
I executed the job using the command:
pydoop submit --upload-file-to-cache wordcount_full.py wordcount_full hdfs_input_dir hdfs_output_dir
I cannot find the reason behind this. Any idea what could be causing it?
I was able to run the pydoop script example using the map and reduce functions, and it completed successfully. With pydoop submit, however, I hit this issue. I am not sure whether I am missing something.
PS: I have a cluster with 2 nodes running Hortonworks HDP 2.6.5. Pydoop is installed on both of them.
By default, pydoop submit expects the module to define an entry point called __main__, but you can change this via --entry-point. For instance, if your code is:
    import pydoop.mapreduce.pipes as pipes

    class Mapper ...
    class Reducer ...

    def run():
        # Hand the Mapper/Reducer pair to the Pydoop runtime
        pipes.run_task(pipes.Factory(Mapper, Reducer))
You can run it via pydoop submit --entry-point run ...
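To see where a message like this comes from, here is a minimal sketch of name-based entry-point lookup in plain Python. It assumes the launcher resolves the entry point with something like getattr on the imported module. The resolve_entry_point helper and the stand-in module are hypothetical and are not Pydoop's actual implementation.

```python
import types


def resolve_entry_point(module, entry_point="__main__"):
    """Look up a callable by name on a module (hypothetical helper)."""
    return getattr(module, entry_point)


# Hypothetical stand-in for a user script that only defines run().
script = types.ModuleType("wordcount_minimal")
script.run = lambda: "task started"

# Looking up the default name fails because the module never defines it.
try:
    resolve_entry_point(script)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# Naming the function explicitly, in the spirit of --entry-point run, succeeds.
result = resolve_entry_point(script, "run")()
print(result)

# Exposing the same function under the default name also satisfies the lookup.
script.__main__ = script.run
default_result = resolve_entry_point(script)()
print(default_result)
```

Under this assumption there are two ways out: pass the name of the function your script actually defines, or define a function with the default name in the script itself.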