
Pydoop mapreduce “AttributeError: module 'wordcount_minimal' has no attribute '__main__'”

I installed Pydoop and am trying to run MapReduce jobs. As a dry run, I tried executing the word count examples wordcount_minimal.py and wordcount_full.py. Both of them hang at the map phase. At the end of the stderr, I find this message, depending on which script I ran:

module 'wordcount_minimal' has no attribute '__main__'

or

module 'wordcount_full' has no attribute '__main__'

I executed the job using the command:

pydoop submit --upload-file-to-cache wordcount_full.py wordcount_full hdfs_input_dir hdfs_output_dir

I am unable to find the reason behind this. Any idea what could be causing it?

I was able to execute the example with pydoop script using the map and reduce functions, and it completed successfully (a sketch of that form is shown below). But with the pydoop submit option I have this issue. I'm not sure if I am missing something.
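For reference, this is a minimal sketch of the function-based form that pydoop script expects; the function signatures and the writer.emit interface are assumed from the Pydoop documentation, not taken from the original scripts:

def mapper(_, text, writer):
    # emit each word in the input line with a count of 1
    for word in text.split():
        writer.emit(word, "1")

def reducer(word, counts, writer):
    # sum the per-word counts emitted by the mappers
    writer.emit(word, sum(map(int, counts)))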

PS: I have a cluster with 2 nodes running Hortonworks HDP 2.6.5. Pydoop is installed on both of them.

By default, pydoop submit expects an entry point called __main__, but you can modify this via --entry-point. For instance, if your code is:

import pydoop.mapreduce.api as api
import pydoop.mapreduce.pipes as pipes

class Mapper(api.Mapper): ...
class Reducer(api.Reducer): ...

def run():
    pipes.run_task(pipes.Factory(Mapper, Reducer))

You can run it via pydoop submit --entry-point run ...
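Alternatively, you can keep the default and name the entry point __main__, so that no --entry-point flag is needed. The following is a minimal word-count sketch in that style, assuming the Mapper/Reducer classes and the pipes.run_task/pipes.Factory calls from the Pydoop MapReduce API; it is an illustration, not the exact contents of the original scripts:

import pydoop.mapreduce.api as api
import pydoop.mapreduce.pipes as pipes

class Mapper(api.Mapper):
    def map(self, context):
        # emit each word in the input line with a count of 1
        for word in context.value.split():
            context.emit(word, 1)

class Reducer(api.Reducer):
    def reduce(self, context):
        # sum the counts collected for each word
        context.emit(context.key, sum(context.values))

def __main__():
    # the default entry point that pydoop submit looks for
    pipes.run_task(pipes.Factory(Mapper, Reducer))

With an entry point named __main__ in place, the original command, pydoop submit --upload-file-to-cache wordcount_full.py wordcount_full hdfs_input_dir hdfs_output_dir, should locate it without any extra flags.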
