
How to start a mapreduce job from cron on GAE Python

I have a mapreduce job defined in mapreduce.yaml:

mapreduce:
- name: JobName 
  mapper:
    input_reader: google.appengine.ext.mapreduce.input_readers.DatastoreInputReader
    handler: handler_name
    params:
    - name: entity_kind
      default: KindName

How can I start it from cron? Is there a URL that runs it?

You can start a mapreduce task from any kind of App Engine handler using control.py:

from mapreduce import control

mapreduce_id = control.start_map(
    "My Mapper",
    "main.my_mapper",
    "mapreduce.input_readers.DatastoreInputReader",
    {"entity_kind": "models.MyEntity"},
    shard_count=10)
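
To connect this to cron, the start_map call can sit inside a request handler that a cron.yaml entry points at. The following is only a minimal sketch under some assumptions: the function name start_job_from_cron is hypothetical, the start function is passed in as an argument so the snippet runs without the mapreduce library installed, and the job arguments are the same ones used in the call above. The X-Appengine-Cron header is what App Engine attaches to cron requests, so checking it keeps outside callers from triggering the job.

```python
CRON_HEADER = "X-Appengine-Cron"


def start_job_from_cron(headers, start_map):
    """Start the mapper job only if the request came from App Engine cron.

    headers: mapping of request headers (a plain dict is case-sensitive;
        framework header objects usually are not).
    start_map: a callable with the positional shape of control.start_map.
    Returns an (http_status, body) tuple.
    """
    # App Engine sets this header on cron requests and strips it from
    # requests that originate outside the platform.
    if headers.get(CRON_HEADER) != "true":
        return 403, "cron only"

    # Same arguments as the control.start_map call in the answer above.
    job_id = start_map(
        "My Mapper",
        "main.my_mapper",
        "mapreduce.input_readers.DatastoreInputReader",
        {"entity_kind": "models.MyEntity"},
        shard_count=10,
    )
    return 200, str(job_id)


# Assumed wiring inside a real handler (not executed here):
#   from mapreduce import control
#   status, body = start_job_from_cron(self.request.headers, control.start_map)
```

A cron.yaml entry would then use the URL this handler is routed to (for example a hypothetical /tasks/start_mapreduce) as its url value.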

Yes. The Getting Started page shows that you set the URL in your app.yaml:

handlers:
- url: /mapreduce(/.*)?
  script: mapreduce/main.py
  login: admin

You can then schedule it with cron in the usual App Engine fashion, which in this example means writing a cron.yaml like this:

cron:
- description: daily summary job
  url: /mapreduce
  schedule: every 24 hours


 