
How to build an image for a KubeFlow pipeline?

I recently found out about Kubeflow and Kubeflow Pipelines, but it is not clear to me how to build an image from my Python program.

Let's assume that I have a simple Python function that crops images:

class Image_Proc:
    def crop_image(self, image, start_pixel, end_pixel):
        # crop
        return cropped_image

How should I containerize this and use it in a KubeFlow pipeline? Do I need to wrap it in an API (with Flask, for example), or do I need to connect to some media/data broker?

How does a KubeFlow pipeline send input to this code and transfer its output to the next step?

Basically, you can follow the steps provided by Docker to create a Docker image and publish it to Docker Hub (you can also set up your own private Docker registry, but I think that may be too much work for a beginner). Roughly, the steps are:

  1. Create a Dockerfile. In it, specify a few things: the base image (in your case, just use the python image from Docker), the working directory, and the commands to execute when the image is run.
  2. Run your image locally to make sure it works as expected (install Docker first if you haven't), then push it to Docker Hub.
  3. Once published, you will have the image URL on Docker Hub; use that URL when you create pipelines in Kubeflow.
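The build, run, and push steps above can be sketched as a small Python helper that assembles the Docker CLI commands. This is only an illustration: the account name, repository name, and tag are hypothetical, and the function names are made up for this example. The helper builds the command lists without running them, so you can inspect them first.

```python
def image_reference(user, repo, tag):
    """Build a Docker Hub style image reference: user/repo:tag."""
    return f"{user}/{repo}:{tag}"


def docker_commands(reference, context_dir="."):
    """Return the CLI commands for the build, local run, and push steps."""
    return [
        # Build the image from the Dockerfile in context_dir and tag it.
        ["docker", "build", "-t", reference, context_dir],
        # Run it once locally and remove the container afterwards.
        ["docker", "run", "--rm", reference],
        # Publish it to the registry (requires a prior `docker login`).
        ["docker", "push", reference],
    ]


# Hypothetical account, repository, and tag, used only for illustration.
reference = image_reference("your-dockerhub-user", "crop-image", "0.1")
for command in docker_commands(reference):
    print(" ".join(command))
```

To actually execute the steps, each list could be passed to `subprocess.run(command, check=True)`. The resulting reference is the value you would then give as the image of the pipeline step.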

Also, you can read this doc to learn how to create pipelines (a Kubeflow pipeline is just an Argo workflow). For your case, just fill in the inputs and/or outputs sections of the step you want in the pipeline YAML file.
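On the question of Flask versus a broker: neither should be needed. A containerized step is typically just a command-line program that receives parameters and file paths as arguments, reads its input file, and writes its output file for the next step. Below is a minimal sketch of such an entry point. The flag names and the `crop_file` helper are assumptions made for this example, and the "crop" is a stand-in that slices raw bytes rather than real pixels.

```python
import argparse
import os


def crop_file(image_path, start_pixel, end_pixel, cropped_image_path):
    """Stand-in crop: copy bytes [start_pixel, end_pixel) to the output file."""
    with open(image_path, "rb") as src:
        data = src.read()
    # The output's parent directory may not exist yet, so create it first.
    parent = os.path.dirname(cropped_image_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(cropped_image_path, "wb") as dst:
        dst.write(data[start_pixel:end_pixel])


def main(argv=None):
    """Parse the command-line arguments and run the crop."""
    parser = argparse.ArgumentParser(description="Crop an image file.")
    parser.add_argument("--image-path", required=True)
    parser.add_argument("--start-pixel", type=int, required=True)
    parser.add_argument("--end-pixel", type=int, required=True)
    parser.add_argument("--cropped-image-path", required=True)
    args = parser.parse_args(argv)
    crop_file(
        args.image_path,
        args.start_pixel,
        args.end_pixel,
        args.cropped_image_path,
    )
```

When saved as a script, add the usual `if __name__ == "__main__": main()` guard and make the script the command that the image runs. The pipeline step's inputs and outputs sections would then supply the two paths and the two integers as arguments.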

  1. You do not need to build images. For small to medium-sized components you can work on top of existing images. Check the lightweight components sample. For Python, see "Data passing in python components"; for non-Python, see "Creating components from command-line programs".

  2. KFP SDK has some support for building container images. See the container_build sample.

  3. Read the official component authoring documentation.

Let's assume that I have a simple python function that crops images:

You can just create a component from a Python function like this:

import kfp
from kfp.components import InputPath, OutputPath, create_component_from_func

# Declare function (with annotations)
def crop_image(
    image_path: InputPath(),
    start_pixel: int,
    end_pixel: int,
    cropped_image_path: OutputPath(),
):
    import some_image_lib
    some_image_lib.crop(image_path, start_pixel, end_pixel, cropped_image_path)

# Create component
crop_image_op = create_component_from_func(
  crop_image,
  # base_image=..., # Optional. Base image that has most of the packages that you need. E.g. tensorflow/tensorflow:2.2.0
  packages_to_install=['some_image_lib==1.2.3'],
  output_component_file='component.yaml', # Optional. Use this to share the component between pipelines, teams or people in the world
)

# Create pipeline
def my_pipeline():
    download_image_task = download_image_op(...)

    # The input is named `image`: KFP drops the `_path` suffix from `image_path`.
    crop_image_task = crop_image_op(
        image=download_image_task.output,
        start_pixel=10,
        end_pixel=200,
    )

# Submit pipeline
kfp.Client(host=...).create_run_from_pipeline_func(my_pipeline, arguments={})

