Pipeline can't find nodes in Kedro

Question

I am following the pipeline tutorial and have created all the required files. When I start Kedro with kedro run --node=preprocessing_data, I get this error message:

ValueError: Pipeline does not contain nodes named ['preprocessing_data'].

If I run Kedro without the node argument, I get

kedro.context.context.KedroContextError: Pipeline contains no nodes

File contents:

src/project/pipelines/data_engineering/nodes.py
def preprocess_data(data: SparkDataSet) -> None:
    print(data)
    return

src/project/pipelines/data_engineering/pipeline.py
def create_pipeline(**kwargs):
    return Pipeline(
        [
            node(
                func=preprocess_data,
                inputs="data",
                outputs="preprocessed_data",
                name="preprocessing_data",
            ),
        ]
    )

src/project/pipeline.py
def create_pipelines(**kwargs) -> Dict[str, Pipeline]:
    de_pipeline = de.create_pipeline()
    return {
        "de": de_pipeline,
        "__default__": Pipeline([])
    }

Answer 1

I think you need to register the pipeline under __default__. For example:

def create_pipelines(**kwargs) -> Dict[str, Pipeline]:
    de_pipeline = de.create_pipeline()
    return {
        "de": de_pipeline,
        "__default__": de_pipeline
    }

After that, kedro run --node=preprocessing_data works for me.
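To see why an empty __default__ entry produces both errors from the question, here is a minimal sketch in plain Python. It is a simplified model, not Kedro's actual implementation: a "pipeline" is just a list of node names, and `select_nodes`, `broken`, and `fixed` are hypothetical names introduced only for illustration. The assumption is that a run without an explicit pipeline name falls back to the "__default__" entry, and that any node filter is applied to that entry.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in: a "pipeline" is modelled as a list of node names.
Registry = Dict[str, List[str]]


def select_nodes(
    registry: Registry,
    pipeline_name: Optional[str] = None,
    node_names: Optional[List[str]] = None,
) -> List[str]:
    """Pick the nodes a run would execute, mimicking the errors in the question."""
    # Assumption: without an explicit pipeline name, "__default__" is used.
    pipeline = registry[pipeline_name or "__default__"]
    if node_names:
        # A node filter can only match nodes that the chosen pipeline contains.
        missing = [name for name in node_names if name not in pipeline]
        if missing:
            raise ValueError(f"Pipeline does not contain nodes named {missing}.")
        return [name for name in pipeline if name in node_names]
    if not pipeline:
        raise RuntimeError("Pipeline contains no nodes")
    return list(pipeline)


de_pipeline = ["preprocessing_data"]

# Registry from the question: "__default__" is empty.
broken = {"de": de_pipeline, "__default__": []}

# Registry from this answer: "__default__" points at the same pipeline as "de".
fixed = {"de": de_pipeline, "__default__": de_pipeline}
```

With `broken`, calling `select_nodes(broken, node_names=["preprocessing_data"])` raises the ValueError and `select_nodes(broken)` raises the "no nodes" error, while the same calls against `fixed` return the node. Selecting the pipeline explicitly, as in `select_nodes(broken, pipeline_name="de")`, also works even with the empty default.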

Answer 2

Mayurc is right: there are no nodes because your __default__ pipeline is empty. Another option is to run only the de pipeline from the CLI.

kedro run --pipeline de

You can find this option and more in the help text of the run command.

$ kedro run --help

Usage: kedro run [OPTIONS]

  Run the pipeline.

Options:
  --from-inputs TEXT        A list of dataset names which should be used as a
                            starting point.
  --from-nodes TEXT         A list of node names which should be used as a
                            starting point.
  --to-nodes TEXT           A list of node names which should be used as an
                            end point.
  -n, --node TEXT           Run only nodes with specified names.
  -r, --runner TEXT         Specify a runner that you want to run the pipeline
                            with. This option cannot be used together with
                            --parallel.
  -p, --parallel            Run the pipeline using the `ParallelRunner`. If
                            not specified, use the `SequentialRunner`. This
                            flag cannot be used together with --runner.
  -e, --env TEXT            Run the pipeline in a configured environment. If
                            not specified, pipeline will run using environment
                            `local`.
  -t, --tag TEXT            Construct the pipeline using only nodes which have
                            this tag attached. Option can be used multiple
                            times, what results in a pipeline constructed from
                            nodes having any of those tags.
  -lv, --load-version TEXT  Specify a particular dataset version (timestamp)
                            for loading.
  --pipeline TEXT           Name of the modular pipeline to run. If not set,
                            the project pipeline is run by default.
  -c, --config FILE         Specify a YAML configuration file to load the run
                            command arguments from. If command line arguments
                            are provided, they will override the loaded ones.
  --params TEXT             Specify extra parameters that you want to pass to
                            the context initializer. Items must be separated
                            by comma, keys - by colon, example:
                            param1:value1,param2:value2. Each parameter is
                            split by the first comma, so parameter values are
                            allowed to contain colons, parameter keys are not.
  -h, --help                Show this message and exit.
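The --params format described in the help text can be illustrated with a small parser. This is only a sketch of the documented format, not Kedro's own parsing code, and `parse_params` is a hypothetical helper. It assumes that items are separated by commas and that each item is split into key and value at its first colon; the help text says "first comma", but its remark that values may contain colons while keys may not suggests the first colon is meant. Values are kept as plain strings here; Kedro itself may convert types.

```python
from typing import Dict


def parse_params(raw: str) -> Dict[str, str]:
    """Parse 'key1:value1,key2:value2' into a dict (illustrative only)."""
    params: Dict[str, str] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input or a trailing comma
        # Split on the first colon only, so the value may contain colons.
        key, sep, value = item.partition(":")
        if not sep or not key:
            raise ValueError(f"Expected 'key:value', got {item!r}")
        params[key.strip()] = value.strip()
    return params
```

For example, `parse_params("param1:value1,param2:value2")` yields a dictionary with the two keys, and a value such as `url:http://host:80` keeps everything after the first colon as the value.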

Posting this as a second answer because the full help output does not fit in a comment.

Answer 1: 8, accepted, 2020-02-23 03:14:40

Answer 2: 5, 2020-02-25 19:24:23