Why do we need TFX if we have Airflow for orchestration?
I still don't get why we need TFX. TFX converts the pipeline you define into an Airflow DAG and runs it on Airflow. Couldn't I just write my pipelines in Python and use Airflow's PythonOperator to build a pipeline directly? Why bother learning another wrapper on top of it? What else does TFX offer that cannot be done with just Airflow + TF + Spark/Beam?
"I could just write my pipelines in Python and use Airflow's PythonOperator to build a pipeline directly, right?"
You can! It depends on how you define a pipeline, of course.
Here is the definition of TFX, from its guide:
"TFX is a Google-production-scale machine learning (ML) platform based on TensorFlow. It provides a configuration framework and shared libraries to integrate common components needed to define, launch, and monitor your machine learning system."
And to make a production ML system, according to engineers at TensorFlow: [figure]
So, if you can define a whole system that covers all these steps in Airflow DAGs, then sure, you don't need TFX.
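To make "covering the steps yourself" a bit more concrete, here is a minimal sketch in plain Python. The step names (`ingest`, `validate`, `transform`, `train`), the toy data, and the `run_pipeline` helper are all my own assumptions for illustration; they are not TFX's component list and not anything from the guide. In a real Airflow setup each function would probably be passed to a `PythonOperator` as its `python_callable`, with the data handed between tasks through storage rather than return values.

```python
from statistics import mean

# Hypothetical hand-rolled pipeline steps; names and data are illustrative only.

def ingest():
    """Return raw rows; a stand-in for reading from a real data source."""
    return [
        {"x": 1.0, "y": 2.0},
        {"x": 2.0, "y": 4.0},
        {"x": None, "y": 6.0},  # a bad record the validation step should catch
        {"x": 4.0, "y": 8.0},
    ]

def validate(rows):
    """Drop rows with missing values (a very crude data-validation step)."""
    return [r for r in rows if r["x"] is not None and r["y"] is not None]

def transform(rows):
    """Center the feature using a statistic computed over the whole dataset."""
    x_mean = mean(r["x"] for r in rows)
    centered = [{"x": r["x"] - x_mean, "y": r["y"]} for r in rows]
    return centered, x_mean

def train(rows):
    """Fit y = slope * x + intercept by least squares on the centered feature."""
    y_mean = mean(r["y"] for r in rows)
    sxx = sum(r["x"] ** 2 for r in rows)
    sxy = sum(r["x"] * (r["y"] - y_mean) for r in rows)
    return {"slope": sxy / sxx, "intercept": y_mean}

def run_pipeline():
    """Run the steps in dependency order, as a DAG scheduler would."""
    rows = validate(ingest())
    features, x_mean = transform(rows)
    model = train(features)
    model["x_mean"] = x_mean  # keep the training-time statistic for serving
    return model

model = run_pipeline()
print(model)
```

Even in this toy version you have to remember to carry `x_mean` from training to serving yourself. That kind of bookkeeping is what you take on when you build every step by hand.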
PS:
It comes down to the problem you are trying to solve. Here are some questions to think about.
Do you have the data you need at hand, and is it valuable?
Do you need to adjust it before giving it to a model?
Which model should you use?
Are you going to re-train the model as you get new data? If so, how often should that happen?
When you run inference (that is, serve your model), how are you going to use the predictions?
What is your threshold for evaluating the success of your service? What metrics should you use?
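As a rough illustration of the last two questions, here is a sketch of a check you might run after each re-training: compute a metric on held-out data and only promote the new model if it clears a threshold and does not regress against the current one. The metric (mean absolute error), the `max_mae` value, and the function names are assumptions chosen for the example, not a recommendation.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def should_push(candidate_mae, baseline_mae, max_mae=0.5):
    """Promote the candidate only if it meets the absolute threshold
    and is no worse than the model currently being served."""
    return candidate_mae <= max_mae and candidate_mae <= baseline_mae

# Toy held-out labels and predictions from two hypothetical models.
y_true = [2.0, 4.0, 8.0]
baseline_pred = [2.5, 4.5, 7.0]
candidate_pred = [2.0, 4.25, 7.75]

baseline_mae = mean_absolute_error(y_true, baseline_pred)
candidate_mae = mean_absolute_error(y_true, candidate_pred)
decision = should_push(candidate_mae, baseline_mae)
print(baseline_mae, candidate_mae, decision)
```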