How much can Airflow scale?

Has anyone reported how far they have been able to scale Airflow at their company? I'm looking at implementing Airflow to execute 5,000+ tasks that will each run hourly, and to someday scale that up to 20,000+ tasks. From examining the scheduler, it looks like it might be a bottleneck, since only one instance of it can run, and I'm concerned that with that many tasks the scheduler will struggle to keep up. Should I be?

We run thousands of tasks a day at my company and have been using Airflow for the better part of two years. These DAGs run every 15 minutes and are generated from config files that can change at any time (fed in from a UI).

The short answer: yes, it can definitely scale to that, depending on your infrastructure. Some of the new 1.10 features should make this easier than it is on the 1.8 version we use to run all those tasks. We ran this on a large Mesos/DCOS cluster that took a good deal of fine-tuning to get to a stable point.
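For a rough sense of what those numbers mean for the scheduler, here is a back-of-the-envelope calculation of the average task start rate. It is only a sketch: it assumes the tasks are spread evenly across the hour, which is rarely true in practice (hourly tasks tend to start together at the top of the hour), so real peaks will be higher. The function name is made up for illustration.

```python
def tasks_per_second(task_count: int, interval_seconds: int = 3600) -> float:
    """Average task starts per second when task_count tasks each run once per interval."""
    return task_count / interval_seconds


# The two task counts mentioned in the question, each running hourly.
for count in (5_000, 20_000):
    rate = tasks_per_second(count)
    print(f"{count} hourly tasks: {rate:.2f} task starts/second on average, "
          f"{count * 24} task runs/day")
```

So the question is about sustaining roughly 1.4 task starts per second today and about 5.6 later, on average.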

The long answer: although it can scale to that, we've found that a better solution is multiple Airflow instances with different configurations (scheduler settings, number of workers, etc.), each optimized for the types of DAGs it runs. A set of DAGs that run long-running machine learning jobs should be hosted on a different Airflow instance from the ones running 5-minute ETL jobs. This also makes it easier for different teams to maintain the jobs they are responsible for and to iterate on any fine-tuning that's needed.
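To make the multi-instance idea concrete, here is a minimal sketch of how one might describe two instances with different tuning and decide which one a DAG belongs on. Everything in it is illustrative rather than what we actually run: the profile names, the runtime cutoffs, and the numeric values are made up. The environment variable names follow Airflow's `AIRFLOW__{SECTION}__{KEY}` convention for the `[core]` options `parallelism` and `dag_concurrency` (the latter was renamed `max_active_tasks_per_dag` in later 2.x releases, so adjust for your version).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class InstanceProfile:
    """Hypothetical description of one Airflow instance and its tuning."""
    name: str
    parallelism: int          # [core] parallelism
    dag_concurrency: int      # [core] dag_concurrency (1.x naming)
    max_runtime_minutes: int  # DAGs expected to finish within this go here

    def env(self) -> Dict[str, str]:
        """Environment variables that would override airflow.cfg for this instance."""
        return {
            "AIRFLOW__CORE__PARALLELISM": str(self.parallelism),
            "AIRFLOW__CORE__DAG_CONCURRENCY": str(self.dag_concurrency),
        }


# Illustrative values only: many small slots for short ETL, few for long ML jobs.
PROFILES = [
    InstanceProfile("short-etl", parallelism=128, dag_concurrency=32, max_runtime_minutes=15),
    InstanceProfile("long-ml", parallelism=16, dag_concurrency=4, max_runtime_minutes=24 * 60),
]


def assign_instance(expected_runtime_minutes: int) -> InstanceProfile:
    """Pick the instance with the smallest runtime cutoff that still fits the DAG."""
    ordered = sorted(PROFILES, key=lambda p: p.max_runtime_minutes)
    for profile in ordered:
        if expected_runtime_minutes <= profile.max_runtime_minutes:
            return profile
    # Anything longer than every cutoff falls back to the longest-running profile.
    return ordered[-1]


print(assign_instance(5).name)    # a 5-minute ETL DAG
print(assign_instance(180).name)  # a 3-hour training DAG
print(PROFILES[0].env())
```

Splitting on expected runtime is just one possible rule; splitting by owning team works equally well with the same structure.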
