
Spark Structured Streaming: best VMs

I was hoping to ask whether anyone has found the best VM type to use for Databricks clusters when running Spark streaming.

I was testing the Fv2 series (F32_v2), but I found that most of the jobs have an issue with memory spill. With that said, would it make sense to use more memory-optimized clusters, or to add more compute VMs?

We are looking at how we can improve the code, but as a general rule, have you found that some VM types work better with streaming jobs and others do not work well (for example, the L-series vs. E-series vs. F-series)?

Thank you in advance.

It might depend on your use case. If you need more parallel processing, say because the message queue you pull data from has many partitions, you can go for compute-optimized nodes and have more cores running in parallel, pulling data from the message queue. If you feel your workload is memory intensive, you can go for memory-optimized VMs.
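To make that trade-off a bit more concrete, here is a rough sketch in plain Python that compares memory per core across a few candidate sizes and estimates how many workers you would need to give each source partition its own core. The VM figures are my assumption from the published Azure sizes (please verify them against the current Azure docs), and `VmSpec`, `workers_needed` and `rank_by_memory_per_core` are hypothetical helper names, not part of any Databricks or Spark API. It also assumes the Spark default of one task per core, and it ignores executor overhead, so treat the numbers as a starting point only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSpec:
    """Hypothetical container for a VM size (not a Databricks/Azure API)."""
    name: str
    vcpus: int
    memory_gib: float

    @property
    def gib_per_vcpu(self) -> float:
        # With one task per core, this approximates the upper bound of
        # memory available to each concurrently running task.
        return self.memory_gib / self.vcpus


# ASSUMPTION: approximate published Azure sizes; verify before relying on them.
CANDIDATES = [
    VmSpec("Standard_F32s_v2", vcpus=32, memory_gib=64),   # compute optimized
    VmSpec("Standard_E32s_v3", vcpus=32, memory_gib=256),  # memory optimized
    VmSpec("Standard_L32s_v2", vcpus=32, memory_gib=256),  # storage optimized
]


def workers_needed(source_partitions: int, vm: VmSpec) -> int:
    """Workers required so every source partition can be read by its own core."""
    return -(-source_partitions // vm.vcpus)  # ceiling division


def rank_by_memory_per_core(vms):
    """Sort VM sizes from most to least memory per vCPU."""
    return sorted(vms, key=lambda vm: vm.gib_per_vcpu, reverse=True)


if __name__ == "__main__":
    partitions = 96  # hypothetical number of partitions on the message queue
    for vm in rank_by_memory_per_core(CANDIDATES):
        print(
            f"{vm.name}: {vm.gib_per_vcpu:.1f} GiB per vCPU, "
            f"{workers_needed(partitions, vm)} worker(s) for {partitions} partitions"
        )
```

Under these assumptions the F-series offers far less memory per core than the E- or L-series for the same core count, which would be consistent with the spill you are seeing; whether more memory per core or more nodes is the better fix still depends on the job itself.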

This page has details of the benchmarking tests conducted on Databricks, and it might help you get a fair idea: https://www.databricks.com/blog/2017/10/11/benchmarking-structured-streaming-on-databricks-runtime-against-state-of-the-art-streaming-systems.html

GitHub repo with .dbc files for benchmarking: https://github.com/databricks/benchmarks
