
Does Spark handle resource management?

I'm new to Apache Spark and I started learning Scala along with Spark. In this code snippet, does Spark handle closing the text file when the program is done with it?

val rdd = context.textFile(filePath)

I know that in Java, when you open a file you have to close it with a try-catch-finally or try-with-resources.

In this example I mention a text file, but I want to know whether Spark handles closing resources when it is done with them, since RDDs can be backed by many different kinds of data sets.

context.textFile() doesn't actually open the file; it just creates an RDD object. You can verify this experimentally by creating a textFile RDD for a file which doesn't exist: no error will be thrown. The file referenced by the RDD is only opened, read, and closed when you call an action, which causes Spark to run the IO and the data transformations needed to produce the result you asked for.
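Here is a minimal sketch of that experiment, assuming a local Spark setup; the app name, master URL, and file path are illustrative:

    import org.apache.spark.{SparkConf, SparkContext}

    object LazyFileDemo {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("LazyFileDemo").setMaster("local[*]")
        val sc = new SparkContext(conf)

        // Creating the RDD succeeds even though the path does not exist:
        // textFile() only records the lineage and performs no IO yet.
        val rdd = sc.textFile("/path/that/does/not/exist.txt")

        // The file is only touched when an action forces evaluation.
        // Here count() triggers the read and fails because the path is missing.
        try {
          println(rdd.count())
        } catch {
          case e: Exception => println(s"Action failed as expected: ${e.getMessage}")
        }

        sc.stop()
      }
    }

Because each action opens, reads, and releases the underlying input itself, there is no file handle for your code to close, so the Java try-with-resources pattern doesn't apply here.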
