
Update a mutable List within a foreach loop (Scala/Spark)

I need to update a mutable list with the contents of a directory in HDFS. I have the following code, which works in spark-shell but not inside a script:

import org.apache.hadoop.fs._
import org.apache.spark.deploy.SparkHadoopUtil

val listOfFiles = scala.collection.mutable.ListBuffer[String]()

val hdfs_conf = SparkHadoopUtil.get.newConfiguration(sc.getConf)
val hdfs = FileSystem.get(hdfs_conf)
val sourcePath = new Path(filePath)

hdfs.globStatus(sourcePath).foreach { fileStatus =>
  val filePathName = fileStatus.getPath().toString()
  val fileName = fileStatus.getPath().getName()
  listOfFiles.append(fileName)
}

listOfFiles.tail

Any help? When I run it, it throws an exception saying that listOfFiles is empty.

You should avoid using mutable collections.

Try:

val listOfFiles = hdfs.globStatus(sourcePath).map { fileStatus =>
  fileStatus.getPath().getName()
}
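A fuller sketch of this approach, as a standalone snippet. Note two assumptions not in the original: the `filePath` glob pattern shown here is a placeholder, and the configuration is built directly rather than via `SparkHadoopUtil` (which works outside a Spark session too). `globStatus` can return `null` when the parent of the pattern does not exist, and calling `.tail` on an empty list throws, which is the likely source of the "empty" exception, so both cases are guarded:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical glob pattern; replace with your own filePath.
val filePath = "/data/input/*.csv"

val hdfs = FileSystem.get(new Configuration())

// globStatus returns an Array[FileStatus] (or null if the parent
// directory of the pattern is missing), so map it straight to the
// file names instead of appending to a mutable buffer.
val listOfFiles: Seq[String] =
  Option(hdfs.globStatus(new Path(filePath)))
    .map(_.map(_.getPath.getName).toSeq)
    .getOrElse(Seq.empty)

// drop(1) and headOption are safe on an empty Seq, unlike .tail.
listOfFiles.drop(1).foreach(println)
```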
