
How to run sc.textFile() inside a foreach loop and do a union on the results?

I'm new to Spark, and this is what I have been trying.

I'm trying to execute this code:

val ii=sc.parallelize(Seq(("e.txt"),("r.txt"))).foreach{i => sc.textFile(i)}

but I'm getting a NullPointerException.

Thanks!

You can pass multiple files to sc.textFile in a single call. You should not use sc inside a function that runs on an RDD, such as foreach or map. That function is shipped to the executors, while sc lives only in the driver and is not available there. That is why your code throws a NullPointerException.

a.txt contents:

a.txt:line1
a.txt:line2

b.txt contents:

b.txt:line1
b.txt:line2

sc.textFile accepts a comma-separated list of paths, so you can read several files in one call:

scala> sc.textFile("a.txt,b.txt").collect()
res1: Array[String] = Array(a.txt:line1, a.txt:line2, b.txt:line1, b.txt:line2)
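
If you do want one RDD per file followed by an explicit union, as the question title asks, the loop has to run on the driver over an ordinary Scala collection rather than over an RDD. The following is only a minimal sketch of that idea: the object name UnionTextFiles and the helper readAll are made up for illustration, the local[*] master is an assumption for running it on one machine, and e.txt and r.txt are the file names from the question, assumed to exist in the working directory.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of Spark.
object UnionTextFiles {
  // paths is a plain Seq on the driver, so calling sc.textFile here is safe.
  // Each call only defines an RDD; no data is read until an action runs.
  def readAll(sc: SparkContext, paths: Seq[String]): RDD[String] = {
    val perFile: Seq[RDD[String]] = paths.map(p => sc.textFile(p))
    // SparkContext.union combines all the per-file RDDs in a single step.
    sc.union(perFile)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for trying this out on a single machine.
    val conf = new SparkConf().setAppName("union-text-files").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val lines = readAll(sc, Seq("e.txt", "r.txt"))
      lines.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

In spark-shell, where sc already exists, the body of readAll is all you need: sc.union(Seq("e.txt", "r.txt").map(p => sc.textFile(p))).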

Hope this helps and have fun with Spark!

