
How to remove elements of a CSV file using an RDD in Scala?

val textRDD = sc.textFile("file:/home/bharathi/bhaskar/sample.tab")

The file sample.tab contains these values:

A   B   C   D
1   2   3   4
5   6   7   8
9   10  11  12

I need to delete the second row and print the remaining rows.

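Assuming the row you want to remove is 5 6 7 8 (the second data row, after the header) and there are no blank lines between rows, you can use zipWithIndex to assign an index to each row and then filter out the unwanted row by its index. The indices are zero-based, so the header A B C D gets index 0 and the row 5 6 7 8 gets index 2.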

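// collect() brings the remaining rows to the driver, so they print there in their original order
textRDD.zipWithIndex.filter(_._2 != 2).map(_._1).collect().foreach(println)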

It will print

A B C D
1 2 3 4
9 10 11 12
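
To check which index each row receives without starting Spark, the same zipWithIndex / filter / map chain can be tried on a plain Scala Seq. This is only a minimal sketch: the object DropRowSketch, the helper dropRow and the in-memory sample rows are hypothetical names made up for illustration, and a Seq stands in for the RDD.

```scala
// Plain Scala collections stand in for the RDD here; no Spark is needed to run this.
object DropRowSketch {
  // Hypothetical helper: drop the line at the given zero-based index.
  def dropRow(lines: Seq[String], index: Int): Seq[String] =
    lines.zipWithIndex
      .filter { case (_, i) => i != index }  // keep every row except the unwanted one
      .map { case (line, _) => line }        // discard the index again

  // In-memory copy of the sample rows (assumed here instead of reading sample.tab).
  val sample: Seq[String] = Seq("A B C D", "1 2 3 4", "5 6 7 8", "9 10 11 12")

  def main(args: Array[String]): Unit = {
    // Show the index each row receives: the header is 0, "5 6 7 8" is 2.
    sample.zipWithIndex.foreach { case (line, i) => println(s"$i: $line") }
    // Drop index 2 and print what remains.
    dropRow(sample, 2).foreach(println)
  }
}
```

On an RDD the chain has the same shape, except that zipWithIndex produces Long indices rather than Int; a comparison such as `_._2 != 2` still compiles because the Int literal is widened to Long.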
