Convert spark dataframe to Array[String]

Can anyone tell me how to convert a Spark DataFrame into an Array[String] in Scala?

I have used the following.

val x = df.select(columns.head, columns.tail: _*).collect()

The above snippet gives me an Array[Row], not an Array[String].


df.select(columns: _*).collect.map(_.toSeq)

DataFrame to Array[String]


You can also use the following


If you have more columns, the last approach is the better choice.


If you are planning to read the dataset line by line, then you can use the iterator over the dataset:

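Dataset<Row> csv = session.read().format("csv").option("sep", ",").option("inferSchema", true).option("escape", "\"").option("header", true).option("multiline", true).load("users/abc/....");

// toLocalIterator() brings rows to the driver one partition at a time.
for (Iterator<Row> iter = csv.toLocalIterator(); iter.hasNext();) {
    // Note: Row.toString() wraps the row in [ ], so the first and last items keep a bracket.
    String[] item = iter.next().toString().split(",");
}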

The answer was provided by a user named cricket_007. You can use the following to convert an Array[Row] to an Array[String]:

val x = df.select(columns.head, columns.tail: _*).collect().map { row => row.toString() }

Thanks, Bharath
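
As a rough sketch of the conversion discussed above: Row.toString wraps each row in square brackets, so if that is not wanted, Row.mkString or a single-column getString may be closer to what the question asks for. The object name RowsToStrings, the helper names rowsAsStrings and columnAsStrings, and the sample data are made up for illustration and do not come from the original answers. The example assumes a local SparkSession is available.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, for illustration only.
object RowsToStrings {

  // One string per row, fields joined by `sep` (no surrounding brackets,
  // unlike Row.toString).
  def rowsAsStrings(df: DataFrame, sep: String = ","): Array[String] =
    df.collect().map(_.mkString(sep))

  // One string per row taken from a single column; the column is cast to
  // string first so that getString(0) is safe for non-string types.
  def columnAsStrings(df: DataFrame, column: String): Array[String] =
    df.select(df(column).cast("string")).collect().map(_.getString(0))

  def main(args: Array[String]): Unit = {
    // Assumption: running locally; adjust master/appName for a real cluster.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("rows-to-strings")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data.
    val df = Seq(("alice", 30), ("bob", 25)).toDF("name", "age")

    rowsAsStrings(df).foreach(println)
    columnAsStrings(df, "name").foreach(println)

    spark.stop()
  }
}
```

Note that collect() pulls every row to the driver, so this only suits DataFrames that fit in driver memory.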
