
Spark - Scala - Join RDDS (csv) files

Original 2015-09-19 18:13:33 4 1 scala/ csv/ apache-spark

I'm new to Scala and still in the early stages of learning it. A requirement has come up, and I need to know how to join two datasets on two fields, as in a relational database.

Example:

Table 1 ( csv )

zip_type, primary_city, acceptable_cities, unacceptable_cities

Example:

Table 2 ( csv )

GEO.id, GEO.id2, GEO.display-label, VD01

Question:

I want to join column 1 (zip_type) of Table 1 with column 2 (GEO.id2) of Table 2.

Currently I:

Created an RDD from my CSV file
Processed each line using the CSV parser, but I am having a little trouble making the join.

What do I need to do next?

1 answer

To join, you need pair RDDs that share the same key. Consider transforming RDD-1 into an RDD of (K, V) tuples with zip_type as the key, and similarly RDD-2 with GEO.id2 as the key.
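As a rough illustration of that idea, here is a minimal sketch, not a definitive implementation. It assumes Spark 1.3 or later (where the pair-RDD operations are picked up implicitly), a local master, and a naive comma split with no quoted fields. The object name `JoinCsvSketch`, the helpers `fields` and `keyByColumn`, and the sample rows are all made up for the example; with real files you would read them with `sc.textFile(...)`, drop the header line, and use your CSV parser instead of `split`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object JoinCsvSketch {
  // Hypothetical helper: naive CSV split (assumes no quoted commas).
  // The -1 limit keeps trailing empty fields.
  def fields(line: String): Array[String] = line.split(",", -1).map(_.trim)

  // Hypothetical helper: turn each line into (key, allColumns),
  // where the key is the column at the given zero-based index.
  def keyByColumn(lines: RDD[String], index: Int): RDD[(String, Array[String])] =
    lines.map(fields)
      .filter(_.length > index) // skip rows too short to contain the key
      .map(cols => (cols(index), cols))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("join-csv-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up rows standing in for the two CSV files (headers already removed).
      val table1 = sc.parallelize(Seq("k1,cityA,,", "k2,cityB,,"))
      val table2 = sc.parallelize(Seq("g1,k1,labelA,10", "g2,k3,labelC,30"))

      val byZipType = keyByColumn(table1, 0) // column 1: zip_type
      val byGeoId2  = keyByColumn(table2, 1) // column 2: GEO.id2

      // Inner join: only keys present in both RDDs survive.
      val joined: RDD[(String, (Array[String], Array[String]))] =
        byZipType.join(byGeoId2)

      joined.collect().foreach { case (key, (left, right)) =>
        println(s"$key -> ${left.mkString("|")} / ${right.mkString("|")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

`join` is an inner join, so rows without a match on the other side are dropped. If you need to keep them, `leftOuterJoin` or `fullOuterJoin` on the same pair RDDs would be the thing to try.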

Join on two RDDs using Scala in Spark

Creating RDDs and outputting to text files with Scala and Spark

Compilation issue in spark scala script containing join on RDDs with 2 columns

Merge two RDDs in Spark Scala

Spark Scala - textFile() and sequenceFile() RDDs

Scala Spark RDDs, DataSet, PairRDDs and Partitoning

scala.MatchError: null on spark RDDs

combining two RDDs by values in scala spark

Merging RDDs using Scala Apache Spark

Joining two (paired) RDDs in Scala, Spark




solution1 0 ACCEPTED 2015-09-19 18:51:28
