For example, I have the following table named "example":
name | age | address
'abc' | 12 | {'street':'1', 'city':'kl', 'country':'malaysia'}
'cab' | 15 | {'street':'5', 'city':'jakarta', 'country':'indonesia'}
In Spark I can do this:
scala> val test = sc.cassandraTable ("test","example")
and this:
scala> test.first.getString("name")
and this:
scala> test.first.getMap[String, String]("address")
which gives me all the fields of the address in the form of a map
Question 1: How do I use "get" to access the "city" information?
Question 2: Is there a way to flatten the entire table?
Question 3: How do I count the number of rows where "city" = "kl"?
Thanks
I'll answer 3 first, because it may give you an easier way to work with the data. Something like:
sc.cassandraTable[(String,Map[String,String],Int)]("test","example")
.filter( _._2.getOrElse("city","NoCity") == "kl" )
.count
First, I use the type parameter [(String,Map[String,String],Int)] on my cassandraTable call to transform the rows into tuples. This gives me easy access to the Map without any casting. (The order is just how the columns appeared when I created the table in my test environment; you may have to change the ordering.)
Second, I filter on _._2, which is shorthand for the second element of the incoming tuple. getOrElse returns the value for the key "city" if the key exists and "NoCity" otherwise. The final equality check compares that value with "kl".
Finally, I call count to find out the number of rows whose city is "kl".
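The same filter-and-count logic can be tried without a cluster. The following is only a sketch on a local Scala collection standing in for the RDD; the object name CityCount and the sample rows (copied from the question's table, in the tuple order used above) are assumptions for illustration, not output of the connector.

```scala
object CityCount {
  // Local stand-in for the RDD of (name, address, age) tuples.
  val rows: Seq[(String, Map[String, String], Int)] = Seq(
    ("abc", Map("street" -> "1", "city" -> "kl", "country" -> "malaysia"), 12),
    ("cab", Map("street" -> "5", "city" -> "jakarta", "country" -> "indonesia"), 15)
  )

  // Same predicate as the Spark snippet: rows without a "city" key fall back to "NoCity".
  val klCount: Int = rows.count(_._2.getOrElse("city", "NoCity") == "kl")

  def main(args: Array[String]): Unit = println(klCount)
}
```

Note that count on an RDD returns a Long, while count with a predicate on a local Seq returns an Int.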
So the answer to 1 is that once you have a Map, you can call get("key"), getOrElse("key", default), or any of the other standard Scala Map operations to get a value out of it.
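As a small illustration of those Map operations, here is a sketch using plain Scala; the object name MapAccess, the "zip" key, and the default strings are hypothetical, and the address value is copied from the first row of the question's table.

```scala
object MapAccess {
  // Hypothetical address value matching the first row of the example table.
  val address: Map[String, String] =
    Map("street" -> "1", "city" -> "kl", "country" -> "malaysia")

  // get returns an Option: Some(value) when the key exists, None otherwise.
  val cityOpt: Option[String] = address.get("city")

  // getOrElse unwraps the value, using the supplied default when the key is missing.
  val city: String = address.getOrElse("city", "NoCity")
  val zip: String = address.getOrElse("zip", "NoZip")

  def main(args: Array[String]): Unit = println(s"$cityOpt $city $zip")
}
```

Calling address("city") directly also works, but it throws a NoSuchElementException when the key is absent, so get or getOrElse is the safer choice when some rows may lack the key.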
What "flatten" means here can vary. For example, if you want to return the entire table as an array to the driver, you can call collect. This is not recommended, since an RDD will usually be very large in production.
If you want to flatten the elements of your map into tuples, you can always call toSeq on it, and you will end up with a list of (key,value) tuples. Feel free to ask another question if I haven't covered what you meant by "flattening."
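Two possible readings of "flattening" can be sketched as follows. This is a local-collection sketch, not connector code: the object name FlattenRows, the sample rows, and the empty-string default for missing keys are assumptions, and the same map and flatMap calls exist on an RDD.

```scala
object FlattenRows {
  // Local stand-in for the RDD of (name, address, age) tuples.
  val rows: Seq[(String, Map[String, String], Int)] = Seq(
    ("abc", Map("street" -> "1", "city" -> "kl", "country" -> "malaysia"), 12),
    ("cab", Map("street" -> "5", "city" -> "jakarta", "country" -> "indonesia"), 15)
  )

  // Reading 1: one (name, age, key, value) tuple per map entry, via toSeq and flatMap.
  val perEntry: Seq[(String, Int, String, String)] =
    rows.flatMap { case (name, address, age) =>
      address.toSeq.map { case (key, value) => (name, age, key, value) }
    }

  // Reading 2: one flat tuple per row, pulling known keys out of the map.
  // Missing keys become "" here; that default is an arbitrary choice.
  val perRow: Seq[(String, Int, String, String, String)] =
    rows.map { case (name, address, age) =>
      (name, age,
        address.getOrElse("street", ""),
        address.getOrElse("city", ""),
        address.getOrElse("country", ""))
    }

  def main(args: Array[String]): Unit = {
    perEntry.foreach(println)
    perRow.foreach(println)
  }
}
```

The first form works for maps with arbitrary keys; the second needs the key names in advance but gives each row a fixed shape.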