Twitter API with Structured Spark Streaming

I am trying to access the JSON data from tweets in my Kafka topic. When creating the schema in Spark Structured Streaming, is it necessary to explicitly specify every key from the Twitter API? Can I not access only the fields I want to analyse, such as the text field alone?

A schema is recommended but not required. You should be able to do something like this:

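import org.apache.spark.sql.functions.{col, get_json_object}

kafkaDf
    .select(col("value").cast("string").as("value"))  // Kafka delivers the value as bytes
    .select(get_json_object(col("value"), "$.text").as("text"))  // extract only the text field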

https://spark.apache.org/docs/latest/api/sql/index.html#get_json_object
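
If you would rather keep a schema, it does not have to cover the whole tweet: from_json only fills in the fields the schema declares and ignores the rest of the JSON. The following is a minimal sketch under some assumptions, not a definitive implementation. The object name TweetTextOnly, the helper extractText, and the sample tweet are hypothetical, and a small static DataFrame stands in for the Kafka source so the example runs locally.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object TweetTextOnly {
  // Partial schema: only the field we want to analyse is declared.
  val partialSchema: StructType = StructType(Seq(StructField("text", StringType)))

  // Hypothetical helper: expects a DataFrame with a "value" column holding the tweet JSON.
  def extractText(df: DataFrame): DataFrame =
    df.select(from_json(col("value").cast("string"), partialSchema).as("tweet"))
      .select(col("tweet.text").as("text"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("tweet-text-only")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the Kafka source; a real job would use spark.readStream.format("kafka").
    val kafkaDf = Seq(
      """{"id": 1, "text": "hello spark", "user": {"screen_name": "a"}}"""
    ).toDF("value")

    extractText(kafkaDf).show(false)
    spark.stop()
  }
}
```

The same extractText call should work unchanged on a streaming DataFrame read from Kafka, since it only uses column expressions. Compared with get_json_object, the partial schema gives the column a declared type and makes it easy to add more fields later.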
