Twitter API with Structured Spark Streaming

Question

I am trying to access the JSON data from tweets in my Kafka topic. When creating the schema in Spark Structured Streaming, is it necessary to explicitly specify every key from the Twitter API? Can I not access only the ones I want to analyse, such as the text field alone?

Answer 1

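A schema is recommended but not required. To read a single field such as text, you should be able to cast the Kafka value to a string and extract the field by its JSON path, something like this: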

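import org.apache.spark.sql.functions.{col, get_json_object}

kafkaDf
    .select(col("value").cast("string").as("value")) // Kafka delivers the value as binary
    .select(get_json_object(col("value"), "$.text").as("text")) // keep only the text field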

https://spark.apache.org/docs/latest/api/sql/index.html#get_json_object
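
If you would rather keep a schema, it does not have to cover every key either: from_json only fills the fields you declare and ignores the rest of the JSON. Below is a minimal sketch of that approach, assuming kafkaDf is the DataFrame you get from the Kafka source and that the tweet JSON has a top-level text key. The names TweetText, tweetSchema and extractText are made up for illustration.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object TweetText {
  // Partial schema: declare only the field to analyse.
  // Keys present in the JSON but absent here are ignored by from_json.
  val tweetSchema: StructType = StructType(Seq(
    StructField("text", StringType, nullable = true)
  ))

  // kafkaDf is assumed to have the usual Kafka source column "value".
  def extractText(kafkaDf: DataFrame): DataFrame =
    kafkaDf
      .select(col("value").cast("string").as("json"))          // binary -> JSON string
      .select(from_json(col("json"), tweetSchema).as("tweet")) // parse with the partial schema
      .select(col("tweet.text").as("text"))                    // flatten to a single column
}
```

To analyse more fields later, add them to tweetSchema as further StructField entries; nothing else in the pipeline needs to change.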
