I am trying to join two streaming datasets in Spark Structured Streaming. The data structures are as follows: Table: CardHolder CardNo ...
I'm trying to read in data from Kafka using Structured Streaming, but the program doesn't seem to be getting any of it. This code doesn't print any r ...
I am trying to load a Kafka topic into a Spark DataFrame, so the code is as follows: I'm trying to execute the code by using spark-submit: spark-submit ...
Suppose we have an application that reads from a topic with X partitions, does some filtering on the data, then saves it into storage (no complex shuffling lo ...
I am using Auto Loader in Databricks. However, when I save the stream as a Delta table, the generated table is NOT Delta. Why is the generated tabl ...
I want to use the restart policy Always. When my Spark streaming app fails, it should restart automatically. I have tried setting the policy in podTemplate b ...
I have a PySpark streaming pipeline which reads data from a Kafka topic; the data goes through various transformations and finally gets merged into a da ...
In my Spark application (Java), I am trying to read the incoming JSON data sent through the socket. The data is in string format, e.g. "{"deviceId": "1", ...
I have a use case to download content from an HTTP source and ingest it into HDFS using Python. The data available in the source is not live data. It ha ...
I need to know how, in a read stream, I can start reading files from a specific folder. In my storage account, data goes back to 2019 in yyyymmdd format, I n ...
I am using the Spark Streaming Scala code below to consume real-time Kafka messages from a producer topic. But the issue is that sometimes my job fails due ...
I want to be able to read data from a Kafka topic, group it by a column and aggregate/reduce the sum of a given column. If the timestamp from message ...
I have CSV data coming as DStreams from traffic counters. A sample is as follows: I want to calculate the average speed (for each location) by vehicle cat ...
I have to compute a cumulative sum on a value column by group from the beginning of the time series with a daily output. If I do it with a batch, it sho ...
I'm trying to stream data using Apache Kafka and Spark, but I get an error in line 24 of my code saying "Cannot resolve method "createStream" in "Kafk ...
For a particular use case we are using Spark Structured Streaming, but the process is neither efficient nor stable. The stateful aggregation operation is the ...
I'm trying to use Kafka ByteArrayDeserializer to read Avro records from a Kafka topic, but I am getting the exception below. My Code: Any help is appreti ...
I have been reading this article - https://www.databricks.com/session_na20/native-support-of-prometheus-monitoring-in-apache-spark-3-0 and it has been ...
I have a Spark streaming query running in Databricks. While loading data from a Kafka topic to Delta Lake, the cell output while running displays "Com ...
I have a Spark process that processes about a million signals per job and joins those rows with a giant table (5 billion rows). The entire table in me ...