I am connecting to a Kafka topic with Spark, creating a DataFrame, and then storing it in Hudi, but I am getting the following exception: To ...
I have multiple Hudi tables with differing column names, and I built a view on top of them to standardize the column names. When this view is read from A ...
I currently have a DynamoDB stream configured that feeds records into Kinesis Data Streams whenever an insert or update happens, and subsequent ...
Can we run the Upsert and Delete write operations at the same time on the same table? Will the Apache Hudi metadata get corrupted? Please help here to do the sa ...
I'm trying out Apache Hudi with Spark using a very simple demo: There are about 10 Parquet files in the directory; their total size is 1 GB, about 6 millio ...
Technical background: I am reading table data from Kafka and writing it into Hudi and Hive tables using Spark. I am using AWS EMR. I want to encrypt ...
I am pushing some initial bulk data into a Hudi table, and then every day I write incremental data into it. But if backdated data arrives, then the latest ...
is Dataproc-noob again. My main goal is to ingest the tables from on-premise sources, store them as Parquet files in a Cloud Storage bucket and crea ...
Is there a guide for deploying Apache Hudi on a Dataproc cluster? I'm trying to deploy it by following the Hudi Quick Start Guide, but I can't. Spark 3.1.1 Python 3.8. ...
By default, Hudi bases its ingestion timeline on the current time. I want to change this behavior and use my own datetime field during ingestion. I want t ...
The Problem: I'm trying to write a Hudi table into a MinIO S3 bucket with Flink SQL, but it fails. The Hudi table is created, but it only contains metadata ...
I checked the official documentation, and there are no samples for inserting complex types like struct and map. So, what's the syntax? My table definition: s ...
For every update in SQL Server, Debezium generates an event payload with 'after' and 'before'. I want to get rid of 'before' without flattening the paylo ...
I'm new to Hudi and I have a problem. I'm working on AWS EMR with PySpark and Kafka, and what I want to do is read a topic from the Kafka cl ...
We have an AWS Glue job that is attempting to read data from an Athena table that is populated by Hudi. Unfortunately, we are running into an er ...
I'm trying to run incremental, snapshot, and time travel queries using spark-sql with Hudi, but the only way that I can find to do this is creating a D ...
I have set up Glue interactive sessions locally by following https://docs.aws.amazon.com/glue/latest/dg/interactive-sessions.html However, I am not abl ...
I am working on a Flink streaming job where I need to upsert data into a Hudi table. I am using a MERGE INTO query to upsert data into the Hudi table. ...
Problem Statement: Apache Spark has no upsert-to-database feature; instead, we have to overwrite the entire table. But Apache Hudi can be used ...