Let's say we have a column with a number that increases a bit on a daily basis, but we cannot predict the increase with good precision. For example (the ...
I'm using pydeequ with Spark 3.0.1 to perform some constraint checks on data. As for testing with the VerificationSuite, after calling VerificationRe ...
I'm trying to run the sample code for the pattern check "hasPattern()" with PyDeequ and it fails with an exception. The code: After running it I receive: on l ...
I have a case class: And I have a function that returns a collection of the above case class objects as a Seq. I am adding the obj ...
I am using a library written by Amazon in Scala (here). The trait goes like this: I am trying to make a case object to store some informati ...
I am using pydeequ to run some checks on data; however, it is not behaving as expected. One of my columns should contain values between 0 and 1. Th ...
Spark version: 3.0.1. Amazon Deequ version: deequ-2.0.0-spark-3.1.jar. I'm running the code below in the Spark shell on my local machine: ERROR: Can someon ...
I'm using Deequ to write an analyzer. My editor is showing me a warning and I'm not sure how to fix it. On this line: I get this warning ...
How do I configure the environment to submit a PyDeequ job to Spark/YARN (client mode) from a Jupyter notebook? There is no comprehensive explanation ...
I have just started with pydeequ and I want to create checks for a Spark DataFrame that has ~1800 features. Now to know which checks I must perform, I d ...
I'm using PyDeequ for data quality and I want to check the uniqueness of a set of columns. There is a Check method hasUniqueness but I can't figure ho ...
So, I'm using Amazon Deequ in Spark, and I have a dataframe df with a column publish_date which is of type DateType. I simply want to check the follow ...
I am currently importing the dataset from an Excel sheet that has a column name containing a dot character, like this: "abc.xyz". I went through a couple of ...
So, I ran a simple Deequ check in Spark that went something like this: Now, my result1 dataframe looks something like this: I'm confused betwe ...
So, I'm using Amazon Deequ in spark, and I have a dataframe 'df' with two columns being of type 'Long' or numeric. I simply want to check: value(colu ...
I am using Deequ on AWS Glue. Surprisingly, when I run hasMaxLength, which is listed under Checks for the VerificationSuite, I get the follow ...
We have Spark dataframes partitioned on multiple columns. For example, we have a partner column that can be Google, Facebook, and Bing. And we have a ...
I am trying to run and test the Amazon Deequ library locally but am repeatedly getting a class-not-found error for various examples. Exact error or ...
I want to introduce data quality testing (empty fields, max/min values, regex, etc.) into my pipeline, which will essentially consume Kafka topics testi ...
I am working on AWS Glue and leveraging the PySpark API for my ETL. I believe that if I need to use Amazon Deequ I need to switch to Scala. However, I still wan ...