I have a file with a column called priority, among other columns, that contains numbers, e.g. 1, 2, 3, 4, 5, 6, etc. The file data is as follows: I want ...
I have an employee Hive table with columns Name, Department, and City, and I want to retrieve data based on employee names using the IN operator i ...
I would like to know where the hive-site.xml configuration file is located in a Cloudera distribution. Mainly because I would like to know where I can find o ...
I can read a table defined in the Glue Data Catalog from a Glue job with the glueContext. However, if I want to read the exact same table with hive ...
I am running PySpark on my PC (Windows 10), but I cannot import HiveContext: How should I proceed to resolve it? ...
I am trying to list all the databases using HiveContext in Spark 1.6, but it's giving me only the default database. from pyspark import SparkContext fr ...
I'm trying to run a spark-scala Self-Contained App in Oozie. Please note that I'm using CDH5.13 Quickstart VM with 20G of RAM (containing Cloudera Man ...
I have an external Hive table stored as Parquet, partitioned on a column, say as_of_dt, and data gets inserted via Spark Streaming. Now every day new parti ...
I am new to Spark and I am trying to join two Hive tables from Scala code: However, for the above join I got an error: Is it a right way ...
I am using HiveContext to query a Hive table on an HDFS cluster remotely through Spark 1.6.0 and am able to do so successfully. However, when doing so ...
I am able to create a Hive context programmatically on Spark 1.6.0 using: This is working fine for me. In the same way, I want to create a hive co ...
I tried to create a function that would get data from a relational database and insert it into a Hive table. Since I use Spark 1.6, I need to regis ...
I have a dataset of 10 petabytes. My current data is in HBase, where I am using Spark HBaseContext, but it is not performing well. Will it be us ...
I'm trying to use an instance of HiveContext in a Spark Streaming application (1.6), but it fails with the following exception: java.lang.NullPoin ...
I am not able to run Hive queries using the spark-submit command, but the same queries execute in spark-shell. I am using AWS EMR as the cluster. Belo ...
I'm facing the following problem: The problem here is the persistence of HiveContext (i.e., if I do hctx._get_hive_ctx() it returns JavaObject id=Id) ...
spark-shell --packages com.databricks:spark-csv_2.11:1.2.0 1. Using SQLContext: 1. import org.apache.spark.sql.SQLContext 2. val s ...
I am having trouble running a SQL statement that loads data into a partitioned table in a Hive context. I set dynamic partition = true, but I am still having issues. ...
We are reading data from a Hive table with hiveContext using a Spark DataFrame. After doing some aggregations on the data we store this data into anot ...
Below is my hive/conf/hive-site.xml: I want to access existing Hive databases and tables using Spark's HiveContext, so I added the lines below to hive/conf/ ...