Pig: get data from a Hive table and add the partition as a column

I have a partitioned Hive table that I want to load in a Pig script, and I would also like the partition to be available as a column.

How can I do that?

Table definition in Hive:

CREATE EXTERNAL TABLE IF NOT EXISTS transactions
(
column1 string,
column2 string
)
PARTITIONED BY (datestamp string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/path';

Pig script:

%default INPUT_PATH '/path'

A = LOAD '$INPUT_PATH'
         USING PigStorage('|')
         AS (
         column1:chararray, 
         column2:chararray,
         datestamp:chararray  
         );

The datestamp column is not populated. Why is that?

Sorry, I didn't follow the part about adding the partition as a column. Once a partitioned table is created, its partition keys behave like regular columns. What exactly do you need?

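Also, you are loading the data directly from an HDFS location, not as a Hive table. Hive normally keeps a partition's value in the directory name (a subdirectory named datestamp=<value> under the table location) rather than in the data files themselves, so PigStorage finds nothing to put into datestamp. If you intend to use Pig to load data from, or store data into, a Hive table, you should use HCatalog.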

For example:

A = LOAD 'transactions' USING org.apache.hcatalog.pig.HCatLoader();
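
A slightly fuller sketch of the HCatalog approach, showing the partition key used like any other column. It assumes Pig is started with HCatalog support (for instance with pig -useHCatalog) and that the table is in the default database. The datestamp value '20140101' and the aliases B and C are placeholders invented for illustration. On newer Hive releases the loader class is packaged as org.apache.hive.hcatalog.pig.HCatLoader rather than org.apache.hcatalog.pig.HCatLoader, so use whichever matches your installation.

```pig
-- Load through HCatalog so the schema, including the partition key,
-- comes from the Hive metastore. No AS clause is needed here:
-- HCatLoader supplies column1, column2 and datestamp itself.
A = LOAD 'transactions' USING org.apache.hcatalog.pig.HCatLoader();

-- datestamp is now an ordinary chararray field.
-- '20140101' is a placeholder value, not taken from the question.
B = FILTER A BY datestamp == '20140101';

-- Project the partition key alongside the data columns.
C = FOREACH B GENERATE column1, column2, datestamp;

-- Print the resulting schema to confirm datestamp is present.
DESCRIBE C;
```

Because the loader takes the storage details from the table definition, the field delimiter should not need to be repeated in the script. That also sidesteps the mismatch between the '\t' in the CREATE TABLE statement and the '|' passed to PigStorage in the question.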
