Unable to load .csv data from hdfs into Hive table in Hadoop

I am trying to load csv files into a Hive table. I need to have it done through HDFS.

My end goal is to have the hive table also connected to Impala tables, which I can then load into Power BI, but I am having trouble getting the Hive tables to populate.

I create a table in the Hive query editor using the following code:

CREATE TABLE IF NOT EXISTS dbname.table_name (
    time_stamp TIMESTAMP COMMENT 'time_stamp',
    attribute STRING COMMENT 'attribute',
    value DOUBLE COMMENT 'value',
    vehicle STRING COMMENT 'vehicle',
    filename STRING COMMENT 'filename');

Then I check and see the LOCATION using the following code:

SHOW CREATE TABLE dbname.table_name;

and find that it has gone to the default location: hdfs://our_company/user/hive/warehouse/dbname.db/table_name

So I go to the above location in HDFS and manually upload a few csv files, which are in the same five-column format as the table I created. At this point I expect the data to be loaded into the Hive table, but when I go back to dbname in Hive and open the table I made, all values are still null, and when I try to open it in the browser I get:

DB Error AnalysisException: Could not resolve path: 'dbname.table_name'

Then I try the following code:

LOAD DATA INPATH 'hdfs://our_company/user/hive/warehouse/dbname.db/table_name' INTO TABLE dbname.table_name;

It runs fine, but the table in Hive still does not populate.

I also tried all of the above using CREATE EXTERNAL TABLE instead, specifying the HDFS path in the LOCATION argument. I also tried making an HDFS location first, uploading the csv files, and then running CREATE EXTERNAL TABLE with the LOCATION argument pointed at the pre-made HDFS location.

I already made sure I have authorization privileges.

My table will not populate with the csv files, no matter which method I try.

What am I doing wrong here?

I was able to solve the problem using:

CREATE TABLE IF NOT EXISTS dbname.table_name (
    time_stamp STRING COMMENT 'time_stamp', 
    attribute STRING COMMENT 'attribute', 
    value STRING COMMENT 'value', 
    vehicle STRING COMMENT 'vehicle', 
    filename STRING COMMENT 'filename') 
    ROW FORMAT DELIMITED 
    FIELDS TERMINATED BY ',' 
    STORED AS TEXTFILE;

and

LOAD DATA INPATH 'hdfs://our_company/user/hive/warehouse/dbname.db/table_name' OVERWRITE INTO TABLE dbname.table_name;
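
A likely reason this works: when no ROW FORMAT is given, Hive splits text rows on its default field delimiter, Ctrl-A (\001), not on commas. Each CSV line then lands whole in the first column, where it cannot be parsed as a TIMESTAMP, and the remaining columns receive no data, so every value reads as NULL. Because the working table stores everything as STRING, the original types can be recovered at query time. The sketch below is only an illustration: the view name table_name_typed is hypothetical, and it assumes the timestamps are written as yyyy-MM-dd HH:mm:ss. Values that do not parse come back as NULL.

```sql
-- Hypothetical typed view over the all-STRING table created above.
-- Assumes time_stamp text looks like '2020-01-31 13:45:00';
-- anything that does not parse is returned as NULL.
CREATE VIEW IF NOT EXISTS dbname.table_name_typed AS
SELECT
    CAST(time_stamp AS TIMESTAMP) AS time_stamp,
    attribute,
    CAST(value AS DOUBLE)         AS value,
    vehicle,
    filename
FROM dbname.table_name;

-- Spot-check rows whose numeric column did not convert cleanly.
SELECT *
FROM dbname.table_name
WHERE value IS NOT NULL
  AND CAST(value AS DOUBLE) IS NULL
LIMIT 10;
```

If the spot-check returns rows, the csv files probably contain a header line or non-numeric entries in the value column.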

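The "Could not resolve path" AnalysisException looks like an Impala error rather than a Hive one. Impala caches table metadata, so a table created or loaded through Hive may not be visible to Impala until that cache is updated. A minimal sketch, meant to be run in the Impala editor rather than in Hive, and assuming the same table name as above:

```sql
-- Run in Impala, not Hive.
-- Make Impala aware of a table that was newly created through Hive.
INVALIDATE METADATA dbname.table_name;

-- Pick up data files added to a table Impala already knows about.
REFRESH dbname.table_name;

-- Quick check that rows are now visible.
SELECT COUNT(*) FROM dbname.table_name;
```
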