Unable to load .csv data from HDFS into a Hive table in Hadoop

I am trying to load CSV files into a Hive table. I need to do this through HDFS.

My end goal is to have the Hive table also connected to Impala tables, which I can then load into Power BI, but I am having trouble getting the Hive tables to populate.

I create a table in the Hive query editor using the following code:

CREATE TABLE IF NOT EXISTS dbname.table_name (
    time_stamp TIMESTAMP COMMENT 'time_stamp',
    attribute STRING COMMENT 'attribute',
    value DOUBLE COMMENT 'value',
    vehicle STRING COMMENT 'vehicle',
    filename STRING COMMENT 'filename')

Then I check the LOCATION using the following code:

SHOW CREATE TABLE dbname.table_name;

and find that it has gone to the default location: hdfs://our_company/user/hive/warehouse/dbname.db/table_name

So I go to the above location in HDFS and manually upload a few CSV files, which have the same five-column format as the table I created. This is where I expect the data to be loaded into the Hive table, but when I go back to dbname in Hive and open the table I made, all values are still null, and when I try to open it in the browser I get:

DB Error AnalysisException: Could not resolve path: 'dbname.table_name'

Then I try the following code:

LOAD DATA INPATH 'hdfs://our_company/user/hive/warehouse/dbname.db/table_name' INTO TABLE dbname.table_name;

It runs fine, but the table in Hive still does not populate.

I also tried all of the above using CREATE EXTERNAL TABLE instead, specifying the HDFS path in the LOCATION argument. I also tried creating an HDFS location first, uploading the CSV files, and then running CREATE EXTERNAL TABLE with the LOCATION argument pointed at the pre-made HDFS location.

I have already made sure I have authorization privileges.

My table will not populate with the CSV files, no matter which method I try.

What am I doing wrong here?

I was able to solve the problem using:

CREATE TABLE IF NOT EXISTS dbname.table_name (
    time_stamp STRING COMMENT 'time_stamp', 
    attribute STRING COMMENT 'attribute', 
    value STRING COMMENT 'value', 
    vehicle STRING COMMENT 'vehicle', 
    filename STRING COMMENT 'filename') 
    ROW FORMAT DELIMITED 
    FIELDS TERMINATED BY ',' 
    STORED AS TEXTFILE

and

LOAD DATA INPATH 'hdfs://our_company/user/hive/warehouse/dbname.db/table_name' OVERWRITE INTO TABLE dbname.table_name;
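
The main difference from the original attempt is the ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' clause. Without it, Hive text tables use Ctrl-A (\001) as the field delimiter, so a comma-separated line is not split into five columns, which would explain the NULL values. Declaring every column as STRING also avoids NULLs from values that fail to parse as TIMESTAMP or DOUBLE. If typed columns are still wanted downstream, one possible approach is a view that casts them back. This is only a sketch: the view name dbname.table_name_typed is hypothetical, and the timestamp cast assumes the CSV values are in a format Hive can parse (for example yyyy-MM-dd HH:mm:ss).

```sql
-- Sketch only: dbname.table_name_typed is a hypothetical view name.
-- CAST returns NULL for any value it cannot parse.
CREATE VIEW IF NOT EXISTS dbname.table_name_typed AS
SELECT
    CAST(time_stamp AS TIMESTAMP) AS time_stamp,
    attribute,
    CAST(value AS DOUBLE) AS value,
    vehicle,
    filename
FROM dbname.table_name;

-- Run this in Impala, not Hive: it makes Impala pick up a table that
-- was created through Hive. Assumption: the "Could not resolve path"
-- error came from Impala not yet knowing about the new table.
INVALIDATE METADATA dbname.table_name;
```

Whether the Impala step is needed depends on how the cluster syncs its metadata, so treat it as something to try rather than a confirmed fix.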
