
Data parsing using Hive query

I am building a pipeline in Azure Data Factory. The input dataset is a CSV file with a column delimiter, and the output dataset is also a CSV file with a column delimiter. The pipeline uses an HDInsight activity that runs a Hive query stored in a file with the .hql extension. The Hive query is as follows:

set hive.exec.dynamic.partition.mode=nonstrict;

DROP TABLE IF EXISTS Table1; 
CREATE EXTERNAL TABLE Table1 (
  Number string, 
  Name string, 
  Address string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/your/folder/location';

SELECT * FROM Table1;

Below is the file format:

Number,Name,Address 
1,xyz,No 152,Chennai
2,abc,7th street,Chennai
3,wer,Chennai,Tamil Nadu

How do I parse the column header together with the data in the output dataset?

As I understand it, your question concerns the CSV file. You are placing the CSV file at the table location, and the file includes a header row. If my understanding is correct, please try the property below in your table DDL. I hope this helps.

tblproperties ("skip.header.line.count"="1");
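For context, here is a minimal sketch of how that property might fit into the table definition from the question. It assumes Hive 0.13 or later (where skip.header.line.count is available) and reuses the question's placeholder location as-is; the TBLPROPERTIES clause goes after LOCATION.

```sql
set hive.exec.dynamic.partition.mode=nonstrict;

DROP TABLE IF EXISTS Table1;

-- Same columns as in the question; only the TBLPROPERTIES clause is new.
CREATE EXTERNAL TABLE Table1 (
  Number string,
  Name string,
  Address string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/your/folder/location'  -- placeholder path from the question
-- Tell Hive to skip the first line of each file when reading,
-- so the "Number,Name,Address" header is not returned as a data row.
TBLPROPERTIES ("skip.header.line.count"="1");

SELECT * FROM Table1;
```

With this in place, the SELECT should return only the data rows; the header line stays in the underlying file but is skipped when the table is read.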

Thanks, Manu

