How do I load a CSV file into a Hive table when record values contain commas?
Input file:
11/24/2013,bank of nyc,withdrawl,deposit,in progress
11/16/2014,bank of dc,opeanig,closing,resolved
I want them in the table like this:
Date Bank name issue status
11/24/2013 bank of nyc withdrawl,deposit in progress
11/16/2014 bank of dc opeanig,closing resolved
Well, the problem is that the comma is not escaped. How is Hive supposed to know whether a comma is part of a string or a separator?
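To make the ambiguity concrete, here is a small Python sketch (Python is used only for illustration; it is not part of the Hive setup) that splits the first sample line on every comma, which is roughly what a plain comma-delimited table definition would do:

```python
# First sample line from the question.
line = "11/24/2013,bank of nyc,withdrawl,deposit,in progress"

# A plain split treats every comma as a delimiter, including the one
# inside the issue text.
fields = line.split(",")

print(len(fields))  # number of fields produced by the naive split
print(fields)
```

The split yields five fields for a table that has only four columns, so the issue text is cut in two and the status ends up in the wrong position.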
It is possible if the extra commas can only appear in one known column, in this case the third. You can then write a regular expression that captures everything between the second comma and the last comma as that column, and use it with the RegexSerDe. This works for your example, given that only 'issue' may contain commas.
-- `date` is backtick-quoted because DATE is a reserved keyword in newer Hive versions.
CREATE TABLE csvsample(
  `date` STRING,
  bank_name STRING,
  issue STRING,
  status STRING
) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "^([^,]+),([^,]+),(.+),([^,]+)$"
);
hive> select * from csvsample;
OK
11/24/2013 bank of nyc withdrawl,deposit in progress
11/16/2014 bank of dc opeanig,closing resolved
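If you want to check the regular expression before creating the table, a quick sketch like the following may help. It uses Python's `re` module purely as a stand-in; Hive evaluates the pattern with Java's regex engine, and the assumption here is that the two behave the same for a pattern this simple. The helper name `parse_line` is made up for the example.

```python
import re

# Same pattern as "input.regex" in the table definition above.
PATTERN = re.compile(r"^([^,]+),([^,]+),(.+),([^,]+)$")


def parse_line(line):
    """Return (date, bank_name, issue, status), or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groups() if match else None


sample_lines = [
    "11/24/2013,bank of nyc,withdrawl,deposit,in progress",
    "11/16/2014,bank of dc,opeanig,closing,resolved",
]

# Parse every sample line and show the four captured columns.
rows = [parse_line(line) for line in sample_lines]
for row in rows:
    print(row)
```

The third group `(.+)` is greedy, so it takes everything up to the last comma, while the other groups use `[^,]+` and cannot contain a comma. That is why the approach only holds when 'issue' is the single column that may contain commas; a comma in the bank name or status would shift the fields without any error.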