Reading compressed (.xz) files in Apache Pig

Question

I am trying to read an .xz file, compressed with the hadoop-xz codec, from a Pig script.

The sample code I tried is:

REGISTER hadoop-xz-1.4.jar;
SET output.compression.enabled true;
SET output.compression.codec io.sensesecure.hadoop.xz.XZCodec;

msg = LOAD 'pigtest/newXZ.xz' USING PigStorage();
STORE msg INTO 'pigtest/output' USING PigStorage();
DUMP msg;

The result is still in a compressed format. Am I doing something wrong, or do I have to use XZInputStream inside Pig?

The running environment is HortonWorks Sandbox 2.2 (Hue).

Answer 1

It depends on what you want to do.

It seems like you want to read an XZ file, so I would assume you need to set up the input codec, not the output one. The `output.compression.*` settings in your script only affect what STORE writes.
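As a rough, untested sketch of what "setting up the input codec" might look like: Hadoop's text input formats normally choose a decompressor by file extension from the codecs listed in `io.compression.codecs`. Assuming that registering the hadoop-xz jar makes `io.sensesecure.hadoop.xz.XZCodec` visible to the job, and that Pig passes this property through to the job configuration, something along these lines may work. The list of default codecs below is an assumption and should match what your cluster already configures.

```pig
-- Assumption: hadoop-xz-1.4.jar is reachable from where the script runs.
REGISTER hadoop-xz-1.4.jar;

-- Assumption: listing XZCodec here lets the input format map the .xz
-- extension to the codec. The other entries are common Hadoop defaults,
-- repeated so that they are not dropped from the list.
SET io.compression.codecs 'org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,io.sensesecure.hadoop.xz.XZCodec';

-- No output.compression.* settings, so STORE should write plain text.
msg = LOAD 'pigtest/newXZ.xz' USING PigStorage();
STORE msg INTO 'pigtest/output' USING PigStorage();
```

If the stored output is readable text, the codec was picked up on the input side. If it is still binary, the codec probably needs to be configured cluster-wide (for example in core-site.xml) rather than from the script.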

I'm not a Pig user, but from what I gather it cannot easily handle custom compression (unlike Hive and Streaming, for example).

Answer 1 posted: 2016-07-05 14:01:20
