
Analyzing Map-Reduce jobs produced by Pig/Hive compiler

Is there any way to view the code for the Map-Reduce jobs that are produced by Pig and Hive?

I understand that with Hive I can view the abstract syntax tree, but it seems that it is not possible to access the actual Java code for the MR jobs. Am I mistaken in that assumption?

Pig and Hive don't generate any Java code; they generate a plan. The plan can be seen using the explain command in the shell. One way to generate Java code from SQL is to use YSmart. Note that there are a lot of changes happening in Hive to make it much faster.
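As a rough illustration of the explain command mentioned above, the following sketch writes a small query to a file and asks Hive to print its plan rather than run it. The table name `page_views`, the column `user_id`, and the file name `explain_demo.hql` are hypothetical and not from the original post; the `hive` call is skipped when the CLI is not on the PATH.

```shell
# Hypothetical query: page_views and user_id are assumed names for illustration.
QUERY_FILE="explain_demo.hql"
cat > "$QUERY_FILE" <<'EOF'
EXPLAIN
SELECT user_id, COUNT(*) AS views
FROM page_views
GROUP BY user_id;
EOF

# EXPLAIN prints the stage plan instead of executing the query.
# Only call the Hive CLI if it is actually installed on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -f "$QUERY_FILE"
else
  echo "hive not found; run $QUERY_FILE on a machine with Hive installed"
fi
```

In Pig's grunt shell, `explain` followed by an alias plays the same role and prints the logical, physical, and Map-Reduce plans for that alias.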

You can download and build the source code yourself.

Then, using a Java IDE such as Eclipse, you can attach a remote debugger and inspect the code. You might not have all dependencies in place and might not be able to inspect every object, but you can see the plans in more detail than the explain command shows.

To allow remote debugging, add the debug parameter to your hadoop bash script:

-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=1044
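One possible way to pass this parameter without editing the script itself is sketched below. It assumes your `hadoop` launcher script appends the `HADOOP_OPTS` environment variable to the JVM command line; the original answer does not name a variable, so treat that as an assumption about your installation.

```shell
# JDWP settings from the answer above: listen on port 1044 and wait for a debugger.
DEBUG_FLAGS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=1044"

# Assumption: the hadoop launcher script reads HADOOP_OPTS when building the java command.
export HADOOP_OPTS="${HADOOP_OPTS:-} ${DEBUG_FLAGS}"

# Show what will be passed to the JVM.
echo "HADOOP_OPTS=${HADOOP_OPTS}"
```

Because of `suspend=y`, the JVM pauses at startup until a debugger attaches on port 1044, so start the Pig or Hive job first and then connect from Eclipse with a Remote Java Application debug configuration.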
