Why is Hive trying to write to /user in HDFS?

Question

I'm working with a simple HiveQL query that looks like this:

SELECT event_type FROM {{table}} where dt=20140103 limit 10;

The {{table}} part is interpolated with Jinja2 by the runner code I'm using. I'm running the query with the -e flag on the hive command line, using subprocess.Popen from Python.

For some reason, this setup attempts to write into the regular /user directory in HDFS. Running the command with sudo has no effect. The error produced is as follows:

Job Submission failed with exception:
org.apache.hadoop.security.AccessControlException(Permission denied:user=username, access=WRITE, inode="/user":hdfs:hadoop:drwxrwxr-x\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)

Why would Hive attempt to write to /user? Additionally, why would a SELECT statement like this need an output location at all?

Answer 1

Hive is a SQL frontend to MapReduce, so it needs to compile and stage Java code for execution. It's not trying to put output there, but rather the program it will execute. Depending on your version of Hadoop, the staging location is controlled by the following properties:

mapreduce.jobtracker.staging.root.dir

And on YARN / Hadoop 2:

yarn.app.mapreduce.am.staging-dir

These are set in mapred-site.xml.
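As a rough sketch of how a Python runner could point the staging area somewhere else for a single invocation, the function below builds a hive command line that passes the YARN property through --hiveconf. The function name `hive_command` and the path `/tmp/staging-example` are hypothetical, and whether a per-invocation override is honored depends on the cluster (a property marked final in the site configuration cannot be overridden this way), so treat it as an assumption to verify rather than a guaranteed fix.

```python
import shlex


def hive_command(query, staging_dir=None,
                 prop="yarn.app.mapreduce.am.staging-dir"):
    """Build the argv list for `hive -e <query>`.

    If staging_dir is given, it is passed as a --hiveconf override of
    `prop`. On older (non-YARN) clusters the property from the answer
    would be mapreduce.jobtracker.staging.root.dir instead.
    """
    argv = ["hive"]
    if staging_dir is not None:
        argv += ["--hiveconf", "{}={}".format(prop, staging_dir)]
    argv += ["-e", query]
    return argv


if __name__ == "__main__":
    # Hypothetical table name and staging path, for illustration only.
    query = "SELECT event_type FROM events WHERE dt=20140103 LIMIT 10;"
    argv = hive_command(query, staging_dir="/tmp/staging-example")
    # Print the command instead of running it; a runner would pass
    # argv to subprocess.Popen(argv).
    print(" ".join(shlex.quote(part) for part in argv))
```

Passing the command as a list keeps the query as a single argument, so no shell quoting is needed when it is handed to subprocess.Popen.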

Your runner needs to be authenticated to the cluster and have a writable directory it can use.
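One common way to give the runner a writable directory, assuming the staging root is /user as the error message suggests, is to create a home directory for the submitting user in HDFS and hand ownership to that user. The sketch below only builds the commands; the helper name `home_dir_commands` is hypothetical, and it assumes the HDFS superuser is the local account `hdfs` (the owner shown in the error) and that the `hdfs dfs` client is on the PATH.

```python
import shlex


def home_dir_commands(username, superuser="hdfs"):
    """Return argv lists that create /user/<username> in HDFS and
    make <username> its owner, run as the HDFS superuser via sudo."""
    home = "/user/" + username
    as_superuser = ["sudo", "-u", superuser]
    return [
        as_superuser + ["hdfs", "dfs", "-mkdir", "-p", home],
        as_superuser + ["hdfs", "dfs", "-chown", username, home],
    ]


if __name__ == "__main__":
    # "username" is the placeholder user from the error message.
    for argv in home_dir_commands("username"):
        # Printed for review; an administrator would run these, for
        # example with subprocess.check_call(argv).
        print(" ".join(shlex.quote(part) for part in argv))
```

This would also explain why sudo on the hive command did not help: the permission check applies to the user identity HDFS sees, and /user is writable only by its owner (hdfs) and group (hadoop).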

Answer 1 (2, accepted), 2014-02-14 16:47:53

为什么hive试图在hdfs中写入/ user？

问题描述

1 个解决方案

解决方案1 2 已采纳 2014-02-14 16:47:53

解决方案1
2 已采纳 2014-02-14 16:47:53