
Hive trying to read current working directory when called in Python script

I am attempting to execute a Hive script from a Python wrapper. Part of the code looks like this:

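import subprocess

print(HiveArgs)
# universal_newlines=True makes communicate() return text rather than bytes
Hive = subprocess.Popen(HiveArgs, stderr=subprocess.PIPE, stdout=subprocess.PIPE, universal_newlines=True)
HiveOutput = Hive.communicate()  # tuple of (stdout, stderr)

print("Out:" + HiveOutput[0])
print("=================================")
print("Err:" + HiveOutput[1])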

The output of this is:

['hive', '-i ', '/edw/edwdev/tmp/spark.txn.init.tmp', '-f ', '/edw/edwdev/tmp/test.hql.tmp']
Out:
=================================
Err:
Logging initialized using configuration in file:/etc/hive/2.5.0.2-3/0/hive-log4j.properties
Exception in thread "main" java.io.FileNotFoundException: File file:/data/edw/edwdev/  does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:624)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:850)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:614)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:422)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:146)
        at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:348)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:782)
        at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:427)
        at org.apache.hadoop.hive.cli.CliDriver.processInitFiles(CliDriver.java:439)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:708)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)

where /data/edw/edwdev/ (the file that Hive thinks is missing) is my working directory on a Linux server.

Changing the working directory to the script's location doesn't help. Using relative or absolute paths also makes no difference. If I copy the values from the printed HiveArgs and execute the command from a terminal (hive -i /edw/edwdev/tmp/spark.txn.init.tmp -f /edw/edwdev/tmp/test.hql.tmp), it works correctly.

What am I missing here?

It turned out that the issue was with the Hive arguments. The print(HiveArgs) line gave this output:

['hive', '-i ', '/edw/edwdev/tmp/spark.txn.init.tmp', '-f ', '/edw/edwdev/tmp/test.hql.tmp']

The arguments passed are '-f ' and '-i ' (with trailing spaces) instead of '-f' and '-i'.
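This would also explain why the command copied into a terminal worked: the shell splits the command line on whitespace, so hive receives the flags without any trailing space, whereas subprocess.Popen passes each list element through unchanged. The sketch below uses shlex.split from the standard library to approximate the shell's word splitting and compares the result with the list from the question. The variable names are illustrative only.

```python
import shlex

# The command as it was typed in the terminal.
command = "hive -i /edw/edwdev/tmp/spark.txn.init.tmp -f /edw/edwdev/tmp/test.hql.tmp"

# shlex.split approximates POSIX shell word splitting: whitespace separates
# the tokens and does not end up inside them.
shell_args = shlex.split(command)

# The list that was actually handed to subprocess.Popen.
popen_args = ['hive', '-i ', '/edw/edwdev/tmp/spark.txn.init.tmp',
              '-f ', '/edw/edwdev/tmp/test.hql.tmp']

# Collect every position where the two argument lists differ.
mismatches = [(a, b) for a, b in zip(shell_args, popen_args) if a != b]
print(mismatches)  # [('-i', '-i '), ('-f', '-f ')]
```

Printing arguments with repr(), as print(HiveArgs) does for a list, is what made the stray spaces visible in the first place.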

I am not sure what within Hive causes it to read the current working directory as an input file. Most likely Hive does not trim its arguments, which leads to this behavior. Removing the spaces fixed the issue.
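A minimal sketch of one way to guard against this, assuming the stray whitespace comes from how the argument list is assembled and that no legitimate argument begins or ends with a space. The helper names clean_args and run_hive are hypothetical and not part of the original wrapper.

```python
import subprocess


def clean_args(args):
    """Strip surrounding whitespace from each argument and drop empty ones."""
    stripped = (arg.strip() for arg in args)
    return [arg for arg in stripped if arg]


def run_hive(args):
    """Hypothetical wrapper: run the cleaned command and return its results."""
    proc = subprocess.Popen(
        clean_args(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # return text instead of bytes
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


# The argument list from the question, with the trailing spaces.
hive_args = ['hive', '-i ', '/edw/edwdev/tmp/spark.txn.init.tmp',
             '-f ', '/edw/edwdev/tmp/test.hql.tmp']
cleaned = clean_args(hive_args)
print(cleaned)
```

Calling run_hive(hive_args) would then launch hive with '-i' and '-f' as intended. Checking the returned exit code as well as stderr helps to surface failures like the one above early.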
