How to redirect entire output of spark-submit to a file

So, I am trying to redirect the output of an Apache spark-submit command to a text file, but some of the output fails to appear in the file. Here is the command I am using:

spark-submit something.py > results.txt

I can see the output in the terminal, but I do not see it in the file. What am I forgetting or doing wrong here?

Edit:

If I use

spark-submit something.py | less

I can see all of the output being piped into less.

spark-submit prints most of its output to STDERR.
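
You can confirm this by splitting the two streams into separate files (a quick sketch; the file names are just illustrative):

spark-submit something.py 1> stdout.txt 2> stderr.txt

Most of the logging should land in stderr.txt, with only your program's own prints in stdout.txt.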

To redirect the entire output to one file, you can use:

spark-submit something.py > results.txt 2>&1

Or

spark-submit something.py &> results.txt
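
Note that &> is a bash shorthand for redirecting both streams; in a plain POSIX sh script, use the > results.txt 2>&1 form instead. If you want to watch the output in the terminal while also saving it, piping the combined streams through tee is a common pattern (a sketch; the file name is illustrative):

spark-submit something.py 2>&1 | tee results.txt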

If you are running spark-submit on a cluster, the logs are stored with the application ID. You can see the logs once the application finishes.

yarn logs -applicationId <your applicationId> > myfile.txt

This should fetch the logs of your job.

The applicationId of your job is printed when you submit the Spark job. You will be able to see it in the console where you submitted the job, or in the Hadoop UI.
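
If you no longer have the console output at hand, you can also list the applications known to YARN and pick out the ID from there (assuming a YARN-managed cluster; the state filter is optional):

yarn application -list -appStates FINISHED

Then pass the reported application ID to the yarn logs command above.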
