
How do I make matplotlib work in an AWS EMR Jupyter notebook?

This is very close to this question, but I have added a few details specific to my question:

Matplotlib Plotting using AWS-EMR jupyter notebook

I would like to find a way to use matplotlib inside my Jupyter notebook. Here is the code snippet that fails; it's fairly simple:

notebook

import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.show()

I chose this snippet because this line alone fails, as it tries to use Tkinter (which is not installed on an AWS EMR cluster):

import matplotlib.pyplot as plt

When I run the full notebook snippet, there is no runtime error, but nothing happens either (no graph is shown). My understanding is that one way to make this work is to add either of the following snippets:

pyspark magic notation

%matplotlib inline

results

unknown magic command 'matplotlib'
UnknownMagic: unknown magic command 'matplotlib'

IPython explicit magic call

from IPython import get_ipython
get_ipython().run_line_magic('matplotlib', 'inline')

results

'NoneType' object has no attribute 'run_line_magic'
Traceback (most recent call last):
AttributeError: 'NoneType' object has no attribute 'run_line_magic'

Either snippet, added to my notebook, should invoke a Spark magic command that inlines matplotlib plots (at least that's my interpretation). I tried both of these after using a bootstrap action:

EMR bootstrap

sudo pip install matplotlib
sudo pip install ipython

Even with these added, I still get an error that there is no magic for matplotlib. So my question is:

Question

How do I make matplotlib work in an AWS EMR Jupyter notebook?

(Or how do I view graphs and plot images in an AWS EMR Jupyter notebook?)

As you mentioned, matplotlib is not installed on the EMR cluster, so an error like this will occur:

[Image: error message]

However, it is actually available in the managed Jupyter notebook instance (the docker container). Using the %%local magic will allow you to run the cell locally:

[Image: cell run with %%local]
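To see the difference between the two environments for yourself, a minimal sketch like the one below can be run once in a normal PySpark cell and once in a %%local cell. It only reports which modules the current interpreter can find, using the standard library. The helper name importable and the list of modules checked are illustrative choices, not part of any EMR API.

```python
import importlib.util


def importable(names):
    """Map each module name to True if the current interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Modules mentioned in the question; adjust the list as needed.
report = importable(["matplotlib", "IPython", "tkinter"])
for name, found in sorted(report.items()):
    print(f"{name}: {'available' if found else 'missing'}")
```

If matplotlib shows as missing in the PySpark cell but available under %%local, that would match the behaviour described above.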

The answer by @00schneider actually works.

import matplotlib.pyplot as plt

# plot data here
plt.show()

After plt.show(), re-run the magic cell that contains the line below, and you will see a plot in your AWS EMR Jupyter PySpark notebook:

%matplot plt

Import matplotlib as

import matplotlib.pyplot as plt

and use the magic command %matplot plt instead, as shown in the tutorial here: https://aws.amazon.com/de/blogs/big-data/install-python-libraries-on-a-running-cluster-with-emr-notebooks/

The following should work:

import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])

Run the entire script in one cell.
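Whether %matplotlib inline is available depends on where the cell actually runs. The AttributeError in the question occurs because get_ipython() returns None when the code is not executing inside an IPython shell, which appears to be the case for cells that the PySpark kernel sends to the cluster. A guarded sketch of the explicit call follows; the function name try_enable_inline is hypothetical, and the code simply reports failure instead of raising.

```python
def try_enable_inline():
    """Enable inline plots if an IPython shell is present; return True on success."""
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed in this interpreter.
        return False

    shell = get_ipython()
    if shell is None:
        # Not running inside an IPython shell, so no line magics exist here.
        return False

    shell.run_line_magic("matplotlib", "inline")
    return True


enabled = try_enable_inline()
print("inline plotting enabled" if enabled else "no IPython shell available")
```

If this prints the second message in your notebook, the %matplotlib magic is unlikely to work in that kernel.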

To plot something in AWS EMR notebooks, you simply need to use %matplot plt. You can see this documented about midway down this page from AWS.

For example, if I wanted to make a quick plot:

import matplotlib.pyplot as plt

plt.clf() #clears previous plot in EMR memory
plt.plot([1,2,3,4])
plt.show()

%matplot plt
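If you want to confirm that a figure was actually drawn, independent of any notebook magic, the sketch below renders the same plot to PNG bytes under the non-interactive Agg backend used in the question. The helper name render_png is hypothetical, and this assumes matplotlib is importable in the interpreter running the cell. It is an illustration only, not a replacement for %matplot plt.

```python
import io

import matplotlib

matplotlib.use("agg")  # non-interactive backend: draws to images, opens no window
import matplotlib.pyplot as plt


def render_png(values):
    """Plot the values on a fresh figure and return the image as PNG bytes."""
    fig, ax = plt.subplots()
    ax.plot(values)
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    plt.close(fig)  # release the figure so repeated calls do not accumulate
    return buf.getvalue()


png = render_png([1, 2, 3, 4])
print(f"rendered {len(png)} bytes")
```

Under Agg, plt.show() has no window to open, which is consistent with the question's snippet running without an error and without a visible graph.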

Try the code below. FYI, we have matplotlib 3.1.1 installed in Python 3.6 on emr-5.26.0, and I used the PySpark kernel. Make sure that "%matplotlib inline" is the first line in the cell:

%matplotlib inline

import matplotlib
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.show()
