
Execution order of Jupyter notebook cells

I am starting a new data analysis project in Jupyter notebook, and I am confused about the order of notebook cells.

First I import pandas and read the CSV file into data, so my first cell looks like this:

In [1]:

import pandas as pd
data = pd.read_csv('thanksgiving.csv', encoding='Latin-1')
print(data.head(5))

Then I want to print the column names of the DataFrame:

In [2]:
data.columns

Then I realize that in the first cell I should use data.head(5) instead of print(data.head(5)), because print does not render the DataFrame as a nicely formatted table.

So I go back to the first cell, modify it, and execute it again. Its label then changes from In [1] to In [3], and the two cells now look like this:

In [3]: ......
In [2]: ......

In other words, the order of the cells is messed up. I am afraid this will confuse readers of my project. Is there a well-accepted convention for this? Or do I just have to take extra care not to re-run cells near the top?

This is simply how Jupyter notebooks work: the number in In [n] records the order in which cells were executed, not their position in the notebook.

If you modify a cell in the notebook, you should also re-run the cells that follow it, since they may depend on its results. Doing so puts the numbers back in ascending order.

In your example, the cells look like this, so the cell labeled 2 should be executed again because a preceding cell has changed:

In [3]: ......
In [2]: ......

After you run cell 2, the notebook will look like this:

In [3]: ......
In [4]: ......
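
The numbering above can be reproduced with a toy model. This is only a sketch, not Jupyter's own code: ToyKernel, run, and show are hypothetical names, and the one assumption is that the kernel keeps a single counter that increases by one on every execution, wherever the cell sits in the notebook.

```python
class ToyKernel:
    """Toy model of a kernel's execution counter (hypothetical, not Jupyter code)."""

    def __init__(self):
        self.counter = 0   # number of executions so far in this session
        self.labels = {}   # cell position in the notebook -> latest In [n] number

    def run(self, position):
        """Pretend to execute the cell at `position` and return its new number."""
        self.counter += 1
        self.labels[position] = self.counter
        return self.counter

    def show(self):
        """Return the labels in top-to-bottom notebook order."""
        return [f"In [{self.labels[p]}]" for p in sorted(self.labels)]


kernel = ToyKernel()
kernel.run(1)                # first cell
kernel.run(2)                # second cell
kernel.run(1)                # edit and re-run the first cell
after_edit = kernel.show()   # first cell now carries a higher number than the second
kernel.run(2)                # re-run the second cell
after_rerun = kernel.show()  # numbers ascend again, but no longer start at 1
print(after_edit)
print(after_rerun)
```

The labels ascend again after the second re-run, but they start at 3 rather than 1, which is a hint that the notebook was not executed in one clean pass.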

Always re-run your notebook from top to bottom before sharing it. Make this a rule to live by. Even if you re-run a few cells in order, earlier executions may have left changes that you are not aware of.

Suppose we have:

In [1]: ......
In [47]: ......
In [46]: ......
In [4]: ......

It doesn't matter if I re-run the cells labeled 46 and 47 so that they appear in order. There are still 41 executions (5 through 45) unaccounted for between execution 4 and execution 46! Others cannot tell what happened in between, because the code of any cell could have been changed and re-run. You will therefore save yourself some headaches if you re-run everything before sharing, so that the notebook looks like this:

In [1]: ......
In [2]: ......
In [3]: ......
In [4]: ......
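
Before sharing, a quick check can tell whether a notebook looks like it came from one clean top-to-bottom run. The sketch below assumes the usual .ipynb JSON layout, in which each code cell stores its number under "execution_count". The helper names execution_counts, is_clean_run, and load_notebook are made up for this example. The check only inspects the numbers; it does not re-run anything.

```python
import json


def execution_counts(notebook):
    """Return the execution_count of every code cell, top to bottom."""
    return [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]


def is_clean_run(notebook):
    """True if the code cells are numbered 1, 2, 3, ... with no gaps."""
    counts = execution_counts(notebook)
    return counts == list(range(1, len(counts) + 1))


def load_notebook(path):
    """Read a .ipynb file as a plain dict (assumes it is valid JSON)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# In-memory stand-ins for the two notebooks shown above.
messy = {"cells": [{"cell_type": "code", "execution_count": n} for n in (1, 47, 46, 4)]}
clean = {"cells": [{"cell_type": "code", "execution_count": n} for n in (1, 2, 3, 4)]}

messy_ok = is_clean_run(messy)
clean_ok = is_clean_run(clean)
print(messy_ok, clean_ok)
```

For a real file, something like is_clean_run(load_notebook("analysis.ipynb")) would do, where the file name is a placeholder. A code cell without a count (for example one that was never run) also fails the check.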


 