
Importing a Dataframe from one Jupyter Notebook into another Jupyter Notebook

I wrote a Python script to get data from my Gmail account, which I imported as a pandas dataframe into a Jupyter notebook. The notebook is called "Automation via Gmail API" and the dataframe is simply called "df". Now I want to use this df to update a Google Sheet via the Google Sheets API. To this end I created another notebook, "Automation via Sheets API". But how can I access df in the "Automation via Sheets API" notebook? Apparently, Jupyter provides some functionality to load a notebook into another notebook. I simply copied and pasted the code of the "Notebook Loader" into my Sheets notebook and only changed "path" and "fullname", but it doesn't work and I don't have a clue why:

#Load df from the "Automation via Gmail API" notebook.

fullname = "Automation via Gmail API.ipynb"

class NotebookLoader(object):
    """Module Loader for Jupyter Notebooks"""
    def __init__(self, path="C:\\Users\\Moritz Wolff\\Desktop\\gmail automatisierung\\Gmail API"):
        self.shell = InteractiveShell.instance()
        self.path = path

    def load_module(self, fullname="Automation via Gmail API.ipynb"):
        """import a notebook as a module"""
        path = find_notebook(fullname, self.path)

[...]

There is no error message. Is my strategy flawed from the start, or am I simply missing a small detail? Any help is appreciated.

A direct option is to save the dataframe as a text table in the original notebook and read it into the other. Instead of plain text, you can also save the dataframe itself as a pickle (serialized Python), which is a little more efficient and convenient.

Options in the source notebook:

df.to_csv('example.tsv', sep='\t') # add `, index = False` to leave off index
# -OR-
df.to_pickle("file_name.pkl")

Options in the reading notebook:

import pandas as pd
df = pd.read_csv('example.tsv', sep='\t')
#-OR-
df = pd.read_pickle("file_name.pkl")

I used a tab-delimited text format, but you are welcome to use comma-separated instead.
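To make the difference between the two options concrete, here is a minimal round-trip sketch. The sample dataframe and its column names (`sender`, `received`) are assumptions standing in for the Gmail data, and the files go to a temporary directory instead of the notebook folder. It shows that a datetime column comes back from the text file as plain text unless you ask `read_csv` to parse it, while the pickle restores the dataframe as it was.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the Gmail dataframe; column names are assumptions.
df = pd.DataFrame(
    {
        "sender": ["a@example.com", "b@example.com"],
        "received": pd.to_datetime(["2020-01-01 08:30", "2020-01-02 17:45"]),
    }
)

with tempfile.TemporaryDirectory() as tmp:
    tsv_path = Path(tmp) / "example.tsv"
    pkl_path = Path(tmp) / "file_name.pkl"

    # "Source notebook" side: write both formats.
    df.to_csv(tsv_path, sep="\t", index=False)
    df.to_pickle(pkl_path)

    # "Reading notebook" side: read both formats back.
    from_tsv = pd.read_csv(tsv_path, sep="\t")
    from_tsv_parsed = pd.read_csv(tsv_path, sep="\t", parse_dates=["received"])
    from_pickle = pd.read_pickle(pkl_path)

# The text file drops the datetime dtype unless parse_dates is given.
print(from_tsv.dtypes)
print(from_tsv_parsed.dtypes)
# The pickle round-trips the dataframe unchanged.
print(from_pickle.equals(df))
```

If the dataframe only holds strings and numbers, the text file is usually enough and has the advantage that you can open it in any editor. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.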

I would avoid loading your notebook from another notebook unless you are sure that is how you want to approach your problem.我会避免从另一个笔记本加载您的笔记本,除非您确定这是您想要解决问题的方式。

You can always export your dataframe to a CSV using pandas.DataFrame.to_csv(), then load it in your other notebook with pandas.read_csv():

import pandas as pd

df = pd.DataFrame(['test', 'data'])  # a plain list has no to_csv(), so wrap it in a DataFrame
df.to_csv('data1.csv')

Then in your other notebook:

df = pd.read_csv('data1.csv', index_col = 0)
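The `index_col = 0` argument matters because `to_csv()` writes the index as an unnamed first column by default. A small sketch of the effect, using a made-up one-column dataframe (the column name `value` is an assumption) and a temporary directory:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data; the column name "value" is an assumption.
df = pd.DataFrame({"value": ["test", "data"]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "data1.csv"
    df.to_csv(csv_path)  # the index is written as an unnamed first column

    # Treat the first column as the index again.
    with_index_col = pd.read_csv(csv_path, index_col=0)
    # Without index_col, the old index shows up as an extra data column.
    without_index_col = pd.read_csv(csv_path)

print(list(with_index_col.columns))     # ['value']
print(list(without_index_col.columns))  # ['Unnamed: 0', 'value']
```

If you don't need the index at all, writing with `df.to_csv('data1.csv', index=False)` and reading without `index_col` avoids the extra column as well.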

Alternatively, you can try using the %store magic function:

df = ['test','data']

%store df

Then retrieve it in another notebook:

%store -r df

One constraint of this method is that you have to run %store again each time the variable is updated.

Documentation: https://ipython.readthedocs.io/en/stable/config/extensions/storemagic.html
