
How can I save an array that took a long time to create, so I can reuse it without running the code again?

These lines of code extract all tables from pages 667-795 of a PDF and save them into an array of tables.

tablesSys = cam.read_pdf("840Dsl_sysvar_lists_man_0122_de-DE_wichtig.pdf",
                         pages = "667-795", 
                         process_threads = 100000, 
                         line_scale = 100, 
                         strip_text ='.\n'
                        ) 

tablesSys = np.array(tablesSys)

The array looks like this (the screenshot is not reproduced here).

Later I have to use this array multiple times.

I work in JupyterLab, and whenever my kernel goes offline, I come back to work after a few hours, or I restart the kernel, I have to rerun this code to get tablesSys back, which takes more than 11 minutes.

Since the PDF doesn't change at all, I think there should be a way to run the code only once and save the array somehow, so that in the future I can use the array without rerunning the code.

Hope to find a solution :)))

Try using the pickle format to save the array to a file on the file system: https://docs.python.org/3/library/pickle.html

Here is a high-level example. I did not run this code, but it should give you an idea.

import pickle

import numpy as np

# compute the expensive data (placeholder for the slow step)
heavy_numpy_array = np.zeros((1000, 2))

# decide where to store the data in the file system
my_filename = 'path/to/my_file.xyz'

# save to file; 'wb' opens the file for writing in binary mode
with open(my_filename, 'wb') as my_file:
    pickle.dump(heavy_numpy_array, my_file)

# load the data from file; this must be 'rb' (read binary), not 'wb',
# because opening with 'wb' would truncate the file and make it unreadable
with open(my_filename, 'rb') as my_file_v2:
    my_long_numpy_array = pickle.load(my_file_v2)
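To avoid rerunning the slow step after every kernel restart, the save and load steps can be wrapped in a small "load from disk if it exists, otherwise compute and save" helper. This is only a sketch: `load_or_compute` and `slow_extraction` are hypothetical names, and `slow_extraction` stands in for the real `cam.read_pdf(...)` call, which is assumed to return something pickle can serialize.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_compute(cache_path, compute):
    """Return the pickled result at cache_path, or compute and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: skip the slow computation entirely.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    with cache_path.open("wb") as f:
        pickle.dump(result, f)
    return result


calls = []


def slow_extraction():
    # Hypothetical stand-in for the slow cam.read_pdf(...) call.
    calls.append(1)
    return [[1, 2], [3, 4]]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "tablesSys.pkl"
    first = load_or_compute(cache_file, slow_extraction)   # runs the slow step
    second = load_or_compute(cache_file, slow_extraction)  # reads the file instead
```

In a notebook you would point `cache_file` at a permanent location rather than a temporary directory, and delete the file whenever the PDF or the extraction settings change. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.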

I was playing around with this:

import numpy as np


class Cam:
    def read_pdf(self, *args, **kwargs):
        return np.random.rand(3, 2)


cam = Cam()

tablesSys = cam.read_pdf(
    "840Dsl_sysvar_lists_man_0122_de-DE_wichtig.pdf",
    pages="667-795",
    process_threads=100000,
    line_scale=100,
    strip_text=".\n",
)


with open("data.npy", "wb") as f:
    np.save(f, tablesSys)

with open("data.npy", "rb") as f:
    tablesSys = np.load(f)
print(tablesSys)
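One caveat with the `np.save` / `np.load` approach: the stand-in above returns a numeric array, but if the real `tablesSys` turns out to be an array of Python objects (an assumption, since `np.array(...)` over a list of table objects would typically give `dtype=object`), `np.load` refuses to read it unless `allow_pickle=True` is passed. A minimal sketch, using dictionaries as hypothetical stand-ins for the table objects:

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for an array of table objects (dtype=object).
tables = np.empty(2, dtype=object)
tables[0] = {"page": 667, "rows": 3}
tables[1] = {"page": 668, "rows": 5}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tables.npy")
    np.save(path, tables)  # object arrays are stored via pickle

    try:
        np.load(path)  # the default allow_pickle=False rejects object arrays
        default_load_failed = False
    except ValueError:
        default_load_failed = True

    # Explicitly opting in to pickle lets the object array load.
    restored = np.load(path, allow_pickle=True)
```

As with plain pickle, `allow_pickle=True` should only be used on files you wrote yourself.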


