
Excel file in Python with XLRD

I'm looking to load an Excel file (xlsx), about 35 MB with close to 100k rows of data, into a SQLite database for some research. The file has about 40 columns, and I might want to load only selected columns into the SQLite DB.

I'm approaching this as a straightforward "read with xlrd, load into SQLite" problem. Is there a better way of doing this, such as using a different module?

And given the volume of data, would the SQLite manager plugin for Firefox be the right utility to view some of the data?

I would definitely use pandas for this kind of task. It provides readers for many input formats (including Excel), is built on NumPy, and offers a range of statistical methods you can apply to your data. You can easily select the columns you want and then store them directly in a SQL database.

Its main data structure is called a DataFrame.

An example of the code you could use to load and store the data:

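import sqlite3
import pandas as pd

conn = sqlite3.connect("data.db")  # placeholder database path
dataframe = pd.read_excel("YOUR_FILE.xlsx")  # placeholder workbook path
dataframe.to_sql("data", conn, if_exists="replace", index=False)  # "data" is a placeholder table name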
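
Since the question mentions loading only some of the 40 columns, here is a minimal sketch of how that might look with pandas. The function names, the table name, and the column names are hypothetical, and it assumes an Excel engine such as openpyxl is installed so that `read_excel` can open .xlsx files.

```python
import sqlite3

import pandas as pd


def store_columns(frame, conn, table, columns):
    """Write only the named columns of `frame` to `table`; return the row count."""
    subset = frame[list(columns)]
    # if_exists="replace" drops and recreates the table on every run;
    # chunksize writes the rows in batches instead of all at once.
    subset.to_sql(table, conn, if_exists="replace", index=False, chunksize=10_000)
    return len(subset)


def excel_to_sqlite(xlsx_path, db_path, table, columns):
    """Read selected columns from a workbook and store them in a SQLite file."""
    # usecols with a list of header names makes read_excel keep only those columns.
    frame = pd.read_excel(xlsx_path, usecols=list(columns))
    conn = sqlite3.connect(db_path)
    try:
        return store_columns(frame, conn, table, columns)
    finally:
        conn.close()
```

A call might look like `excel_to_sqlite("YOUR_FILE.xlsx", "research.db", "data", ["id", "name"])`, where the paths and column names are placeholders for your own.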

You could use a Python command-line tool like this one:

https://pypi.org/project/xlsx2sqlite/

Then, once your data is imported, you can view it with a database tool such as SQLite Manager or DBeaver.
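
If you only want a quick look at the imported rows without a GUI, Python's standard-library sqlite3 module may be enough. This is a small sketch; the `preview` helper and the table name passed to it are hypothetical.

```python
import sqlite3


def preview(conn, table, limit=5):
    """Return (column_names, rows) for the first `limit` rows of `table`."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,))
    names = [col[0] for col in cursor.description]
    return names, cursor.fetchall()
```

For example, `preview(sqlite3.connect("research.db"), "data")` would return the column names and up to five rows, assuming a database file and table with those names exist.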

