
spreadsheet to python dictionary conversion

I am working in Python and want to read an *.ods file and convert it to a Python dictionary.

The key should be the value in the first column, and the value should be the value in the second column.

How can I do it? I tried xlrd, but it does not read *.ods files.

Some available options:

  • pyexcel-ods : " A wrapper library to read, manipulate and write data in ods format. " Can be installed via: pip install pyexcel-ods . I personally recommend this package as I've used it and it is being actively maintained.

  • py-odftools : " ... a collection of tools for analyzing, converting and creating files in the ISO standard OpenDocument format. " This project hasn't been updated since late 2007. It looks abandoned.

  • ezodf : " A Python package to create/manipulate OpenDocumentFormat files. " Installable via pip install ezodf . See caveat in the comments below about a serious issue with this package.
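As a rough sketch of the first option applied to the question: pyexcel-ods exposes get_data(), which returns a mapping of sheet names to lists of rows. The helper names below (first_two_columns_to_dict, ods_to_dict) are made up for illustration, and the code assumes short or empty rows should simply be skipped.

```python
def first_two_columns_to_dict(rows):
    """Map the first cell of each row to the second cell.

    Rows with fewer than two cells (for example blank rows) are skipped.
    """
    return {row[0]: row[1] for row in rows if len(row) >= 2}


def ods_to_dict(path, sheet_name=None):
    """Read one sheet of an .ods file into a dict (hypothetical helper)."""
    # Imported here so the helper above can be used without the package.
    from pyexcel_ods import get_data

    book = get_data(path)  # mapping: sheet name -> list of rows
    if sheet_name is None:
        sheet_name = next(iter(book))  # assumption: default to the first sheet
    return first_two_columns_to_dict(book[sheet_name])
```

Usage would look like ods_to_dict("your_file.ods"). If the sheet has a header row, it ends up in the dictionary as an ordinary entry, so you may want to drop it first.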

Although you could ask your users to File>Save As (as you probably know), this might not be useful in your situation.

It's probably easier to use LibreOffice/OpenOffice itself. It can run completely headless on a server, without X11 installed or running, and it gives you a clean native conversion.

libreoffice --without-x --convert-to csv  filename.ods

Check libreoffice --help (or openoffice --help) for details. This could also be wrapped in os.system(), subprocess.*(), etc. (Note: use -convert-to on Windows.) Also note: you cannot already be running any instances of Libre/Open/Star office, including the quickstarter.

Update: prior versions of LibreOffice used --headless instead of --without-x.
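If you want to drive that conversion from Python, something along these lines might work. It is only a sketch: the function names are invented for illustration, the no-GUI flag is a parameter because it differs between versions as noted above (the "--headless" default is an assumption, so pass whichever flag your installation accepts), and it assumes the libreoffice binary is on PATH. The --outdir option tells LibreOffice where to write the CSV.

```python
import subprocess


def build_convert_command(ods_path, outdir=".", ui_flag="--headless",
                          binary="libreoffice"):
    """Build the argument list for a one-off ODS -> CSV conversion."""
    # ui_flag is an assumption; use the flag your LibreOffice version expects.
    return [binary, ui_flag, "--convert-to", "csv", "--outdir", outdir, ods_path]


def convert_ods_to_csv(ods_path, outdir=".", ui_flag="--headless"):
    """Run the conversion and raise CalledProcessError if it fails."""
    subprocess.run(build_convert_command(ods_path, outdir, ui_flag), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.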

Can you convert the .ods file to CSV first? Parsing CSV in Python is then easy with the csv module.
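For example, once you have a CSV file, a minimal version of the first-column-to-second-column mapping could look like this. The function name csv_to_dict is just for illustration, and the sketch assumes rows with fewer than two cells can be ignored. Note that the csv module returns every cell as a string.

```python
import csv
import io


def csv_to_dict(csv_file):
    """Map column 1 to column 2 for every row of an open CSV file."""
    result = {}
    for row in csv.reader(csv_file):
        if len(row) >= 2:  # skip blank or single-cell rows
            result[row[0]] = row[1]
    return result


# In-memory sample standing in for a real file opened with open(path, newline="").
sample = io.StringIO("apple,3\nbanana,5\n")
print(csv_to_dict(sample))  # {'apple': '3', 'banana': '5'}
```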

Check out py-odftools.

The approach below works well for me for loading *.ods files into a pandas DataFrame. You can load a sheet either by index or by name.

I took the solution from this project: https://pypi.org/project/pandas-ods-reader/

You might first need to install its dependencies: ezodf, lxml and pandas.

pip install pandas_ods_reader

from pandas_ods_reader import read_ods

Then:

filepath = "path/to/your/file.ods"

Load a sheet by index (indices are 1-based):

sheet_idx = 1
df = read_ods(filepath, sheet_idx)

Load a sheet by name:

sheet_name = "sales_year_1"

df = read_ods(filepath, sheet_name)

Done.
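To get from the DataFrame to the dictionary the question asks for, one possible sketch is to zip the first two columns by position. The name frame_to_dict is hypothetical, and the demo frame below is a stand-in for whatever read_ods returns. Whether the first spreadsheet row ends up as the header or as data depends on how the frame was loaded.

```python
import pandas as pd


def frame_to_dict(df):
    """Map the first column of a DataFrame to its second column."""
    # iloc selects by position, so the column names do not matter.
    return dict(zip(df.iloc[:, 0], df.iloc[:, 1]))


# Stand-in for: df = read_ods(filepath, sheet_idx)
demo = pd.DataFrame({"name": ["apple", "banana"], "qty": [3, 5]})
print(frame_to_dict(demo))
```

If a key appears more than once in the first column, the last occurrence wins.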

There's a great article in Linux Journal on how to read ODS files in Python. An ODS file is just a zip file with XML files inside. You can then parse the XML to read all the cells.

http://www.linuxjournal.com/article/9347?page=0,2
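A rough standard-library sketch of that idea (not the article's code): open the archive with zipfile, parse content.xml, and walk the rows of the first table. It is deliberately simplified: it assumes the cell data lives in content.xml under the standard OpenDocument table and text namespaces, and it ignores repeated-cell attributes such as table:number-columns-repeated, so sparse sheets may need more care. The function names are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces used in content.xml.
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
ROW_TAG = "{%s}table-row" % NS["table"]


def cell_text(cell):
    """Join the text of all paragraphs in a cell."""
    paragraphs = cell.findall("text:p", NS)
    return "\n".join("".join(p.itertext()) for p in paragraphs)


def ods_xml_to_dict(ods_file):
    """Map column 1 to column 2 of the first table, all values as strings."""
    with zipfile.ZipFile(ods_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    table = root.find(".//table:table", NS)
    result = {}
    if table is None:
        return result
    for row in table.iter(ROW_TAG):
        cells = row.findall("table:table-cell", NS)
        if len(cells) >= 2:
            key = cell_text(cells[0])
            if key:  # skip rows whose first cell is empty
                result[key] = cell_text(cells[1])
    return result
```

ods_file can be a path or a binary file object. Values come back as the displayed text of each cell, not as typed numbers or dates.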

Using Odio you can do:

import odio

with open('test.ods', 'rb') as f:
    sheet = odio.parse_spreadsheet(f)

table = sheet.tables[0]
print(table.name)

for row in table.rows:
    print(row)


 