
Python and Excel - OpenPyXL

I am working with Excel using Python and have a couple of questions:

  1. Loading an Excel sheet into a 2D array.
    In VBA I'd simply do:
dim arrData as Variant 
arrData = shtData.Range("A1:E2500") 

I would get an array (1 to 2500, 1 to 5), which I can easily access, for example arrData(1,5) -> row 1, column 5.

In Python, what I managed to do is:

# declare list
excel_data = []

# loop to load Excel spreadsheet data into a 2D array:
# iterate over every row in the range and append it to the list
for row in shtData.iter_rows(min_row=5, max_row=50, values_only=True):
    excel_data.append(row)
  1. Is there a way to assign rows to a list starting from index 1 rather than 0?
    In VBA there is the Option Base 1 statement.
    https://docs.microsoft.com/en-us/office/vba/language/reference/user-interface-help/option-base-statement

  2. Is this the fastest way to operate on an Excel dataset?
    I am then planning to loop through, let's say, 2500 rows and 5 columns -> 12,500 cells.
    With VBA this was very efficient, to be honest (operating on an array in memory).

  3. As I understand the functions of OpenPyXL:

load_workbook       

# only creates a REFERENCE TO THE EXCEL WORKBOOK - it does not open it? Or is it "loaded" into memory while what is on the hard drive stays intact?

shtData = wkb.worksheets[0]                         

# again, only a reference?

shtReport = wkb.create_sheet(title="ReportTable")       

# it adds a sheet, but to the Excel workbook that is loaded into memory; only after saving is the Excel file on the hard drive actually overwritten?

You can use Pandas to create a DataFrame (a 2D table) from the Excel spreadsheet.

import pandas as pd

df = pd.read_excel("data.xls")
print(df)
print("____________")
print(f'Average sales are: {df["Gross"].values.mean()}')
print(f'Net income for April: {df.at[3, "Net"]}')
print("____________")
df_no_header = pd.read_excel("data.xls",skiprows=1, header=None)
print(df_no_header)
print("____________")
print(f'Net income for April: {df_no_header.at[3, 2]}')


A Pandas DataFrame has many methods that let you access rows and columns and do much more. Setting skiprows=1, header=None skips the header row.
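If you would rather stay with OpenPyXL, the following is a minimal sketch of one way to get VBA-style 1-based access. Python lists are always 0-based, so the sketch pads the outer list and every row with a None placeholder. The helper names load_one_based and load_from_file, the path argument, and the A1:E2500 bounds are assumptions made for illustration; only iter_rows, load_workbook and worksheets come from OpenPyXL itself.

```python
def load_one_based(sheet, min_row, max_row, min_col, max_col):
    """Read a rectangular range into a padded grid so grid[r][c] is 1-based.

    `sheet` is expected to be an openpyxl worksheet; any object exposing a
    compatible iter_rows() also works, which keeps the helper easy to test.
    """
    rows = sheet.iter_rows(
        min_row=min_row, max_row=max_row,
        min_col=min_col, max_col=max_col,
        values_only=True,
    )
    # Index 0 of the outer list and of every row is a None placeholder,
    # so grid[1][5] means "row 1, column 5" of the range, as in VBA.
    return [None] + [(None, *row) for row in rows]


def load_from_file(path):
    """Hypothetical wrapper: `path` and the A1:E2500 bounds are assumptions."""
    # Third-party import kept local so load_one_based can be tried on its own.
    from openpyxl import load_workbook

    wkb = load_workbook(path)      # parses the file into memory
    sht_data = wkb.worksheets[0]   # a worksheet inside that in-memory workbook
    return load_one_based(sht_data, 1, 2500, 1, 5)
```

With this, grid = load_from_file("data.xlsx") followed by grid[1][5] would mirror arrData(1,5). Once the grid is built, looping over it is ordinary in-memory Python with no further file access. On the third question: load_workbook reads the workbook into memory, and changes such as create_sheet only affect that in-memory copy; the file on disk is not modified until you call wkb.save(...).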
