How to get information from an xlsx file in Python?

I have to create a mailing list of people belonging to a certain institution. The information is only available in an .xlsx file. The columns of the file are as follows: institution, DOB, Program, ..., EmailID. How do I do this without reading each entry myself and then typing the email into Google Contacts?

I know I am asking a lot, especially since I have no idea how to operate Google Sheets; I am not even sure whether there is a way to do this in Google Sheets. All I need are some directions.

You can read/write .xlsx files using openpyxl. See the openpyxl documentation for details.

You can read from the .xlsx file as follows:

from openpyxl import load_workbook

# Open the workbook and list the names of its worksheets
wb2 = load_workbook('email_contacts.xlsx')
print(wb2.sheetnames)
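
Going one step further than listing sheet names, here is a minimal sketch of pulling the email addresses for one institution. It assumes the first worksheet holds the data, that the first row is a header row, and that the headers are spelled exactly `institution` and `EmailID` as in the question. The function names `emails_for_institution` and `read_rows` are made up for illustration.

```python
def emails_for_institution(rows, institution):
    """Return the EmailID values of the rows whose institution matches.

    `rows` is any iterable of row tuples; the first one must be the header.
    Assumes the header contains 'institution' and 'EmailID' (from the question).
    """
    rows = iter(rows)
    header = ["" if h is None else str(h).strip() for h in next(rows)]
    inst_col = header.index("institution")
    email_col = header.index("EmailID")
    wanted = institution.strip().lower()

    emails = []
    for row in rows:
        inst, email = row[inst_col], row[email_col]
        if inst is None or email is None:
            continue  # skip rows with a blank institution or email cell
        if str(inst).strip().lower() == wanted:
            emails.append(str(email).strip())
    return emails


def read_rows(path):
    """Yield each row of the first worksheet as a tuple of cell values."""
    # Imported here so the filtering function above works without openpyxl.
    from openpyxl import load_workbook  # third-party: pip install openpyxl

    wb = load_workbook(path, read_only=True)
    try:
        yield from wb.worksheets[0].iter_rows(values_only=True)
    finally:
        wb.close()
```

You would call it as `emails_for_institution(read_rows('email_contacts.xlsx'), 'Some Institution')`, where the institution name is whatever appears in your sheet. The comparison ignores case and surrounding spaces, which may or may not be what you want.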

To add the details into Google Contacts you can use the Google Contacts API. Just read the official documentation on how to use the API.

Edit: The openpyxl approach mentioned in the other answer appears to be better.

The easiest method is to save the file in XLS format (97-2003 format) and then use the xlrd module to parse the file. If your files are not already in this format, you can have Excel open each one and save it in the correct format:

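import glob
import os

import win32com.client  # pywin32; requires Windows with Excel installed

xlsx_files = glob.glob('*.xlsx')

if xlsx_files:
    xlApp = win32com.client.Dispatch('Excel.Application')
    xlApp.DisplayAlerts = False  # suppress overwrite/compatibility prompts

    for file in xlsx_files:
        xlWb = xlApp.Workbooks.Open(os.path.join(os.getcwd(), file))
        # FileFormat=56 is xlExcel8, the Excel 97-2003 (.xls) format
        xls_path = os.path.join(os.getcwd(), os.path.splitext(file)[0] + '.xls')
        xlWb.SaveAs(xls_path, FileFormat=56)
        xlWb.Close()

    xlApp.Quit()

    # Delete the original .xlsx files once they have been converted
    for file in xlsx_files:
        os.unlink(file)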

To then access the sheet with xlrd:

import xlrd

wb = xlrd.open_workbook('email_contacts.xls')  # the converted .xls file
# First sheet:
sh = wb.sheet_by_index(0)
# Select a column; column indexes start at 0:
pl_id_column = sh.col_values(0)
# Iterate through the rows from row index 12 on (adjust to skip your header rows):
for rownum in range(12, sh.nrows):
    print(pl_id_column[rownum])

The easiest approach is to open the file in Excel and save the 'xlsx' file as a 'csv' file. The result is plain text, so it is easy to print out just the 'EmailID' column if that's the only one you want.
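
Once you have the CSV, something along these lines should do it with only the standard library. This is a rough sketch that assumes the CSV kept the header row and that the columns are named `institution` and `EmailID` as in the question; `emails_from_csv` is just an illustrative name.

```python
import csv


def emails_from_csv(csv_lines, institution):
    """Return the EmailID of every row whose institution matches.

    `csv_lines` is an open text file (or any iterable of CSV lines) whose
    first line is the header. Column names are assumed from the question.
    """
    wanted = institution.strip().lower()
    emails = []
    for row in csv.DictReader(csv_lines):
        inst = (row.get("institution") or "").strip().lower()
        email = (row.get("EmailID") or "").strip()
        if inst == wanted and email:  # skip other institutions and blank emails
            emails.append(email)
    return emails
```

To use it, open the exported file with `open('email_contacts.csv', newline='', encoding='utf-8-sig')` (the file name here is an example) and pass the file object in. The `utf-8-sig` encoding strips the byte-order mark that Excel writes when you pick its UTF-8 CSV option; if you saved with a different encoding, change that argument.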
