How to make XLRD read hyperlinks in XLSX cells?

This is not a duplicate, although the issue has been raised in this forum before: in 2011 (Getting a hyperlink URL from an Excel document), in 2013 (Extracting Hyperlinks From Excel (.xlsx) with Python) and in 2014 (Getting the URL from Excel Sheet Hyper links in Python with xlrd). There is still no answer.

After a deep dive into the xlrd module, it seems that Data_sheet.hyperlink_map.get((row, col)) fails because "xlrd cannot read the hyperlink without formatting_info, which is currently not supported for xlsx", per @alecxe at Extracting Hyperlinks From Excel (.xlsx) with Python.

Question: has anyone made progress with extracting URLs from hyperlinks stored in an Excel file? Say that, among all the customer data, there is a column of hyperlinks. I was toying with the idea of dumping the Excel sheet as an HTML page and then scraping it as usual (file on local drive), but that's not a production solution.

Supplementary: is there any other module that can extract the URL from a .cell(row, col).value() call on the hyperlink cell? Is there a solution in mechanize? Many thanks.

I had the same problem trying to get the hyperlinks from the cells of an xlsx file. The workaround I came up with is simply converting the Excel sheet to xls format, from which I could get the hyperlinks without any trouble; once I had finished editing, I converted it back to the original xlsx format.
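For the read step of that workaround, a minimal sketch might look like the following. It assumes the workbook has already been saved as .xls and that xlrd is installed; the function names `hyperlinks_from_xls` and `collect_links` are made up for illustration. `formatting_info=True` is passed because the quote in the question says xlrd needs it to expose hyperlinks.

```python
def collect_links(sheet):
    """Map zero-based (row, col) positions to the link target of each hyperlink."""
    # xlrd exposes hyperlinks as a sparse dict keyed by (rowx, colx);
    # url_or_path can be None for links that point inside the workbook.
    return {pos: link.url_or_path for pos, link in sheet.hyperlink_map.items()}


def hyperlinks_from_xls(path, sheet_index=0):
    """Open a legacy .xls file and return the hyperlinks of one sheet."""
    import xlrd  # third-party; imported here so collect_links works without it

    book = xlrd.open_workbook(path, formatting_info=True)
    return collect_links(book.sheet_by_index(sheet_index))
```

Calling `hyperlinks_from_xls("customers.xls")` (a hypothetical file name) would return a dictionary such as `{(row, col): url}` for every linked cell on the first sheet.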

I don't know whether this will work for your specific needs, or whether the change of format has consequences I am not aware of, but I think it's worth a try.
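If the round trip through xls is a concern, another option that may be worth trying is to read the links straight out of the xlsx file with the standard library, since an xlsx file is a zip archive of XML parts. The sketch below is not from the original answer. It assumes the usual package layout, in which a worksheet part lists `<hyperlink>` elements and a sibling `_rels` part maps their relationship ids to URLs. The function name `xlsx_hyperlinks` and the default part name `xl/worksheets/sheet1.xml` are assumptions; the real mapping from sheet names to parts lives in `xl/workbook.xml` and its relationships.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

NS_MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
NS_PKG = "http://schemas.openxmlformats.org/package/2006/relationships"


def xlsx_hyperlinks(path, sheet_part="xl/worksheets/sheet1.xml"):
    """Return {cell_ref: url} for the external hyperlinks of one worksheet part."""
    folder, name = posixpath.split(sheet_part)
    rels_part = posixpath.join(folder, "_rels", name + ".rels")
    with zipfile.ZipFile(path) as zf:
        if rels_part not in zf.namelist():
            return {}  # no relationships part, so no external link targets
        rels = ET.fromstring(zf.read(rels_part))
        sheet = ET.fromstring(zf.read(sheet_part))

    # Relationship id (e.g. "rId1") -> target URL or path
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels.iter("{%s}Relationship" % NS_PKG)
    }

    links = {}
    for node in sheet.iter("{%s}hyperlink" % NS_MAIN):
        rid = node.get("{%s}id" % NS_REL)
        if rid in targets:  # in-workbook links have no r:id and are skipped
            links[node.get("ref")] = targets[rid]
    return links
```

The keys are A1-style references as stored in the file, so a link applied to several cells shows up as a range such as `A1:A3`. Links created with the `HYPERLINK()` worksheet function are stored as formulas rather than as `<hyperlink>` elements, so this sketch does not pick them up.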
