Using Python to autofill a word.docx from an Excel file

I'm about halfway through the Automate the Boring Stuff with Python textbook and video tutorials, but I have a big project at work where I need to autopopulate 60 Chemical Purchase Review documents that we can't seem to find. Rather than fill them out individually, I'd like to use what I've learned so far. I've had to jump ahead a few chapters, but I can't figure out how to get past the last line of code. Basically, I have an Excel spreadsheet with four columns of information that need to be entered into certain areas of a Word document form template. I have "AAAA", "BBBB", ... in the Word doc as placeholders to be found and replaced.

import openpyxl,os,docx,re

os.chdir(r'C:\Users\MYUSERNAME\OneDrive\Documents\Programming\ChemInv')

wb = openpyxl.load_workbook('cheminv.xlsx')
sheet = wb.get_sheet_by_name('Sheet1')
doc = docx.Document('ChemPurchaseForm_.docx')
fillObj = ('AAAA','BBBB','CCCC','DDDD')

for a in range(1,61):
    for b in range(1,5):
        fill = sheet.cell(row=a,column=b).value
        for x in range(len(fillObj)):
            inputRegex = re.compile(fillObj[x])
            inputRegex.sub(fill,doc)

        doc.save('ChemPurcaseForm_' + fill + '.docx')   

I'm getting this error:

Traceback (most recent call last):
    File "C:/Users/MYUSERNAME/OneDrive/Documents/Programming/ChemInv/autofill.py", line 15, in <module>
    inputRegex.sub(fill,doc)
TypeError: expected string or bytes-like object

I'm assuming that either the "fill" variable or the "doc" variable is not a bytes or string value?

Thank you in advance for the help!

To debug this, you'll need to figure out which of the values is not a bytes or string value. A convenient way is to add print statements for each value. For instance, you might try:

print(fill)
print(doc)
print(type(fill))
print(type(doc))

I don't know exactly how the docx module works, but two hypotheses occur to me:

  1. doc is not the appropriate type for the sub function; you'll have to convert the object to something different, or access it a different way if that's the case.
  2. fill is None. That's easier to fix; it means you're not reading the Excel document properly.
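The first hypothesis can be checked without Word or Excel at all. Here is a minimal sketch, using a hypothetical stand-in class `FakeDocument` (not the real `docx.Document`) and a made-up replacement value, that passes a non-string object to `sub` the same way the question does:

```python
import re


class FakeDocument:
    """Hypothetical stand-in for any object that is not a str or bytes."""


def try_sub(subject):
    """Run the same kind of substitution as the question and report the outcome."""
    pattern = re.compile("AAAA")
    try:
        return pattern.sub("acetone", subject)
    except TypeError as exc:
        return "TypeError: " + str(exc)


print(try_sub("Chemical: AAAA"))  # a plain string is accepted
print(try_sub(FakeDocument()))    # a non-string subject raises TypeError
```

If the second call reports the same kind of TypeError as the traceback in the question, that points at the object being searched rather than at the pattern.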

Reading the docx documentation, I lean towards 1: doc doesn't look like a bytes or string object, or anything compatible with one, so the sub method won't be able to operate on it. If that's correct, read the python-docx docs for more details that might help you figure out what you need to do. I'd explore what properties exist on your document; it seems there are some for directly accessing the text.
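One way to act on that, as a rough sketch rather than a definitive fix: python-docx exposes the text of each paragraph through a `.text` attribute, so the replacement can be done on those strings instead of on the document object. The helper names `fill_placeholders` and `fill_template` are made up for illustration, and the sketch assumes each placeholder sits inside a single paragraph.

```python
def fill_placeholders(paragraphs, values):
    """Replace each placeholder key found in the text of every paragraph.

    `paragraphs` is any iterable of objects with a readable and writable
    `.text` attribute, which python-docx Paragraph objects provide.
    """
    for paragraph in paragraphs:
        text = paragraph.text
        for placeholder, value in values.items():
            # Excel cells may hold numbers or None, so coerce to str first.
            text = text.replace(placeholder, "" if value is None else str(value))
        if text != paragraph.text:
            # In python-docx, assigning .text collapses the paragraph into a
            # single run, so character-level formatting may be lost.
            paragraph.text = text


def fill_template(template_path, output_path, values):
    """Hypothetical helper: load the template, fill it, save a copy."""
    import docx  # python-docx; imported here so fill_placeholders works without it

    doc = docx.Document(template_path)
    fill_placeholders(doc.paragraphs, values)
    # Form templates often keep their fields in tables, so visit those too.
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                fill_placeholders(cell.paragraphs, values)
    doc.save(output_path)
```

In the question's loop, this would mean building one dictionary per spreadsheet row, for example `dict(zip(fillObj, row_values))`, and calling `fill_template` once per row. Reloading the template each time matters: once "AAAA" has been replaced in a document object, there is nothing left to find for the next row.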

Good luck!
