Cannot open read-only Microsoft Word files with win32com.client

I have thousands of docx files from which I need to extract certain elements with Python. I am using win32com.client in my Python script to accomplish this task.

import win32com.client
doc = win32com.client.GetObject(some_path_to_a_docx_file)

This works fine for 99% of the files; however, for a few files a Microsoft Word dialog box launches: "The author would like you to open this as read-only, unless you need to make changes. Open as read only?". The script stops at this point, waiting for user input.

Pressing Yes in the dialog box lets the script continue as needed; however, this is sub-par, since I need the process to be fully automated, without dialog boxes such as this one popping up. Is there any way to disable the MS Word prompt, either through win32com in Python or, alternatively, permanently through MS Word? (Note: import docx as an alternative to win32com is not an option here.)

The Office applications are end-user applications and are not optimized as development tools. This means that they can appear to hang when waiting for user input, as described in the question. There is no simple, clean way around this, which is why leveraging the Open XML file format is recommended for requirements where dismissing dialog boxes is a problem...
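To illustrate the Open XML route: a docx file is a ZIP package, so its text can be read without starting Word at all, which means no dialog box can appear. The following is only a minimal sketch using the Python standard library. The function name docx_paragraphs is made up for this example, and it assumes the main document part is stored at the conventional path word/document.xml.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by elements such as <w:p> and <w:t>.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the text of each <w:p> paragraph in the main document part.

    `source` is a path or a binary file-like object. Assumption: the main
    part lives at word/document.xml, which is the usual location.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's visible text is split across <w:t> nodes in its runs.
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs
```

Calling docx_paragraphs(some_path_to_a_docx_file) would return a list of strings, one per paragraph. Elements other than plain paragraph text (fields, headers, comments, and so on) live in other parts of the package and would need their own handling.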

If automation must be used, then there are two possibilities I'm aware of. The detailed information is not in Python, but it does document the basic approaches.

  1. Use a Timer function and SendKeys to dismiss a dialog box automatically if code can't progress. This is a bit of a lottery, as it's not possible to know which dialog box is being dismissed. Usually, the "Escape" key is sent. Once upon a time, there was a set of KB articles targeting various programming languages, but they're no longer available at the Microsoft site. I found an archive at C-Bit and am copying the relevant sample content, which demonstrates the principle for classic VB6:

The steps in this section demonstrate Automation of Microsoft Word to print a document. The Automation client calls the PrintOut method for the Word Document object. If the user's default printer is configured to print to the FILE port, then a call to PrintOut produces a dialog box prompting the user to enter a file name. To determine if the PrintOut method causes this dialog box to appear, the Visual Basic Automation client uses a Timer control to detect idle time after calling the PrintOut method.

Prior to calling PrintOut, the Timer is enabled and set to fire in five seconds. When PrintOut completes, the Timer is disabled. Therefore, if the PrintOut method completes within five seconds, the Timer event never occurs and no further action is taken. The document is printed and the code execution continues beyond the PrintOut method. However, if the Timer event occurs within the five-second interval, it is assumed that the PrintOut method has not completed and that the delay is caused by a dialog box waiting for user input. When the Timer event occurs, the Automation client gives focus to Word and uses SendKeys to dismiss the dialog box.

Note: For demonstration purposes, this sample uses the PrintOut method in such a way that it intentionally displays a dialog box when it prints to a printer set to a FILE port. Please note that the PrintOut method has two arguments, OutputFileName and PrintToFile, that you can provide to avoid this dialog box.

Additionally, when using this "timer" approach, you can customize the wait time to be greater or less than five seconds, as well as customize the keystrokes you send to the dialog box.
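The same timer principle can be sketched in Python with threading.Timer from the standard library. This is only an illustration of the pattern, not a tested Word solution: call_with_watchdog, blocking_call and on_timeout are hypothetical names, and on_timeout stands in for whatever action dismisses the dialog box (for example, sending a keystroke).

```python
import threading


def call_with_watchdog(blocking_call, on_timeout, timeout_seconds=5.0):
    """Run blocking_call(); if it has not returned after timeout_seconds,
    invoke on_timeout() once from a background thread.

    Mirrors the KB sample: the timer is armed before the call and
    cancelled as soon as the call completes.
    """
    timer = threading.Timer(timeout_seconds, on_timeout)
    timer.daemon = True  # do not keep the interpreter alive for the timer
    timer.start()
    try:
        return blocking_call()
    finally:
        # No effect if the timer has already fired.
        timer.cancel()
```

A possible use would be call_with_watchdog(lambda: win32com.client.GetObject(path), dismiss_dialog), where dismiss_dialog is a function you supply. Because the callback runs on a different thread, it should act on the dialog window from the outside and should not touch the COM object that the main thread is waiting on.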

This demonstration consists of two Visual Basic projects:

  1. An ActiveX EXE that provides a Timer class used to detect a delay. The reason to use an ActiveX EXE for the Timer class is to run the Timer code in a separate process and, therefore, a separate thread. This makes it possible for the Timer class to raise an event during a suspended automation call.

  2. A Standard EXE that uses automation to Word and calls the PrintOut method to print a document. It uses the ActiveX EXE to detect a delay when calling the PrintOut method.

Create the ActiveX EXE Project

  1. Start Visual Basic and create an ActiveX EXE project. Class1 is created by default.
  2. On the Project menu, click to select Properties, and then change the Project name to MyTimer.
  3. Copy and paste the following code into the Class1 module:

Option Explicit

Public Event Timer()
Private oForm1 As Form1

 Private Sub Class_Initialize()
     Set oForm1 = New Form1
     oForm1.Timer1.Enabled = False 
End Sub

 Private Sub Class_Terminate()
     Me.Enabled = False
     Unload oForm1
     Set oForm1 = Nothing 
End Sub

Public Property Get Enabled() As Boolean
     Enabled = oForm1.Timer1.Enabled 
End Property

Public Property Let Enabled(ByVal vNewValue As Boolean)
     oForm1.Timer1.Enabled = vNewValue
     If vNewValue = True Then
         Set oForm1.oClass1 = Me
     Else
         Set oForm1.oClass1 = Nothing
     End If 
End Property

 Public Property Get Interval() As Integer
     Interval = oForm1.Timer1.Interval 
End Property

 Public Property Let Interval(ByVal vNewValue As Integer)
     oForm1.Timer1.Interval = vNewValue
End Property

 Friend Sub TimerEvent()
     RaiseEvent Timer 
End Sub

  4. On the Project menu, choose Add Form to add a new Form to the project.
  5. Add a Timer control to the form.
  6. Copy and paste the following code into the code module for Form1:

Option Explicit

Public oClass1 As Class1

 Private Sub Timer1_Timer()
     oClass1.TimerEvent 
End Sub

  7. Save this project in a new subfolder named Server.
  8. On the File menu, choose Make MyTimer.Exe to build and register the component.

Create the Automation Client

  2. Work with the Windows API to identify and dismiss likely problem dialog boxes. I found some code for this on the MSDN forum, which I'm copying here. The attribution is to the user name yet:

Here is a sample that uses the Win32 API through P/Invoke in C#. I am able to dispose of known Word windows, such as the Word -> File -> Options dialog window, through FindWindow and SendMessage or PostMessage. Please go through the sample and see whether it will work for you. Since you know which dialog boxes you want to dispose of, please use Spy++ to find the window caption and window class and use them in this sample.

For your scenario, SendKeys may not be necessary.

Hope this helps.

using System;
 using System.Collections.Generic;
 using System.ComponentModel;
 using System.Data;
 using System.Drawing;
 using System.Linq;
 using System.Text;
 using System.Windows.Forms;

using System.Runtime.InteropServices;

namespace SendKeys
 {

    public partial class Form1 : Form
     {
         // For Windows Mobile, replace user32.dll with coredll.dll

         [DllImport("user32.dll", SetLastError = true)]
         static extern IntPtr FindWindow(string lpClassName, string lpWindowName);

        // Find window by Caption only. Note you must pass IntPtr.Zero as the first parameter.

        [DllImport("user32.dll", EntryPoint = "FindWindow", SetLastError = true)]
         static extern IntPtr FindWindowByCaption(IntPtr ZeroOnly, string lpWindowName);

        [DllImport("user32.dll")]
         public static extern bool SetForegroundWindow(IntPtr hWnd);


         [return: MarshalAs(UnmanagedType.Bool)]
         [DllImport("user32.dll", SetLastError = true)]
         static extern bool PostMessage(IntPtr hWnd, UInt32 Msg, IntPtr wParam, IntPtr lParam);

        [DllImport("user32.dll", CharSet = CharSet.Auto)]
         static extern IntPtr SendMessage(IntPtr hWnd, UInt32 Msg, IntPtr wParam, IntPtr lParam);

        static uint WM_CLOSE = 0x10;

        public Form1()
         {
             InitializeComponent();
         }

        private void button1_Click(object sender, EventArgs e)
         {

             // The caption and the class name are for the Word -> File -> Options window.
             // Both were obtained by using the Spy++ application and focusing on the window being researched.
             string caption = "Word Options";
             string className = "NUIDialog";
             IntPtr hWnd = IntPtr.Zero;

             // Win 32 API being called through PInvoke
             hWnd = FindWindow(className, caption);

            /*bool retVal = false;
             if ((int)hWnd != 0)
             {
                // Win 32 API being called through PInvoke 
               retVal = SetForegroundWindow(hWnd);
             }*/



            // Compare against IntPtr.Zero; casting a handle to int can overflow in a 64-bit process.
            if (hWnd != IntPtr.Zero)
             {
                 CloseWindow2(hWnd);
                 //CloseWindow(hWnd); // either sendMessage or PostMessage can be used.
             }
         }



        static bool CloseWindow(IntPtr hWnd)
         {
             // Win 32 API being called through PInvoke
             SendMessage(hWnd, WM_CLOSE, IntPtr.Zero, IntPtr.Zero);
             return true;
         }

        static bool CloseWindow2(IntPtr hWnd)
         {
             // Win 32 API being called through PInvoke
             PostMessage(hWnd, WM_CLOSE, IntPtr.Zero, IntPtr.Zero);
             return true;

         }

    }
 }
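
Since the question is about Python, the same FindWindow / PostMessage idea can be sketched with ctypes. Treat this as an untested-against-Word illustration: close_dialog and load_user32 are names invented here, the class name and caption must be looked up with Spy++ for the dialog you actually want to dismiss, and the optional user32 parameter exists only so the logic can be exercised without Windows.

```python
import ctypes
import sys

WM_CLOSE = 0x0010  # same message the C# sample posts


def load_user32():
    """Load user32.dll and declare the two functions used below."""
    if sys.platform != "win32":
        raise OSError("user32.dll is only available on Windows")
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    user32.FindWindowW.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p]
    user32.FindWindowW.restype = ctypes.c_void_p  # HWND; None when not found
    user32.PostMessageW.argtypes = [
        ctypes.c_void_p, ctypes.c_uint, ctypes.c_size_t, ctypes.c_ssize_t
    ]
    user32.PostMessageW.restype = ctypes.c_int  # BOOL
    return user32


def close_dialog(class_name, caption, user32=None):
    """Post WM_CLOSE to the top-level window matching class and caption.

    Returns True if a window was found and the message was posted.
    """
    user32 = user32 or load_user32()
    hwnd = user32.FindWindowW(class_name, caption)
    if not hwnd:
        return False
    return bool(user32.PostMessageW(hwnd, WM_CLOSE, 0, 0))
```

With the values from the C# sample, close_dialog("NUIDialog", "Word Options") would try to close the Word Options window. Note that WM_CLOSE behaves like cancelling the dialog, so whether that is the right answer for a given prompt has to be checked case by case.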
