
Cannot open read-only Microsoft Word files with win32com.client

I have thousands of docx files from which I need to extract certain elements using Python. I am using win32com.client in my Python script to accomplish this task.

import win32com.client
doc = win32com.client.GetObject(some_path_to_a_docx_file)

This works fine for 99% of the files; however, for a few files a Microsoft Word dialog box launches: "The author would like you to open this as read-only, unless you need to make changes. Open as read only?" The script stops at this point, waiting for user input.

Pressing Yes in the dialog box lets the script continue as needed; however, this is sub-par since I need the process to be fully automated, without dialog boxes like this one popping up. Is there any way to disable the MS Word prompt, either through win32com in Python or permanently through MS Word? (Note: using import docx as an alternative to win32com is not an option here.)

The Office applications are end-user applications and are not optimized as development tools. This means that they can appear to hang when waiting for user input, as described in the question. There is no simple, clean way around this, which is why leveraging the Open XML file format is recommended for requirements where dismissing dialog boxes is a problem...
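As a rough illustration of the Open XML route: a .docx file is a ZIP package whose main body is stored as XML in word/document.xml, so its text can be read without starting Word and therefore without any dialog box appearing. The sketch below uses only the Python standard library, so it does not rely on the docx package. The function name docx_paragraphs is made up for this example, and it only collects plain paragraph text from the main document part, not headers, footnotes, or other parts.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(path_or_file):
    """Return the text of each paragraph in the main document part.

    Hypothetical helper for illustration; accepts a file path or a
    file-like object, as zipfile.ZipFile does.
    """
    with zipfile.ZipFile(path_or_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    paragraphs = []
    # Each w:p element is a paragraph; its text lives in w:t elements.
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Because Word is never launched, a "read-only recommended" flag on the file has no effect on this kind of extraction. Whether it covers the elements you need depends on what has to be extracted.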

If automation must be used, then there are two possibilities I'm aware of. The detailed information is not Python-specific, but it does document the basic approaches.

  1. Use a Timer function and SendKeys to dismiss a dialog box automatically if code can't progress. This is a bit of a lottery, as it's not possible to know which dialog box is being dismissed. Usually, the "Escape" key is sent. Once upon a time, there was a set of KB articles targeting various programming languages, but they're no longer available at the Microsoft site. I found an archived copy at C-Bit and am copying the relevant sample content, which demonstrates the principle for classic VB6:

The steps in this section demonstrate Automation of Microsoft Word to print a document. The Automation client calls the PrintOut method for the Word Document object. If the user's default printer is configured to print to the FILE port, then a call to PrintOut produces a dialog box prompting the user to enter a file name. To determine if the PrintOut method causes this dialog box to appear, the Visual Basic Automation client uses a Timer control to detect idle time after calling the PrintOut method. Prior to calling PrintOut, the Timer is enabled and set to fire in five seconds. When PrintOut completes, the Timer is disabled. Therefore, if the PrintOut method completes within five seconds, the Timer event never occurs and no further action is taken. The document is printed and the code execution continues beyond the PrintOut method. However, if the Timer event occurs within the five second interval, it is assumed that the PrintOut method has not completed and that the delay is caused by a dialog box waiting for user input. When the Timer event occurs, the Automation client gives focus to Word and uses SendKeys to dismiss the dialog box.

Note: For demonstration purposes, this sample uses the PrintOut method in such a way that it displays a dialog box intentionally when it prints to a printer set to a FILE port. Please note that the PrintOut method has two arguments, OutputfileName and PrintToFile, that you can provide to avoid this dialog box.

Additionally, when using this "timer" approach, you can customize the wait time to be greater or less than five seconds, as well as customize the keystrokes you send to the dialog box.
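The timer principle described above can be sketched in Python with threading.Timer from the standard library. This is not a translation of the VB6 sample, only a minimal illustration of the same idea: start a timer, make the call that might block, and cancel the timer if the call returns in time. The names call_with_watchdog, blocking_call, and on_timeout are assumptions made for this example. What on_timeout actually does (for instance, giving focus to Word and sending Escape) is left to the caller.

```python
import threading


def call_with_watchdog(blocking_call, on_timeout, timeout_s=5.0):
    """Run blocking_call(); if it is still running after timeout_s seconds,
    invoke on_timeout() once on a helper thread.

    Hypothetical helper: on_timeout is expected to unblock the call,
    e.g. by dismissing a dialog box that is waiting for input.
    """
    timer = threading.Timer(timeout_s, on_timeout)
    timer.daemon = True  # do not keep the interpreter alive for the timer
    timer.start()
    try:
        return blocking_call()
    finally:
        # If the call finished in time, the timeout handler never runs.
        timer.cancel()
```

As with the VB6 version, the timeout handler cannot know which dialog box caused the delay, so the same "lottery" caveat applies.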

This demonstration consists of two Visual Basic projects:

  1. An ActiveX EXE that provides a Timer class used to detect a delay. The reason to use an ActiveX EXE for the Timer class is to run the Timer code in a separate process and, therefore, a separate thread. This makes it possible for the Timer class to raise an event during a suspended automation call.

  2. A Standard EXE that uses automation to Word and calls the PrintOut method to print a document. It uses the ActiveX EXE to detect a delay when calling the PrintOut method.

Create the ActiveX EXE Project

  1. Start Visual Basic and create an ActiveX EXE project. Class1 is created by default.
  2. On the Project menu, click to select Properties, and then change the Project name to MyTimer.
  3. Copy and paste the following code into the Class1 module:

Option Explicit

Public Event Timer()
Private oForm1 As Form1

 Private Sub Class_Initialize()
     Set oForm1 = New Form1
     oForm1.Timer1.Enabled = False 
End Sub

 Private Sub Class_Terminate()
     Me.Enabled = False
     Unload oForm1
     Set oForm1 = Nothing 
End Sub

Public Property Get Enabled() As Boolean
     Enabled = oForm1.Timer1.Enabled 
End Property

Public Property Let Enabled(ByVal vNewValue As Boolean)
     oForm1.Timer1.Enabled = vNewValue
     If vNewValue = True Then
         Set oForm1.oClass1 = Me
     Else
         Set oForm1.oClass1 = Nothing
     End If 
End Property

 Public Property Get Interval() As Integer
     Interval = oForm1.Timer1.Interval 
End Property

 Public Property Let Interval(ByVal vNewValue As Integer)
     oForm1.Timer1.Interval = vNewValue
End Property

 Friend Sub TimerEvent()
     RaiseEvent Timer 
End Sub                 
  4. On the Project menu, choose Add Form to add a new Form to the project.
  5. Add a Timer control to the form.
  6. Copy and paste the following code into the code module for Form1:

Option Explicit
 Public oClass1 As Class1

 Private Sub Timer1_Timer()
     oClass1.TimerEvent 
End Sub
  7. Save this project in a new subfolder named Server.
  8. On the File menu, choose Make MyTimer.Exe to build and register the component.

Create the Automation Client

  2. Work with the Windows API to identify and dismiss likely problem dialog boxes. I found some code for this on the MSDN forum, which I'm copying here. The attribution is to the user name yet :

Here is a sample that uses the Win32 API through P/Invoke in C#. I am able to dispose of known Word windows, like the Word->File->Options dialog window, through FindWindow and SendMessage or PostMessage. Please go through the sample and see whether it will work for you. Since you know which dialog boxes you want to dispose of, please use Spy++ to find the window caption and window class and use them in this sample.

For your scenario SendKeys may not be necessary.

Hope this helps.

using System;
 using System.Collections.Generic;
 using System.ComponentModel;
 using System.Data;
 using System.Drawing;
 using System.Linq;
 using System.Text;
 using System.Windows.Forms;

using System.Runtime.InteropServices;

namespace SendKeys
 {

    public partial class Form1 : Form
     {
         // For Windows Mobile, replace user32.dll with coredll.dll

         [DllImport("user32.dll", SetLastError = true)]
         static extern IntPtr FindWindow(string lpClassName, string lpWindowName);

        // Find window by Caption only. Note you must pass IntPtr.Zero as the first parameter.

        [DllImport("user32.dll", EntryPoint = "FindWindow", SetLastError = true)]
         static extern IntPtr FindWindowByCaption(IntPtr ZeroOnly, string lpWindowName);

        [DllImport("user32.dll")]
         public static extern bool SetForegroundWindow(IntPtr hWnd);


         [return: MarshalAs(UnmanagedType.Bool)]
         [DllImport("user32.dll", SetLastError = true)]
         static extern bool PostMessage(IntPtr hWnd, UInt32 Msg, IntPtr wParam, IntPtr lParam);

        [DllImport("user32.dll", CharSet = CharSet.Auto)]
         static extern IntPtr SendMessage(IntPtr hWnd, UInt32 Msg, IntPtr wParam, IntPtr lParam);

        static uint WM_CLOSE = 0x10;

        public Form1()
         {
             InitializeComponent();
         }

        private void button1_Click(object sender, EventArgs e)
         {

             // the caption and the className is for the Word -> File -> Options window
             // the caption and the className are got by using spy++ application and focussing on the window we are researching.
             string caption = "Word Options";
             string className = "NUIDialog";
             IntPtr hWnd= (IntPtr)(0);

             // Win 32 API being called through PInvoke
             hWnd = FindWindow(className, caption);

            /*bool retVal = false;
             if ((int)hWnd != 0)
             {
                // Win 32 API being called through PInvoke 
               retVal = SetForegroundWindow(hWnd);
             }*/



            if (hWnd != IntPtr.Zero)
             {
                 CloseWindow2(hWnd);
                 //CloseWindow(hWnd); // either sendMessage or PostMessage can be used.
             }
         }



        static bool CloseWindow(IntPtr hWnd)
         {
             // Win 32 API being called through PInvoke
             SendMessage(hWnd, WM_CLOSE, IntPtr.Zero, IntPtr.Zero);
             return true;
         }

        static bool CloseWindow2(IntPtr hWnd)
         {
             // Win 32 API being called through PInvoke
             PostMessage(hWnd, WM_CLOSE, IntPtr.Zero, IntPtr.Zero);
             return true;

         }

    }
 }
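
Since the question is about Python, the same FindWindow / PostMessage idea can be sketched with ctypes instead of P/Invoke. This is a minimal sketch, not a tested solution for the read-only prompt: the function name close_window and the user32 parameter are made up for this example (the parameter exists only so the logic can be exercised without Windows), and the class name and caption shown are the "Word Options" values from the C# sample. The values for any other dialog box would have to be looked up with Spy++.

```python
import ctypes
import sys

WM_CLOSE = 0x0010  # same message the C# sample posts


def close_window(class_name, caption, user32=None):
    """Post WM_CLOSE to the first top-level window matching class and caption.

    Returns True if a window was found and the message was posted.
    Hypothetical helper; pass user32=None to use the real user32.dll.
    """
    if user32 is None:
        if sys.platform != "win32":
            raise OSError("FindWindowW/PostMessageW require Windows")
        from ctypes import wintypes

        user32 = ctypes.WinDLL("user32", use_last_error=True)
        user32.FindWindowW.argtypes = [wintypes.LPCWSTR, wintypes.LPCWSTR]
        user32.FindWindowW.restype = wintypes.HWND
        user32.PostMessageW.argtypes = [
            wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM,
        ]
        user32.PostMessageW.restype = wintypes.BOOL

    hwnd = user32.FindWindowW(class_name, caption)
    if not hwnd:
        return False  # no matching window is currently open
    return bool(user32.PostMessageW(hwnd, WM_CLOSE, 0, 0))


# Example call on Windows, using the values from the C# sample:
# close_window("NUIDialog", "Word Options")
```

Because the automation call itself blocks while the dialog box is open, a function like this would have to run from a separate thread or process, for example from a timer as in the first approach.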


 