Using Python to interact with other programs

Question

I have the idea of writing a Python program that finds the lyrics of a song whose name I provide. I think the whole process boils down to the few steps below. This is what I want the program to do when I run it:

prompt me to enter the name of a song
copy that name
open a web browser (Google Chrome, for example)
paste that name into the address bar and find information about the song
open a page that contains the lyrics
copy the lyrics
run a text editor (Microsoft Word, for instance)
paste the lyrics
save the new text file under the name of the song

I am not asking for code, of course. I just want to know the concepts or ideas behind using Python to interact with other programs.

To be more specific, I want to know, for example, how we point out where the address bar is in Google Chrome and tell Python to paste the name there, or how we tell Python to copy the lyrics, paste them into a Microsoft Word document, and then save it.

I've been reading (and am still reading) several books on Python: A Byte of Python, Learn Python the Hard Way, Python for Dummies, and Beginning Game Development with Python and Pygame. However, it seems that I have only (or almost only) learned to create programs that work on their own; I can't tell my program to do the things I want with other programs that are already installed on my computer.

I know that my question sounds rather silly, but I really want to know how it works: how we tell Python to recognize that this part of the Google Chrome browser is the address bar and that it should paste the name of the song into it. The whole idea of making Python interact with another program is really vague to me, and I very much want to grasp it.

Thank you to everyone who spends their time reading my long question.

ttriet204

Answer 1

If what you're really looking for is a good excuse to teach yourself how to interact with other apps, this may not be the best one. Web browsers are messy, the timing is going to be unpredictable, etc. So, you've taken on a very hard task, and one that would be very easy if you did it the usual way (talk to the server directly, create the text file directly, etc., all without touching any other programs).

But if you do want to interact with other apps, there are a variety of different approaches, and which one is appropriate depends on the kinds of apps you need to deal with.

Some apps are designed to be automated from the outside. On Windows, this nearly always means they have a COM interface, usually with an IDispatch interface, for which you can use pywin32's COM wrappers; on Mac, it means an AppleEvent interface, for which you use ScriptingBridge or appscript; on other platforms there is no universal standard. IE (but probably not Chrome) and Word both have such interfaces.
Some apps have a non-GUI interface, whether that's a command line you can drive with popen, or a DLL/SO/DYLIB you can load through ctypes. Or, ideally, someone else has already written Python bindings for you.
Some apps have nothing but the GUI, and there's no way around doing GUI automation. You can do this at a low level, by crafting WM_ messages to send via pywin32 on Windows, using the accessibility APIs on Mac, etc.; at a somewhat higher level with libraries like pywinauto; or possibly at the very high level of selenium or similar tools built to automate specific apps.
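
As a minimal sketch of the second approach (driving a program through its command-line interface), the example below runs a child process and captures its output with the standard subprocess module, the modern replacement for os.popen. It assumes Python 3.7 or later. The child program is just another Python interpreter, chosen so the example runs anywhere, and the helper name run_tool is hypothetical.

```python
import subprocess
import sys

def run_tool(args, text_in=""):
    """Run a command-line program, feed it text_in on stdin, return its stdout."""
    completed = subprocess.run(
        args,
        input=text_in,
        capture_output=True,
        text=True,    # work with str instead of bytes
        check=True,   # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout

# Stand-in for "some other program": a Python one-liner that upper-cases stdin.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

if __name__ == "__main__":
    print(run_tool(child, "we are the champions"))
```

The same pattern applies to any real command-line tool: replace the child list with the tool's executable and arguments.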

So, you could do this with anything from selenium for Chrome and COM automation for Word, to crafting all the WM_ messages yourself. If this is meant to be a learning exercise, the question is which of those things you want to learn today.

Let's start with COM automation. Using pywin32, you directly access the application's own scripting interfaces, without having to take control of the GUI from the user, figure out how to navigate menus and dialog boxes, etc. This is the modern version of writing "Word macros": the macros can be external scripts instead of living inside Word, and they don't have to be written in VB, but they look pretty similar. The last part of your script would look something like this:

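import win32com.client  # part of pywin32

# my_string holds the lyrics fetched earlier in the script
word = win32com.client.Dispatch('Word.Application')
word.Visible = True
doc = word.Documents.Add()
# Selection belongs to the Application object, not to the document
word.Selection.TypeText(my_string)
doc.SaveAs(r'C:\TestFiles\TestDoc.doc')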

If you look at Microsoft Word Scripts, you can see a bunch of examples. However, you may notice they're written in VBScript. And if you look around for tutorials, they're all written for VBScript (or older VB). And the documentation for most apps is written for VBScript (or VB, .NET, or even low-level COM). And all of the tutorials I know of for using COM automation from Python, like Quick Start to Client Side COM and Python, are written for people who already know about COM automation and just want to know how to do it from Python. The fact that Microsoft keeps changing the name of everything makes it even harder to search for: how would you guess that googling for OLE automation, ActiveX scripting, Windows Script Host, etc. would have anything to do with learning about COM automation? So, I'm not sure what to recommend for getting started. I can promise that it's all as simple as it looks from the example above once you do learn all the nonsense, but I don't know how to get past that initial hurdle.

Anyway, not every application is automatable. And sometimes, even if it is, describing the GUI actions (what a user would click on the screen) is simpler than thinking in terms of the app's object model. "Select the third paragraph" is hard to describe in GUI terms, but "select the whole document" is easy: just hit Ctrl-A, or go to the Edit menu and choose Select All. GUI automation is much harder than COM automation, because you either have to send the app the same messages that Windows itself sends to represent your user actions (e.g., see "Menu Notifications") or, worse, craft mouse messages like "go (32, 4) pixels from the top-left corner, click, move the mouse down 16 pixels, click again" to say "open the File menu, then click New".

Fortunately, there are tools like pywinauto that wrap up both kinds of GUI automation to make it a lot simpler. And there are tools like swapy that can help you figure out what commands you want to send. If you're not wedded to Python, there are also tools like AutoIt and Actions that are even easier than using swapy and pywinauto, at least when you're getting started. Going this way, the last part of your script might look roughly like this (a sketch of the idea rather than exact pywinauto calls):

word.Activate()
word.MenuSelect('File->New')
word.KeyStrokes(my_string)
word.MenuSelect('File->Save As')
word.Dialogs[-1].FindTextField('Filename').Select()
word.KeyStrokes(r'C:\TestFiles\TestDoc.doc')
word.Dialogs[-1].FindButton('OK').Click()

Finally, even with all of these tools, web browsers are very hard to automate, because each web page has its own menus, buttons, etc. that aren't Windows controls, but HTML. Unless you want to go all the way down to the level of "move the mouse 12 pixels", it's very hard to deal with these. That's where selenium comes in: it scripts web GUIs the same way that pywinauto scripts Windows GUIs.
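
A rough sketch of what the browser part could look like with selenium, assuming Selenium 4, a local Chrome installation, and a hypothetical lyrics page whose text sits in an element with the CSS class lyricbox. The class name is borrowed from another answer on this page; the site URL and both helper names are made up for illustration. The selenium import is deferred into the function so that the URL helper works even where selenium is not installed.

```python
from urllib.parse import quote

def lyrics_page_url(artist, song, base="https://lyrics.example.com"):
    """Build a URL for a hypothetical lyrics site (base is a placeholder)."""
    return "%s/%s/%s" % (base, quote(artist), quote(song))

def read_lyrics_with_browser(url, css_class="lyricbox"):
    """Open url in Chrome and return the visible text of the first match."""
    # Imported here so the rest of the module works without selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        element = driver.find_element(By.CLASS_NAME, css_class)
        return element.text
    finally:
        driver.quit()  # always close the browser window

if __name__ == "__main__":
    # Only builds the URL; call read_lyrics_with_browser() to drive Chrome.
    print(lyrics_page_url("Queen", "We are the Champions"))
```

Note that the script never has to locate the address bar: it hands the URL to the browser and asks for a page element by name.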

Answer 2

The following script uses Automa to do exactly what you want (tested on Word 2010):

def find_lyrics():
    print 'Please minimize all other open windows, then enter the song:'
    song = raw_input()
    start("Google Chrome")
    # Disable Google's autocompletion and set the language to English:
    google_address = 'google.com/webhp?complete=0&hl=en'
    write(google_address, into="Address")
    press(ENTER)
    write(song + ' lyrics filetype:txt')
    click("I'm Feeling Lucky")
    press(CTRL + 'a', CTRL + 'c')
    press(ALT + F4)
    start("Microsoft Word")
    press(CTRL + 'v')
    press(CTRL + 's')
    click("Desktop")
    write(song + ' lyrics', into="File name")
    click("Save")
    press(ALT + F4)
    print("\nThe lyrics have been saved in file '%s lyrics' "
          "on your desktop." % song)

To try it out for yourself, download Automa.zip from its Download page and unzip it into, say, C:\Program Files. You'll get a folder called Automa 1.1.2. Run Automa.exe in that folder. Copy the code above and paste it into Automa by right-clicking in the console window. Press Enter twice to get rid of the last ... in the window and arrive back at the >>> prompt. Close all other open windows and type

>>> find_lyrics()

This performs the required steps.

Automa is also a Python library. To use it as such, add the line

from automa.api import *

to the top of your scripts, and add the file library.zip from Automa's installation directory to your PYTHONPATH environment variable.

If you have any other questions, just let me know :-)

Answer 3

Here's an implementation in Python of @Matteo Italia's comment:

You are approaching the problem from a "user perspective" when you should approach it from a "programmer perspective"; you don't need to open a browser, copy the text, open Word or whatever, you need to perform the appropriate HTTP requests, parse the relevant HTML, extract the text and write it to a file from inside your Python script. All the tools to do this are available in Python (in particular you'll need urllib2 and BeautifulSoup).

#!/usr/bin/env python
import codecs
import json
import sys
import urllib
import urllib2

import bs4  # pip install beautifulsoup4

def extract_lyrics(page):
    """Extract lyrics text from given lyrics.wikia.com html page."""
    soup = bs4.BeautifulSoup(page, 'html.parser')  # name the parser explicitly
    result = []
    for tag in soup.find('div', 'lyricbox'):
        if isinstance(tag, bs4.NavigableString):
            if not isinstance(tag, bs4.element.Comment):
                result.append(tag)
        elif tag.name == 'br':
            result.append('\n')
    return "".join(result)

# get artist, song to search
artist = raw_input("Enter artist:")
song = raw_input("Enter song:")

# make request
query = urllib.urlencode(dict(artist=artist, song=song, fmt="realjson"))
response = urllib2.urlopen("http://lyrics.wikia.com/api.php?" + query)
data = json.load(response)

if data['lyrics'] != 'Not found':
    # print short lyrics
    print(data['lyrics'])
    # get full lyrics
    lyrics = extract_lyrics(urllib2.urlopen(data['url']))
    # save to file
    filename = "[%s] [%s] lyrics.txt" % (data['artist'], data['song'])
    with codecs.open(filename, 'w', encoding='utf-8') as output_file:
        output_file.write(lyrics)
    print("written '%s'" % filename)
else:
    sys.exit('not found')

Example

$ printf "Queen\nWe are the Champions" | python get-lyrics.py

Output

I've paid my dues
Time after time
I've done my sentence
But committed no crime

And bad mistakes
I've made a few
I've had my share of sand kicked [...]
written '[Queen] [We are the Champions] lyrics.txt'

Answer 4

If you really want to open a browser, etc., look at selenium. But that's overkill for your purposes. Selenium is used to simulate button clicks, etc., for testing the appearance of websites in various browsers. Mechanize is less of an overkill for this.

What you really want to do is understand how a browser (or any other program) works under the hood, i.e., when you click the mouse, type on the keyboard, or hit Save, what does the program do behind the scenes? It is this behind-the-scenes work that you want your Python code to do.

So, use urllib, urllib2, or requests (or heck, even scrapy) to request a web page (learn how to put together the URL for a Google search or the PHP GET request of a lyrics website). Google also has a search API that you can take advantage of to perform a Google search.
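
A minimal sketch of putting together a GET request by hand, using Python 3's urllib.parse and urllib.request (the successors of Python 2's urllib and urllib2). The endpoint and the parameter names artist and song are assumptions for illustration, not a real service; substitute whatever the site you target documents.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def build_query_url(base_url, **params):
    """Append URL-encoded query parameters to base_url."""
    return base_url + "?" + urlencode(params)

def fetch(url, timeout=10):
    """Download url and return the response body decoded as UTF-8."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")

if __name__ == "__main__":
    # Hypothetical endpoint; this only prints the URL and makes no request.
    print(build_query_url("https://lyrics.example.com/api.php",
                          artist="Queen", song="We are the Champions"))
```

Passing the resulting URL to fetch() would perform the actual request; this is the step a browser performs when you press Enter in the address bar.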

Once you have the results from your page request, parse them with xml, beautifulsoup, lxml, etc., and find the section of the result that has the information you're after.
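
To illustrate the parsing step without third-party packages, here is a sketch that swaps in the standard library's html.parser in place of beautifulsoup or lxml. It collects the text of the first div carrying a given CSS class; the class name lyricbox and the sample HTML are assumptions for illustration, and the names DivTextExtractor and extract_text are hypothetical.

```python
from html.parser import HTMLParser

class DivTextExtractor(HTMLParser):
    """Collect the text inside the first <div> carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # > 0 while we are inside the target div
        self.done = False   # True once the target div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            if tag == "div":
                self.depth += 1          # nested div inside the target
            elif tag == "br":
                self.parts.append("\n")  # line breaks separate lyric lines
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.css_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

def extract_text(html, css_class):
    """Return the text of the first div with css_class, or '' if none."""
    parser = DivTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()

if __name__ == "__main__":
    sample = ('<div class="ad">x</div>'
              '<div class="lyricbox">First line<br/>Second line</div>')
    print(extract_text(sample, "lyricbox"))
```

For real pages a dedicated library is usually more robust against malformed markup; this sketch only shows the idea of locating one section of the response.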

Now that you have your lyrics, the simplest thing to do is open a text file, dump the lyrics in there, and write it to disk. But if you really want to do it with MS Word, then open a doc file in Notepad or Notepad++ and look at its structure. Now, use Python to build a document with a similar structure, in which the content is the downloaded lyrics.
If this method fails, you could look into pywinauto or similar to automate pasting the text into an MS Word doc and clicking Save.
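
The plain text file route can be sketched as follows, assuming Python 3. The helper names are hypothetical, and the character replacement is only an assumption about what a safe file name needs on Windows.

```python
import os
import re
import tempfile

def safe_filename(title):
    """Replace characters that are not allowed in Windows file names."""
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip()

def save_lyrics(title, lyrics, folder="."):
    """Write lyrics to '<title> lyrics.txt' in folder and return the path."""
    path = os.path.join(folder, safe_filename(title) + " lyrics.txt")
    with open(path, "w", encoding="utf-8") as output_file:
        output_file.write(lyrics)
    return path

if __name__ == "__main__":
    # Write into a temporary folder so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(save_lyrics("We are the Champions", "I've paid my dues\n", tmp))
```

No text editor is involved: the program creates the file directly, which is what the editor itself does when you click Save.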

Citation: Matteo Italia, gddc from the comments on the OP

Answer 5

You should look at a package called selenium for interacting with web browsers.

5 answers

Answer 1: score 42, accepted, 2013-01-11 23:32:24
Answer 2: score 15, 2013-01-13 14:35:54
Answer 3: score 12, 2013-01-15 12:42:48
Answer 4: score 5, 2013-01-11 23:28:58
Answer 5: score 1, 2013-01-11 23:16:20
