How to capture curl output from a Python script

Question

I want to use curl to look up information about a web page, but so far, in Python, I have the following:

os.system("curl --head www.google.com")

When I run it, it prints:

HTTP/1.1 200 OK
Date: Sun, 15 Apr 2012 00:50:13 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=3e39ad65c9fa03f3:FF=0:TM=1334451013:LM=1334451013:S=IyFnmKZh0Ck4xfJ4; expires=Tue, 15-Apr-2014 00:50:13 GMT; path=/; domain=.google.com
Set-Cookie: NID=58=Giz8e5-6p4cDNmx9j9QLwCbqhRksc907LDDO6WYeeV-hRbugTLTLvyjswf6Vk1xd6FPAGi8VOPaJVXm14TBm-0Seu1_331zS6gPHfFp4u4rRkXtSR9Un0hg-smEqByZO; expires=Mon, 15-Oct-2012 00:50:13 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked

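What I want to do is match the 200 in it with a regular expression (I don't need help with that part), but I can't find a way to get all of the text above into a string. How do I do this? I tried info = os.system("curl --head www.google.com"), but info was just 0.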

Answer 1

Try this, using subprocess.Popen():

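import subprocess
# Run curl with its stdout connected to a pipe instead of the terminal.
proc = subprocess.Popen(["curl", "--head", "www.google.com"], stdout=subprocess.PIPE)
# communicate() waits for curl to exit; err is None because stderr is not piped.
(out, err) = proc.communicate()
print(out.decode())  # out is bytes in Python 3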

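As stated in the documentation:

The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace several other, older modules and functions, such as: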

os.system
os.spawn*
os.popen*
popen2.*
commands.*
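On Python 3.7 or later, subprocess.run() can do the same capture in a single call. The following is a minimal sketch rather than the answerer's code: run_and_capture and status_code are hypothetical helper names, and the demo command is a stand-in that prints one header line so the sketch runs without network access. Replace it with ["curl", "--head", "www.google.com"] for the real request.

```python
import re
import subprocess
import sys

def run_and_capture(cmd):
    """Run cmd and return everything it wrote to stdout as a str."""
    # capture_output and text both require Python 3.7+.
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout

def status_code(header_text):
    """Extract the numeric status from the first line of an HTTP response header."""
    match = re.match(r"HTTP/\S+\s+(\d{3})", header_text)
    return int(match.group(1)) if match else None

# Stand-in for ["curl", "--head", "www.google.com"] so the sketch runs offline.
demo_cmd = [sys.executable, "-c", "print('HTTP/1.1 200 OK')"]
headers = run_and_capture(demo_cmd)
print(status_code(headers))
```

With real curl, adding --silent keeps the progress meter out of stderr. check=True raises CalledProcessError if the command exits with a non-zero status.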

Answer 2

For some reason... I need to use curl (no pycurl, httplib2...), so maybe this can help somebody:

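import os
# os.popen() returns a file-like object connected to curl's stdout; read() returns its contents as a string.
result = os.popen("curl http://google.es").read()
print(result)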

Answer 3

import os
cmd = 'curl https://randomuser.me/api/'
os.system(cmd)

Result

{"results":[{"gender":"male","name":{"title":"mr","first":"çetin","last":"nebioğlu"},"location":{"street":"5919 abanoz sk","city":"adana","state":"kayseri","postcode":53537},"email":"çetin.nebioğlu@example.com","login":{"username":"heavyleopard188","password":"forgot","salt":"91TJOXWX","md5":"2b1124732ed2716af7d87ff3b140d178","sha1":"cb13fddef0e2ce14fa08a1731b66f5a603e32abe","sha256":"cbc252db886cc20e13f1fe000af1762be9f05e4f6372c289f993b89f1013a68c"},"dob":"1977-05-10 18:26:56","registered":"2009-09-08 15:57:32","phone":"(518)-816-4122","cell":"(605)-165-1900","id":{"name":"","value":null},"picture":{"large":"https://randomuser.me/api/portraits/men/38.jpg","medium":"https://randomuser.me/api/portraits/med/men/38.jpg","thumbnail":"https://randomuser.me/api/portraits/thumb/men/38.jpg"},"nat":"TR"}],"info":{"seed":"0b38b702ef718e83","results":1,"page":1,"version":"1.1"}}
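Note that os.system() only prints this output to the terminal and returns the exit status, as the question observed. To work with the JSON inside the script, the output has to be captured and parsed. A minimal sketch, assuming the command prints a single JSON document to stdout: run_json is a hypothetical helper name, and the demo command is a stand-in that prints a small JSON object (a fragment shaped like the "info" field above) so the sketch runs offline.

```python
import json
import subprocess
import sys

def run_json(cmd):
    """Run cmd, capture its stdout, and parse it as JSON."""
    raw = subprocess.check_output(cmd)      # bytes written to stdout
    return json.loads(raw.decode("utf-8"))

# Stand-in for ["curl", "https://randomuser.me/api/"] so the sketch runs offline.
demo_cmd = [
    sys.executable,
    "-c",
    "print('{\"info\": {\"results\": 1, \"page\": 1}}')",
]
data = run_json(demo_cmd)
print(data["info"]["results"])
```

The parsed result is an ordinary dict, so fields can be read with normal indexing instead of string matching.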

Answer 4

Try this:

import http.client  # this module was named httplib in Python 2
conn = http.client.HTTPConnection("www.python.org")
conn.request("GET", "/index.html")
r1 = conn.getresponse()
print(r1.status, r1.reason)  # status code and reason phrase

Answer 5

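You could use an HTTP library or HTTP client library in Python instead of calling the curl command. In fact, there is a curl library that you can install (as long as you have a compiler on your OS).

Other choices include httplib2 (recommended), which is a fairly complete HTTP protocol client that also supports caching, or just plain httplib, or a library named Requests.

If you really want to run the curl command and capture its output, you can do so with Popen in the built-in subprocess module, documented here: http://docs.python.org/library/subprocess.html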
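As one illustration of the library route, the standard library's http.client (the Python 3 name for httplib) can send the same HEAD request that curl --head makes. This is a sketch under assumptions, not a recommendation from the answer: head_request is a hypothetical helper name, and it covers plain HTTP only (HTTPSConnection would be needed for https).

```python
import http.client

def head_request(host, port=80, path="/", timeout=10):
    """Send a HEAD request and return (status, reason, headers as a dict)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        # getheaders() returns a list of (name, value) tuples.
        return resp.status, resp.reason, dict(resp.getheaders())
    finally:
        conn.close()
```

Calling head_request("www.google.com") would return the status code as an integer, so no regular expression is needed to find it.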

Answer 6

Well, there is a way that is easier to read but messier. Here it is:

import os
outfile = ''  # put your file path there
os.system("curl --head www.google.com >> {x}".format(x=outfile))  # Appends curl's output to the log file (and creates it if it doesn't exist).
readOut = open(outfile, "r")  # Opens the file in read mode.
for line in readOut:
    print(line, end='')  # Prints each line in the file
readOut.close()  # Closes the file
os.remove(outfile)  # This is optional, as it just deletes the log file after use.

This should work fine for your needs. :)
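The same file-based idea can be sketched without a shell redirect or manual cleanup by handing a temporary file to subprocess as the child's stdout. This is an alternative under assumptions, not the answerer's code: capture_via_tempfile is a hypothetical helper name, and the demo command is a stand-in for ["curl", "--head", "www.google.com"] so the sketch runs offline.

```python
import subprocess
import sys
import tempfile

def capture_via_tempfile(cmd):
    """Run cmd with stdout redirected to a temporary file, then read it back."""
    with tempfile.TemporaryFile(mode="w+") as tmp:
        subprocess.run(cmd, stdout=tmp, check=True)
        tmp.seek(0)  # rewind to read what the child process wrote
        return tmp.read()

# Stand-in for ["curl", "--head", "www.google.com"] so the sketch runs offline.
demo_cmd = [sys.executable, "-c", "print('HTTP/1.1 200 OK')"]
text = capture_via_tempfile(demo_cmd)
print(text, end="")
```

The temporary file is deleted automatically when the with block exits, so no separate delete step is required.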

6 answers

Answer 1: 22, 2012-04-15 01:02:50

Answer 2: 17, 2015-08-20 13:16:33

Answer 3: 2, 2016-08-31 18:44:47

Answer 4: 1, accepted, 2012-04-15 01:31:56

Answer 5: 0, 2012-04-15 01:08:39

Answer 6: 0, 2012-04-15 02:02:41
