Scraping Tweets from a Specific URL in Python

I've been testing out the following Python code:

import oauth2 as oauth
import urllib2 as urllib

api_key = "a"
api_secret = "b"
access_token_key = "c"
access_token_secret = "d"

_debug = 0

oauth_token    = oauth.Token(key=access_token_key, secret=access_token_secret)
oauth_consumer = oauth.Consumer(key=api_key, secret=api_secret)

signature_method_hmac_sha1 = oauth.SignatureMethod_HMAC_SHA1()

http_method = "GET"


http_handler  = urllib.HTTPHandler(debuglevel=_debug)
https_handler = urllib.HTTPSHandler(debuglevel=_debug)

def twitterreq(url, method, parameters):
  '''
  Construct, sign, and open a twitter request
  using the hard-coded credentials above.
  '''
  # Use the caller's method argument rather than the module-level http_method.
  req = oauth.Request.from_consumer_and_token(oauth_consumer,
                                             token=oauth_token,
                                             http_method=method,
                                             http_url=url,
                                             parameters=parameters)

  req.sign_request(signature_method_hmac_sha1, oauth_consumer, oauth_token)

  headers = req.to_header()

  if method == "POST":
    # POST sends the signed parameters in the request body.
    encoded_post_data = req.to_postdata()
  else:
    # GET carries the signed parameters in the query string.
    encoded_post_data = None
    url = req.to_url()

  opener = urllib.OpenerDirector()
  opener.add_handler(http_handler)
  opener.add_handler(https_handler)

  response = opener.open(url, encoded_post_data)

  return response

def fetchsamples():
  url = "myURL"
  parameters = []
  response = twitterreq(url, "GET", parameters)
  for line in response:
    print line.strip()

if __name__ == '__main__':
  fetchsamples()

As I understand it, the goal of this code is to fetch tweets from a specific URL, 'myURL'. However, when I run it I get a large amount of front-end HTML and JavaScript, but no tweets. Am I mistaken about the purpose of this code? Is there a better way to do what I'm trying to do?

I faced the same issue while doing the Twitter assignment in a Coursera course. In the end I had to consult Twitter's documentation on single-user OAuth: https://dev.twitter.com/oauth/overview/single-user

response = oauth_req(URL, access_token_key, access_token_secret )
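The oauth_req helper is not defined in this answer. The following is only a sketch of what such a helper might look like, assuming the third-party python-oauth2 package and its oauth2.Client class, which (as far as I know) signs requests with HMAC-SHA1 by default. CONSUMER_KEY and CONSUMER_SECRET are placeholders, and the optional client argument is my own addition so that a prebuilt or stub client can be passed in.

```python
CONSUMER_KEY = "a"      # placeholder, same role as api_key in the question
CONSUMER_SECRET = "b"   # placeholder, same role as api_secret in the question


def oauth_req(url, key, secret, http_method="GET", post_body=b"",
              http_headers=None, client=None):
    """Send a signed request to url and return the raw response body."""
    if client is None:
        # Imported lazily so this module loads even without python-oauth2.
        import oauth2
        consumer = oauth2.Consumer(key=CONSUMER_KEY, secret=CONSUMER_SECRET)
        token = oauth2.Token(key=key, secret=secret)
        # Assumption: Client signs each request itself, so no explicit
        # SignatureMethod_HMAC_SHA1 object is created here.
        client = oauth2.Client(consumer, token)
    # request() is expected to return a (response headers, body) pair.
    resp, content = client.request(url, method=http_method,
                                   body=post_body, headers=http_headers)
    return content
```

With a helper like this, the call above returns the response body directly, so there is no opener or handler setup to maintain.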

I suspect that SignatureMethod_HMAC_SHA1 is not necessary.
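To check whether a URL is returning tweet data rather than a web page, one option is to try decoding each response line as JSON. This is a minimal sketch, assuming the endpoint emits one JSON object per line; the function name decode_tweet_lines is hypothetical and not part of any library.

```python
import json


def decode_tweet_lines(lines):
    """Split response lines into parsed JSON objects and non-JSON leftovers."""
    tweets, rejected = [], []
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8", "replace")
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        try:
            obj = json.loads(line)
        except ValueError:
            # HTML or JavaScript from a web page ends up here.
            rejected.append(line)
            continue
        if isinstance(obj, dict):
            tweets.append(obj)
        else:
            rejected.append(line)
    return tweets, rejected
```

If nearly every line lands in rejected, the URL is most likely a regular web page and not an API endpoint.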
