
Extract tweets by specifying latitude and longitude

I'm extracting Twitter data by specifying a latitude, a longitude, and a search radius. However, I want to extract tweets from a rectangular area defined by two latitude/longitude coordinate pairs.

Code:

from twitter import OAuth, Twitter

import sys
import csv

latitude = 51.474144    # geographical centre of search
longitude = -0.035401    # geographical centre of search
max_range = 1             # search range in kilometres
num_results = 1000        # minimum results to obtain
outfile = "output.csv"

sys.path.append(".")
import config

consumer_key = '*************************'
consumer_secret = '*******************************'
access_key = '***************************************'
access_secret = '*****************************'


twitter = Twitter(auth=OAuth(access_key, access_secret,
                             consumer_key, consumer_secret))

csvfile = open(outfile, "w")
csvwriter = csv.writer(csvfile)

row = [ "user", "text", "latitude", "longitude" ]
csvwriter.writerow(row)

result_count = 0
last_id = None
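while result_count < num_results:
    # The standard search API returns at most 100 tweets per request.
    params = {
        "q": "",
        "geocode": "%f,%f,%dkm" % (latitude, longitude, max_range),
        "count": 100,
    }
    if last_id is not None:
        params["max_id"] = last_id - 1  # skip tweets already seen
    query = twitter.search.tweets(**params)

    statuses = query["statuses"]
    if not statuses:
        break  # nothing more to fetch; avoid looping forever

    for result in statuses:
        if result["geo"]:
            user = result["user"]["screen_name"]
            text = result["text"]
            text = text.encode('ascii', 'replace').decode('ascii')
            # Separate names, so the search centre above is not overwritten
            tweet_latitude = result["geo"]["coordinates"][0]
            tweet_longitude = result["geo"]["coordinates"][1]

            row = [ user, text, tweet_latitude, tweet_longitude ]
            csvwriter.writerow(row)
            result_count += 1
        last_id = result["id"]

    print("got %d results" % result_count)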

csvfile.close()

print("written to %s" % outfile)

Any help will be highly appreciated.

If Twitter only allowed you to search within a circle of a specified radius, the only practical way to do this would be to compute the centre of your rectangle and a radius equal to half its diagonal, retrieve the tweets inside that circle, and then exclude the ones that lie outside your chosen rectangle (which you can do by checking each axis individually).
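
The circle-then-filter approach can be sketched roughly as follows. This is only an illustration: the helper names (`haversine_km`, `enclosing_circle`, `in_box`) are made up for this example, the distance is a spherical approximation, and the box is assumed not to cross the 180° meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def enclosing_circle(south, west, north, east):
    """Return (centre_lat, centre_lon, radius_km) of a circle covering the box."""
    centre_lat = (south + north) / 2
    centre_lon = (west + east) / 2
    # Take the farthest corner so the circle covers the whole rectangle.
    radius_km = max(
        haversine_km(centre_lat, centre_lon, lat, lon)
        for lat in (south, north)
        for lon in (west, east)
    )
    return centre_lat, centre_lon, radius_km


def in_box(lat, lon, south, west, north, east):
    """Check each axis independently."""
    return south <= lat <= north and west <= lon <= east
```

Since the `geocode` string in the question formats the radius with `%d`, the radius would need rounding up (for example with `math.ceil`) so the circle still covers the corners. Each returned tweet is then kept only if `in_box` is true for its coordinates.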

Fortunately, Twitter also implements a search by bounding box (see this documentation page).

Studying https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/geo-objects may help you in formulating your request.
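
As a rough illustration of what those geo objects look like in practice, the sketch below pulls a latitude/longitude pair out of a tweet that has already been parsed into a dict. It assumes the v1.1 payload layout as I understand it: `coordinates` is a GeoJSON point in longitude-first order, and `place.bounding_box` is a polygon with one ring of `[lon, lat]` corners. The function name `tweet_lat_lon` and the fallback to the centre of the place's bounding box are my own choices, not something the API prescribes.

```python
def tweet_lat_lon(tweet):
    """Return (lat, lon) for a parsed tweet dict, or None if it has no location."""
    point = tweet.get("coordinates")
    if point:
        lon, lat = point["coordinates"]  # GeoJSON order: longitude first
        return lat, lon

    place = tweet.get("place")
    if place and place.get("bounding_box"):
        # One ring of [lon, lat] corners; use its centre as an approximation.
        ring = place["bounding_box"]["coordinates"][0]
        lons = [corner[0] for corner in ring]
        lats = [corner[1] for corner in ring]
        return (min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2

    return None
```

Note that a place-based location is only as precise as the place itself (it can be a whole city), so it may be worth treating those tweets separately from ones with an exact point.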

# Written against tweepy 3.x; StreamListener was removed in tweepy 4.0
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from urllib3.exceptions import ProtocolError

access_token = 'xxxx'
access_token_secret = 'xxxx'
consumer_key = 'xxxx'
consumer_secret = 'xxxx'

class StdOutListener(StreamListener):

    def on_data(self, data):
        print (data)
        return True

    def on_error(self, status):
        print ('Encountered error with status code:',status)


if __name__ == '__main__':

    # This handles Twitter authentication and the connection to the Twitter Streaming API
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, l)
    while True:
        try:
            # Bounding box as [west_lon, south_lat, east_lon, north_lat]
            stream.filter(locations=[144.9385, -37.8246, 144.9761, -37.7955], stall_warnings=True)
        except (ProtocolError, AttributeError):
            continue
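
The `locations` list puts longitude before latitude, with the south-west corner first, which is easy to get backwards when starting from two latitude/longitude pairs as in the question. A small helper along these lines could build the list; `bounding_box_locations` is a hypothetical name, and the sketch assumes the box does not cross the 180° meridian.

```python
def bounding_box_locations(corner_a, corner_b):
    """Build a `locations` list from two (lat, lon) corners given in any order."""
    (lat_a, lon_a), (lat_b, lon_b) = corner_a, corner_b
    west, east = sorted((lon_a, lon_b))
    south, north = sorted((lat_a, lat_b))
    # Longitude first, south-west corner before north-east corner.
    return [west, south, east, north]
```

For example, `bounding_box_locations((-37.7955, 144.9761), (-37.8246, 144.9385))` gives the same list that is passed to `stream.filter` above.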


 