
Tweepy Location Filter Does Not Work

Problem solved; see the solution in the accepted answer.

I am trying to collect 50 tweets that originate from a specified geographic region. My code below prints 50 tweets, but many of them have None for coordinates. Does this mean that the tweets with None were not sent from the specified area? Can you explain what is happening here, and how I can collect 50 tweets from this area? Thanks in advance.

# Import Tweepy, sys, sleep, credentials.py
try:
    import json
except ImportError:
    import simplejson as json
import tweepy, sys
from time import sleep
from credentials import *

# Access and authorize our Twitter credentials from credentials.py
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# Assign coordinates to the variable
box = [-74.0,40.73,-73.0,41.73]

#override tweepy.StreamListener to add logic to on_status
class MyStreamListener(tweepy.StreamListener):
    def __init__(self, api=None):
        # Forward the optional api argument instead of silently dropping it
        super(MyStreamListener, self).__init__(api)
        self.counter = 0

    def on_status(self, status):
        record = {'Text': status.text, 'Coordinates': status.coordinates, 'Created At': status.created_at}
        self.counter += 1
        if self.counter <= 50:
            print record
            return True
        else:
            return False

    def on_error(self, status_code):
        if status_code == 420:
            # returning False in on_error disconnects the stream
            return False

myStreamListener = MyStreamListener()
myStream = tweepy.Stream(api.auth, listener=myStreamListener)
myStream.filter(locations=box, async=True)
print myStream

Here is the result:

{'Text': u"What?...", 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 6), 'Coordinates': {u'type': u'Point', u'coordinates': [-74.1234567, 40.1234567]}}
{'Text': u'WHEN?...', 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 8), 'Coordinates': None}
{'Text': u'Wooo...', 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 9), 'Coordinates': None}
{'Text': u'Man...', 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 9), 'Coordinates': None}
{'Text': u'The...', 'Created At': datetime.datetime(2017, 3, 12, 2, 55, 10), 'Coordinates': None}

From the docs:

Only geolocated Tweets falling within the requested bounding boxes will be included; unlike the Search API, the user's location field is not used to filter Tweets.

That guarantees that the tweets in the response are from the bounding box provided.

How does the bounding box filter work?

The streaming API uses the following heuristic to determine whether a given Tweet falls within a bounding box:

  • If the coordinates field is populated, the values there will be tested against the bounding box. Note that this field uses geoJSON order (longitude, latitude).

  • If coordinates is empty but place is populated, the region defined in place is checked for intersection against the locations bounding box. Any overlap will match.

If none of the rules listed above match, the Tweet does not match the location query.
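
The heuristic quoted above can be sketched in a few lines of Python. This is only an illustration of the documented rules, not Twitter's actual implementation. It assumes the tweet is a plain dict in the v1.1 JSON layout, where coordinates is a GeoJSON Point and place carries a bounding_box polygon; the function names are made up for this example.

```python
def point_in_box(lon, lat, box):
    """box is [west_lon, south_lat, east_lon, north_lat], as passed to locations=."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


def place_overlaps_box(corners, box):
    """corners is a list of [lon, lat] pairs outlining the place's bounding box."""
    west, south, east, north = box
    lons = [c[0] for c in corners]
    lats = [c[1] for c in corners]
    # Two axis-aligned rectangles overlap unless one lies entirely
    # to one side of the other.
    return not (max(lons) < west or min(lons) > east or
                max(lats) < south or min(lats) > north)


def matches_locations(tweet, box):
    """Apply the two documented rules in order (hypothetical helper)."""
    coords = tweet.get("coordinates")
    if coords:
        # Rule 1: exact point, stored in GeoJSON order (longitude, latitude).
        lon, lat = coords["coordinates"]
        return point_in_box(lon, lat, box)
    place = tweet.get("place")
    if place and place.get("bounding_box"):
        # Rule 2: any overlap between the place and the requested box matches.
        return place_overlaps_box(place["bounding_box"]["coordinates"][0], box)
    # Neither field is populated: the tweet does not match.
    return False


box = [-74.0, 40.73, -73.0, 41.73]
place_only = {
    "coordinates": None,
    "place": {"bounding_box": {"type": "Polygon", "coordinates": [[
        [-74.3, 40.5], [-73.7, 40.5], [-73.7, 40.9], [-74.3, 40.9]]]}},
}
print(matches_locations(place_only, box))
```

The place_only example has no coordinates at all, yet it passes the filter because its place overlaps the requested box. That is the situation behind the None values in the question's output.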

Again, this implies that the coordinates field can be None, because such tweets matched through their place field, but the bounding box filter is still guaranteed to return tweets from the bounding box region.

Source: https://dev.twitter.com/streaming/overview/request-parameters#locations

Edit: place is a field in the response, similar to coordinates.
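
If an approximate position is needed for every tweet, one option is to fall back to the place field when coordinates is None. The sketch below is an illustration under assumptions: it works on the raw tweet dict in the v1.1 JSON layout, tweet_location is a hypothetical helper name, and the centre of the place bounding box is only a rough stand-in for the sender's real position.

```python
def tweet_location(tweet):
    """Return (lon, lat, source) for a tweet dict, or None if it has no location.

    source is "coordinates" for an exact point and "place" for the
    centre of the place's bounding box (an approximation).
    """
    coords = tweet.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]  # GeoJSON order: longitude, latitude
        return (lon, lat, "coordinates")
    place = tweet.get("place")
    if place and place.get("bounding_box"):
        corners = place["bounding_box"]["coordinates"][0]
        # Average the corner points to get the centre of the box.
        lon = sum(c[0] for c in corners) / float(len(corners))
        lat = sum(c[1] for c in corners) / float(len(corners))
        return (lon, lat, "place")
    return None


example = {
    "coordinates": None,
    "place": {"bounding_box": {"type": "Polygon", "coordinates": [[
        [-74.0, 40.0], [-73.0, 40.0], [-73.0, 41.0], [-74.0, 41.0]]]}},
}
print(tweet_location(example))
```

Inside on_status, the raw dict should be available as status._json in Tweepy 3.x, so the record could store tweet_location(status._json) instead of status.coordinates. Treat that attribute as an assumption to verify against the installed Tweepy version.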
