Can I request/filter the Twitter streaming API to return only tweets with geotags?

I'm using the twitter4j library to access the public Twitter stream. I'm working on a project involving geotagged tweets, and I need to collect a large number of them for testing.

Right now I am reading the unfiltered stream from Twitter and saving only the tweets that have geotags. This is slow, though, because the vast majority of tweets don't have geotags. I want the Twitter stream to send me only tweets with geotags.

I have tried the method mentioned in this question, where you filter with a bounding box of 360° by 180°, but that's not working for me. I'm not getting any errors when using that filter, but about 99% of the tweets I receive still have no geotags. Here is how I'm doing it:

ConfigurationBuilder cb = new ConfigurationBuilder();
    cb.setDebugEnabled(true)
    .setOAuthConsumerKey("censored")
    .setOAuthConsumerSecret("censored")
    .setOAuthAccessToken("censored")
    .setOAuthAccessTokenSecret("censored");

TwitterStream twitterStream = new TwitterStreamFactory(cb.build()).getInstance();
StatusListener listener = new MyStatusListener();
twitterStream.addListener(listener);

//add location filter for what I hope is the whole planet. Just trying to limit
//results to only things that are geotagged
FilterQuery locationFilter = new FilterQuery();
double[][] locations = {{-180.0d,-90.0d},{180.0d,90.0d}};

locationFilter.locations(locations);

twitterStream.filter(locationFilter);

twitterStream.sample();

Any suggestions about why I'm still getting tweets with no geotags?

Edit: I just reread the twitter4j javadoc on adding filters to a Twitter stream, and it says "The default access level allows up to 200 track keywords, 400 follow userids and 10 1-degree location boxes." So bounding boxes may only be 1 degree wide? That's different from the information I originally came across. Is that my problem? Is my filter request too big, so it's being ignored? I'm not getting any errors when trying to use it.

You are opening the filter stream and then overwriting it with the sample stream: the call to twitterStream.sample() after twitterStream.filter(locationFilter) replaces the filtered stream with the unfiltered sample.

Remove the last line: twitterStream.sample();
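
As a side note on the bounding box itself: the locations array in the question lists the southwest corner first and the northeast corner second, each with longitude before latitude. Below is a minimal sketch of a helper that builds such an array and rejects obviously malformed corners. The class name BoundingBoxSketch and the method box are hypothetical names chosen for illustration; they are not part of twitter4j.

```java
import java.util.Arrays;

public class BoundingBoxSketch {

    /**
     * Builds a {southwest, northeast} corner pair with longitude first,
     * the same layout as the locations array in the question.
     * Hypothetical helper, not a twitter4j API.
     */
    public static double[][] box(double west, double south, double east, double north) {
        if (west < -180.0 || east > 180.0 || south < -90.0 || north > 90.0) {
            throw new IllegalArgumentException("corner outside the valid longitude/latitude range");
        }
        if (west >= east || south >= north) {
            throw new IllegalArgumentException("southwest corner must be below and left of northeast corner");
        }
        return new double[][] {{west, south}, {east, north}};
    }

    public static void main(String[] args) {
        // Whole-planet box, equivalent to the array written by hand in the question.
        double[][] world = box(-180.0, -90.0, 180.0, 90.0);
        System.out.println(Arrays.deepToString(world));
    }
}
```

With twitter4j, the returned array would be passed to locationFilter.locations(...) and followed by a single twitterStream.filter(locationFilter) call, with no later call to twitterStream.sample().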
