
Twitter Analysis in Java

I have a zip file containing the tweet IDs of a Twitter stream S. How can I load these IDs and then complete the dataset by downloading the original tweets using Java and Lucene? To reduce the size and complexity of the dataset, I need to download at least 5% of the tweets, sampled uniformly, check that each tweet is in English, and store the data in compressed form.

You can use the Twitter4j library to fetch tweets by ID. Stream the IDs from the provided file, then download the corresponding tweets through Twitter4j. Of course, if you only need 5%, you will fetch just a subset of the tweets.
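As a rough illustration of the non-network parts of that workflow, here is a minimal sketch using only the Java standard library. It assumes the zip holds plain-text entries with one numeric tweet ID per line (the question does not specify the layout), and the class and method names (TweetIdSampler, readIds, sampleUniform, batches, writeGzip) are hypothetical. Sampling is done by shuffling with a seeded Random and keeping the first ceil(fraction * n) IDs, which is one way to get a uniform sample without replacement.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipInputStream;

class TweetIdSampler {

    /** Reads tweet IDs from every entry of a zip archive (assumed: one ID per line). */
    static List<Long> readIds(InputStream zipped) throws IOException {
        List<Long> ids = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(zipped)) {
            while (zin.getNextEntry() != null) {
                // Do not close this reader: closing it would close the whole archive.
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(zin, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    line = line.trim();
                    if (!line.isEmpty()) {
                        ids.add(Long.parseLong(line));
                    }
                }
            }
        }
        return ids;
    }

    /** Returns a uniform random sample holding at least `fraction` of the IDs. */
    static List<Long> sampleUniform(List<Long> ids, double fraction, long seed) {
        List<Long> copy = new ArrayList<>(ids);
        Collections.shuffle(copy, new Random(seed)); // every ordering is equally likely
        int k = (int) Math.ceil(fraction * copy.size());
        return new ArrayList<>(copy.subList(0, k));
    }

    /** Splits IDs into arrays of at most `size` elements, e.g. 100 per lookup request. */
    static List<long[]> batches(List<Long> ids, int size) {
        List<long[]> result = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += size) {
            int end = Math.min(start + size, ids.size());
            long[] batch = new long[end - start];
            for (int i = start; i < end; i++) {
                batch[i - start] = ids.get(i);
            }
            result.add(batch);
        }
        return result;
    }

    /** Writes one record per line to a gzip-compressed stream. */
    static void writeGzip(List<String> lines, OutputStream out) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                new GZIPOutputStream(out), StandardCharsets.UTF_8)) {
            for (String line : lines) {
                writer.write(line);
                writer.write('\n');
            }
        }
    }
}
```

In use, each long[] from batches(sample, 100) could be passed to Twitter4j's lookup(long... ids), keeping only statuses whose getLang() returns "en" before handing the text to writeGzip. Deleted or protected tweets are typically missing from the response, so sampling somewhat more than 5% may be needed to end up with at least 5% of the tweets.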
