
Python Tweepy get all tweets based on Geocode

I am trying to get all tweets within a certain radius around given coordinates. The script runs without errors, but zero entries are returned. The strange thing is that exactly the same code worked for me a few days ago, and now it does not, and I am stuck :(

import tweepy
import pandas as pd

#Twitter credentials for the app
consumer_key = 'xxx'
consumer_secret = 'xxx'
access_key= 'xxx'
access_secret = 'xxx'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

#Create list for column names
COLS = ['id','created_at','lang','original text','user_name', 'place', 'place type', 'bbx', 'coordinates']

geo='48.136353, 11.575004, 25km'

def write_tweets(keyword):

    #create dataframe from defined column list
    df = pd.DataFrame(columns=COLS)

    #iterate through pages with given condition
    #using tweepy.Cursor object with items() method
    for page in tweepy.Cursor(api.search, q=keyword,
                                  include_rts=False,
                                  geocode=geo).pages():

                for tweet in page:
                    #creating string array
                    new_entry = []

                    #storing all JSON data from twitter API
                    tweet = tweet._json    

                    #Append the JSON parsed data to the string list:

                    new_entry += [tweet['id'], tweet['created_at'], tweet['lang'], tweet['text'], 
                                  tweet['user']['name']]

                    #check if place name is available, in case not the entry is named 'no place'
                    try:
                        place = tweet['place']['name']
                    except TypeError:
                        place = 'no place'
                    new_entry.append(place)

                    try:
                        place_type = tweet['place']['place_type']
                    except TypeError:
                        place_type = 'na'
                    new_entry.append(place_type)

                    try:
                        bbx = tweet['place']['bounding_box']['coordinates']
                    except TypeError:
                        bbx = 'na'
                    new_entry.append(bbx)

                    #check if coordinates is available, in case not the entry is named 'no coordinates'
                    try:
                        coord = tweet['coordinates']['coordinates']
                    except TypeError:
                        coord = 'no coordinates'
                    new_entry.append(coord)

                    # wrap up all the data into a data frame
                    single_tweet_df = pd.DataFrame([new_entry], columns=COLS)
                    # DataFrame.append was removed in pandas 2.0; pd.concat works on old and new versions
                    df = pd.concat([df, single_tweet_df], ignore_index=True)

    #get rid of tweets without a place (done after the loop so df_cleaned exists even when no tweets are returned)
    df_cleaned = df[df.place != 'no place']

    print("tweets with place:")
    print(len(df[df.place != 'no place']))

    print("tweets with coordinates:")
    print(len(df[df.coordinates != 'no coordinates']))

    df_cleaned.to_csv('tweets_'+geo+'.csv', columns=COLS,index=False)

#declare keywords as a query
keyword='*'

#call main method passing keywords and file path
write_tweets(keyword)

As far as I can tell, the geocode should work in this format.

Does anyone have an idea?

When you declare the variable geo, don't leave any spaces between the commas and the numbers.

It should look like this:

geo='48.136353,11.575004,25km'
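
To avoid reintroducing the spaces by hand, the string can be assembled from its parts. The following is only a minimal sketch: build_geocode and normalize_geocode are hypothetical helper names, not part of Tweepy, and the unit check assumes the search endpoint accepts only "km" and "mi" as radius units.

```python
def build_geocode(lat, lon, radius, unit="km"):
    """Return a 'latitude,longitude,radius' string with no spaces.

    Hypothetical helper (not a Tweepy function). Assumes the radius
    unit must be either 'km' or 'mi'.
    """
    if unit not in ("km", "mi"):
        raise ValueError("unit must be 'km' or 'mi'")
    return f"{lat},{lon},{radius}{unit}"


def normalize_geocode(geo):
    """Strip all whitespace from an existing geocode string."""
    return "".join(geo.split())


# Build the string from numbers instead of typing it by hand
geo = build_geocode(48.136353, 11.575004, 25)

# Or clean up a string that was written with spaces after the commas
geo_fixed = normalize_geocode('48.136353, 11.575004, 25km')
```

Either result can then be passed as geocode=geo in the tweepy.Cursor call from the question.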
