
Python: Reading Tweets from file, no user info

I'm new to Python and experimenting with tweets.

I'm running into problems with the user fields of the tweets. As far as I can tell, the .json file is being read correctly into a list of dicts. I can print tweets at different positions in that list, for example tweet = tweets[7] followed by print(tweet).


When I run print(tweet['entities']) I see hashtags and similar fields. When I print the user keys, they show up as well.


The problem appears when I try to iterate over all the tweets and collect user fields such as screen name, location, and so on:


    statuses = []
    for tweet in tweets:
        statuses.append(tweet['text'])

    ids = []
    for tweet in tweets:
        ids.append(tweet['id_str'])

    times = []
    for tweet in tweets:
        times.append(tweet['user']['created_at'])

    screen_names = []
    for tweet in tweets:
        screen_names.append(tweet['user']['screen_name'])

Is it a problem with how I read in the JSON? Here is a link to what the tweets look like in the file: https://drive.google.com/file/d/1u2soBW4PnRTuCMLdKODf4oKl9Zz88D7l/view?usp=sharing Am I doing something obviously wrong? I looked around and couldn't find anything about the user fields not being there. Thanks for helping a newbie, and sorry if I didn't give all the information needed.

Code:

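    import json

    # tweepy and twitter were imported in the original script, but neither
    # is needed just to read the saved file.

    tweets = []

    # The file holds one JSON object per line; lines that fail to parse are skipped.
    with open('tweets5.json') as f:
        for line in f:
            try:
                tweets.append(json.loads(line))
            except ValueError:
                # json.JSONDecodeError is a subclass of ValueError
                pass

    print(len(tweets))

    # Inspect a single tweet; all of these print without error.
    tweet = tweets[2]
    print(tweet)
    print(tweet['entities'])
    print(tweet['user'].keys())
    print(tweet['text'])
    print(tweet['id_str'])
    print(tweet['user']['name'])
    print(tweet['user']['location'])
    print(tweet['place'])
    print(tweet['user']['screen_name'])
    print(tweet['geo'])
    print(tweet['user']['created_at'])
    print(tweet['favorite_count'])
    print(tweet['retweet_count'])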
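Because the loop above silently discards any line that fails to parse, it can be useful to know how many lines were dropped and where. The following is only a sketch of one way to do that; the helper name load_json_lines is made up for this example, and the sample lines are invented rather than taken from tweets5.json.

```python
import json


def load_json_lines(lines):
    """Parse an iterable of JSON-lines text.

    Returns (records, skipped), where skipped holds the 1-based numbers
    of the lines that could not be parsed.
    """
    records = []
    skipped = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            # A blank line carries no record, so ignore it.
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            # Remember the line number instead of silently passing.
            skipped.append(number)
    return records, skipped


# Invented sample input: one good line, one blank line, one truncated line.
sample = [
    '{"id_str": "1", "text": "hello"}\n',
    '\n',
    '{"id_str": "2", "text": "cut off',
]
records, skipped = load_json_lines(sample)
print(len(records), skipped)
```

An open file object can be passed in place of the sample list, since iterating over a file yields its lines.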


The errors start here. I get a KeyError for id_str when I try to build the list of IDs, so I tried reordering the loops to see whether any of them would run successfully and move on, but I get a KeyError for whichever key is accessed first.

    statuses = []
    for tweet in tweets:
        statuses.append(tweet['text'])

    ids = []
    for tweet in tweets:
        ids.append(tweet['id_str'])

    times = []
    for tweet in tweets:
        times.append(tweet['user']['created_at'])

    screen_names = []
    for tweet in tweets:
        screen_names.append(tweet['user']['screen_name'])
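A KeyError on whichever key is read first suggests that at least one entry in tweets does not have the same shape as the tweet printed earlier. That is a guess, not something confirmed from the file. One way to check is to separate the entries that have the expected keys from those that do not and then look at the leftovers. The sketch below uses invented records, and the helper name split_by_keys and the "limit" record are hypothetical stand-ins for whatever non-tweet entries the file might contain.

```python
def split_by_keys(records, required=("text", "id_str", "user")):
    """Split records into those that have every required key and the rest."""
    complete = []
    incomplete = []
    for record in records:
        if isinstance(record, dict) and all(key in record for key in required):
            complete.append(record)
        else:
            incomplete.append(record)
    return complete, incomplete


# Invented records for illustration only.
records = [
    {"id_str": "1", "text": "first", "user": {"screen_name": "alice"}},
    {"limit": {"track": 5}},  # hypothetical entry with no 'text' or 'user'
    {"id_str": "2", "text": "second", "user": {"screen_name": "bob"}},
]

complete, incomplete = split_by_keys(records)

# The loops from the question now only see entries that have the keys.
screen_names = [tweet["user"]["screen_name"] for tweet in complete]
print(screen_names)

# Printing the leftovers shows what the unexpected entries look like.
print(incomplete)
```

If incomplete turns out to be empty for the real file, the cause lies elsewhere and this check at least rules one possibility out.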

This is what the tweet looks like:

{"created_at":"Mon Feb 19 04:08:20 +0000 2018","id":965437872395472896,"id_str":"965437872395472896","text":"RT @IronStache: .@SpeakerRyan spent all weekend in Florida, but he wouldn\u2019t even visit Parkland to offer comfort and support. \n\nPaul Ryan w\u2026","source":"\u003ca href=\"https:\/\/mobile.twitter.com\" rel=\"nofollow\"\u003eTwitter Lite\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":2510706061,"id_str":"2510706061","name":"Bill McClure","screen_name":"RealBillMcClure","location":"Boca Raton, FL","url":null,"description":"Buck & El's kid, husband, dad, friend of the show. Everything I ever needed to know was learned on the streets of Youngstown, OHIO.","translator_type":"none","protected":false,"verified":false,"followers_count":576,"friends_count":1200,"listed_count":5,"favourites_count":2324,"statuses_count":5541,"created_at":"Tue May 20 15:28:42 +0000 
2014","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"3B94D9","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/864933232118312960\/kqhxw1LB_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/864933232118312960\/kqhxw1LB_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/2510706061\/1504194378","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweeted_status":{"created_at":"Mon Feb 19 00:45:19 +0000 2018","id":965386782471872512,"id_str":"965386782471872512","text":".@SpeakerRyan spent all weekend in Florida, but he wouldn\u2019t even visit Parkland to offer comfort and support. \n\nPau\u2026 https:\/\/t.co\/t2H4QGJ0s7","source":"\u003ca href=\"http:\/\/twitter.com\/download\/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c\/a\u003e","truncated":true,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":292083207,"id_str":"292083207","name":"Randy Bryce","screen_name":"IronStache","location":"Caledonia, WI","url":"http:\/\/www.randybryceforcongress.com","description":"Democrat for WI-01. Father, Army veteran, Ironworker, Cancer survivor, Wisconsin4ever. 
Most tweets mine.","translator_type":"none","protected":false,"verified":true,"followers_count":234147,"friends_count":4069,"listed_count":1130,"favourites_count":21552,"statuses_count":26855,"created_at":"Tue May 03 02:27:46 +0000 2011","utc_offset":-21600,"time_zone":"Central Time (US & Canada)","geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/pbs.twimg.com\/profile_background_images\/667813427\/181a97f32cdd68a3fb6a8e744a68a1b7.png","profile_background_image_url_https":"https:\/\/pbs.twimg.com\/profile_background_images\/667813427\/181a97f32cdd68a3fb6a8e744a68a1b7.png","profile_background_tile":true,"profile_link_color":"1B95E0","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/878715837259153410\/LQ0QNz2y_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/878715837259153410\/LQ0QNz2y_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/292083207\/1497557512","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"extended_tweet":{"full_text":".@SpeakerRyan spent all weekend in Florida, but he wouldn\u2019t even visit Parkland to offer comfort and support. \n\nPaul Ryan won\u2019t do anything to prevent future attacks. We have to replace him with a Congressperson who will. 
https:\/\/t.co\/PBFoQ6IHfv","display_text_range":[0,245],"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/PBFoQ6IHfv","expanded_url":"http:\/\/www.palmbeachpost.com\/news\/trump-palm-beach-mar-lago-meeting-with-house-speaker-paul-ryan\/QisKep1LXdV5zQRZWQ5OLO\/","display_url":"palmbeachpost.com\/news\/trump-pal\u2026","indices":[222,245]}],"user_mentions":[{"screen_name":"SpeakerRyan","name":"Paul Ryan","id":18916432,"id_str":"18916432","indices":[1,13]}],"symbols":[]}},"quote_count":191,"reply_count":379,"retweet_count":4375,"favorite_count":11011,"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/t2H4QGJ0s7","expanded_url":"https:\/\/twitter.com\/i\/web\/status\/965386782471872512","display_url":"twitter.com\/i\/web\/status\/9\u2026","indices":[117,140]}],"user_mentions":[{"screen_name":"SpeakerRyan","name":"Paul Ryan","id":18916432,"id_str":"18916432","indices":[1,13]}],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en"},"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[],"user_mentions":[{"screen_name":"IronStache","name":"Randy Bryce","id":292083207,"id_str":"292083207","indices":[3,14]},{"screen_name":"SpeakerRyan","name":"Paul Ryan","id":18916432,"id_str":"18916432","indices":[17,29]}],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1519013300404"}
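The sample above is a retweet: the top-level text is cut off, while the original tweet sits under retweeted_status and carries its complete text in extended_tweet. A minimal sketch of reading those nested fields follows. The dictionary keeps the key names from the sample but shortens the values, and the helper name full_text is made up for this example.

```python
def full_text(tweet):
    """Return the longest available text for a tweet shaped like the sample."""
    # For a retweet, look at the original tweet; otherwise use the tweet itself.
    original = tweet.get("retweeted_status", tweet)
    extended = original.get("extended_tweet")
    if extended and "full_text" in extended:
        return extended["full_text"]
    return original.get("text", "")


# Same key layout as the sample tweet, with values shortened for illustration.
sample = {
    "id_str": "965437872395472896",
    "text": "RT @IronStache: (shortened)",
    "user": {"screen_name": "RealBillMcClure", "location": "Boca Raton, FL"},
    "retweeted_status": {
        "id_str": "965386782471872512",
        "text": "(shortened and truncated)",
        "user": {"screen_name": "IronStache", "location": "Caledonia, WI"},
        "extended_tweet": {"full_text": "(shortened full text)"},
    },
}

print(sample["user"]["screen_name"])                      # account that retweeted
print(sample["retweeted_status"]["user"]["screen_name"])  # original author
print(full_text(sample))
```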

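I think that this line

    print (['user']['screen_name'])

should be:

    print(tweet['user']['screen_name'])

In the first form, ['user'] is a list literal rather than a lookup on the tweet, so indexing it with a string raises a TypeError.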
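The difference between the two forms can be reproduced with a tiny stand-in for a tweet. This is only an illustration: the dictionary below is a cut-down version of the sample, and the list is assigned to a variable first so that the failing lookup is easy to see.

```python
# Cut-down stand-in for a tweet.
tweet = {"user": {"screen_name": "RealBillMcClure"}}

# ['user'] on its own is a one-element list, so indexing it with a string
# fails. This is equivalent to writing ['user']['screen_name'] inline.
keys = ["user"]
try:
    print(keys["screen_name"])
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# Indexing the tweet itself walks into the nested user dictionary.
name = tweet["user"]["screen_name"]
print(name)
```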
