
Python: Reading Tweets from file, no user info

I'm new to Python and playing with tweets.

This may be stupid, but I'm running into problems with the tweets' user entities. It seems like I am reading the .json file in OK, into a list of dicts. I am able to print out tweets at different positions in the list, such as: tweet = tweets[7], then print(tweet).

code 1

When I run print(tweet['entities']) I see hashtags and that kind of stuff. When I print the user keys, they show up.

code 2

But the problem is when I try to iterate through the tweets and print out the contents of the user fields, like screen name, location, etc.

code 3

statuses = []
for tweet in tweets:
    statuses.append(tweet['text'])

ids = []
for tweet in tweets:
    ids.append(tweet['id_str'])

times = []
for tweet in tweets:
    times.append(tweet['user']['created_at'])

screen_names = []
for tweet in tweets:
    screen_names.append(tweet['user']['screen_name'])

Is it a problem with reading in the JSON? Here is a link to what the tweets look like in the file: https://drive.google.com/file/d/1u2soBW4PnRTuCMLdKODf4oKl9Zz88D7l/view?usp=sharing Am I just dumb and doing something ridiculously wrong? Thanks so much for helping a newbie. I looked around and couldn't find anything about the user info not being there. Sorry if I didn't give all the info needed to help.

Code:

    import json

    # tweepy and twitter are imported in the original but not used below.
    import tweepy
    import twitter

    tweets = []
    with open('tweets5.json') as f:
        for line in f:
            try:
                tweets.append(json.loads(line))
            except ValueError:
                # Skip lines that are not valid JSON (e.g. blank lines).
                pass

    print(len(tweets))

    # Inspect a single tweet; all of these lookups work.
    tweet = tweets[2]
    print(tweet)
    print(tweet['entities'])
    print(tweet['user'].keys())
    print(tweet['text'])
    print(tweet['id_str'])
    print(tweet['user']['name'])
    print(tweet['user']['location'])
    print(tweet['place'])
    print(tweet['user']['screen_name'])
    print(tweet['geo'])
    print(tweet['user']['created_at'])
    print(tweet['favorite_count'])
    print(tweet['retweet_count'])
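
Silently skipping lines that fail to parse hides how many were dropped, which makes it hard to tell whether the file was read correctly. Below is a minimal sketch of a loader that records the skipped lines instead. `load_json_lines` and the `sample` data are hypothetical names made up for illustration, not part of the original code:

```python
import json


def load_json_lines(lines):
    """Parse an iterable of JSON lines; return (records, skipped)."""
    records = []
    skipped = []  # (line_number, reason) pairs for lines that failed to parse
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored rather than counted as errors
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            skipped.append((number, str(exc)))
    return records, skipped


# Hypothetical sample standing in for the contents of tweets5.json.
sample = [
    '{"id_str": "1", "text": "hello", "user": {"screen_name": "a"}}',
    '',
    '{"id_str": "2", "text": "truncated',
]
records, skipped = load_json_lines(sample)
print(len(records), "parsed;", len(skipped), "skipped")
```

With a real file, the open file object can be passed directly, since iterating over it yields one line at a time.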


Errors start here: I get a KeyError for id_str when I try to build the IDs list, so I tried moving things around to see if any of the loops would run successfully and move on, but I get a KeyError for whichever key comes first.

statuses = []
for tweet in tweets:
    statuses.append(tweet['text'])

ids = []
for tweet in tweets:
    ids.append(tweet['id_str'])

times = []
for tweet in tweets:
    times.append(tweet['user']['created_at'])

screen_names = []
for tweet in tweets:
    screen_names.append(tweet['user']['screen_name'])
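
One possible cause of a KeyError on whichever key comes first, assuming the file mixes tweets with other kinds of messages, is that some parsed records simply have no `text`, `id_str` or `user` key. Here is a sketch that filters on those keys before building the lists. `is_tweet` and the sample records are hypothetical, and the check is a heuristic rather than an official definition of a tweet:

```python
def is_tweet(record):
    """Heuristic (an assumption): a tweet is a dict with text, id_str and user."""
    return isinstance(record, dict) and all(
        key in record for key in ("text", "id_str", "user")
    )


# Hypothetical records: one tweet-like dict and one non-tweet message.
tweets = [
    {"id_str": "1", "text": "hello",
     "user": {"screen_name": "a", "created_at": "x"}},
    {"limit": {"track": 5}},
]

valid = [t for t in tweets if is_tweet(t)]
statuses = [t["text"] for t in valid]
ids = [t["id_str"] for t in valid]
# .get() returns None instead of raising if a user field is missing.
times = [t["user"].get("created_at") for t in valid]
screen_names = [t["user"].get("screen_name") for t in valid]

# Print the keys of rejected records to see what they actually are.
for record in tweets:
    if not is_tweet(record):
        print("not a tweet, keys:", sorted(record))
```

Printing the keys of the rejected records is usually the quickest way to confirm or rule out this explanation.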

This is what the tweet looks like:

{"created_at":"Mon Feb 19 04:08:20 +0000 2018","id":965437872395472896,"id_str":"965437872395472896","text":"RT @IronStache: .@SpeakerRyan spent all weekend in Florida, but he wouldn\u2019t even visit Parkland to offer comfort and support. \n\nPaul Ryan w\u2026","source":"\u003ca href=\"https:\/\/mobile.twitter.com\" rel=\"nofollow\"\u003eTwitter Lite\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":2510706061,"id_str":"2510706061","name":"Bill McClure","screen_name":"RealBillMcClure","location":"Boca Raton, FL","url":null,"description":"Buck & El's kid, husband, dad, friend of the show. Everything I ever needed to know was learned on the streets of Youngstown, OHIO.","translator_type":"none","protected":false,"verified":false,"followers_count":576,"friends_count":1200,"listed_count":5,"favourites_count":2324,"statuses_count":5541,"created_at":"Tue May 20 15:28:42 +0000 
2014","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"3B94D9","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/864933232118312960\/kqhxw1LB_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/864933232118312960\/kqhxw1LB_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/2510706061\/1504194378","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweeted_status":{"created_at":"Mon Feb 19 00:45:19 +0000 2018","id":965386782471872512,"id_str":"965386782471872512","text":".@SpeakerRyan spent all weekend in Florida, but he wouldn\u2019t even visit Parkland to offer comfort and support. \n\nPau\u2026 https:\/\/t.co\/t2H4QGJ0s7","source":"\u003ca href=\"http:\/\/twitter.com\/download\/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c\/a\u003e","truncated":true,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":292083207,"id_str":"292083207","name":"Randy Bryce","screen_name":"IronStache","location":"Caledonia, WI","url":"http:\/\/www.randybryceforcongress.com","description":"Democrat for WI-01. Father, Army veteran, Ironworker, Cancer survivor, Wisconsin4ever. 
Most tweets mine.","translator_type":"none","protected":false,"verified":true,"followers_count":234147,"friends_count":4069,"listed_count":1130,"favourites_count":21552,"statuses_count":26855,"created_at":"Tue May 03 02:27:46 +0000 2011","utc_offset":-21600,"time_zone":"Central Time (US & Canada)","geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/pbs.twimg.com\/profile_background_images\/667813427\/181a97f32cdd68a3fb6a8e744a68a1b7.png","profile_background_image_url_https":"https:\/\/pbs.twimg.com\/profile_background_images\/667813427\/181a97f32cdd68a3fb6a8e744a68a1b7.png","profile_background_tile":true,"profile_link_color":"1B95E0","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/878715837259153410\/LQ0QNz2y_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/878715837259153410\/LQ0QNz2y_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/292083207\/1497557512","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"extended_tweet":{"full_text":".@SpeakerRyan spent all weekend in Florida, but he wouldn\u2019t even visit Parkland to offer comfort and support. \n\nPaul Ryan won\u2019t do anything to prevent future attacks. We have to replace him with a Congressperson who will. 
https:\/\/t.co\/PBFoQ6IHfv","display_text_range":[0,245],"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/PBFoQ6IHfv","expanded_url":"http:\/\/www.palmbeachpost.com\/news\/trump-palm-beach-mar-lago-meeting-with-house-speaker-paul-ryan\/QisKep1LXdV5zQRZWQ5OLO\/","display_url":"palmbeachpost.com\/news\/trump-pal\u2026","indices":[222,245]}],"user_mentions":[{"screen_name":"SpeakerRyan","name":"Paul Ryan","id":18916432,"id_str":"18916432","indices":[1,13]}],"symbols":[]}},"quote_count":191,"reply_count":379,"retweet_count":4375,"favorite_count":11011,"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/t2H4QGJ0s7","expanded_url":"https:\/\/twitter.com\/i\/web\/status\/965386782471872512","display_url":"twitter.com\/i\/web\/status\/9\u2026","indices":[117,140]}],"user_mentions":[{"screen_name":"SpeakerRyan","name":"Paul Ryan","id":18916432,"id_str":"18916432","indices":[1,13]}],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en"},"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[],"user_mentions":[{"screen_name":"IronStache","name":"Randy Bryce","id":292083207,"id_str":"292083207","indices":[3,14]},{"screen_name":"SpeakerRyan","name":"Paul Ryan","id":18916432,"id_str":"18916432","indices":[17,29]}],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1519013300404"}
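
For reference, the nested fields in a record like the one above can be reached as follows. This sketch uses a heavily trimmed copy of the sample (values copied from it, most keys dropped) and assumes `retweeted_status` may be absent on tweets that are not retweets:

```python
import json

# Trimmed copy of the sample tweet; most fields are omitted for brevity.
raw = (
    '{"id_str": "965437872395472896", '
    '"user": {"screen_name": "RealBillMcClure", "location": "Boca Raton, FL"}, '
    '"retweeted_status": {"user": {"screen_name": "IronStache"}}}'
)
tweet = json.loads(raw)

print(tweet["id_str"])
print(tweet["user"]["screen_name"])
print(tweet["user"]["location"])

# Look up retweeted_status with .get() so a missing key gives None, not KeyError.
original = tweet.get("retweeted_status")
if original is not None:
    print("retweet of", original["user"]["screen_name"])
```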

I think that this line

print (['user']['screen_name'])

should be:

print(tweet['user']['screen_name'])
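
To see the difference between the two lines, here is a small sketch with a hypothetical one-key tweet dict. Without `tweet` in front, `['user']` is a list literal, and indexing a list with a string raises a TypeError:

```python
# Hypothetical minimal tweet, just enough to show the lookup.
tweet = {"user": {"screen_name": "RealBillMcClure"}}

key_list = ['user']  # what the bare expression ['user'] evaluates to
error = None
try:
    # Same operation as ['user']['screen_name']: a list indexed by a string.
    key_list['screen_name']
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)

# Indexing the tweet dict itself works as intended.
screen_name = tweet['user']['screen_name']
print(screen_name)
```

Note that this mistake produces a TypeError rather than a KeyError, so it is worth checking which exception the traceback actually reports.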
