
JSONDecodeError while streaming live tweets

I am using Twython to stream live tweets and append them to a JSON file. Since the tweets are dumped into the file sequentially, I am trying to separate the objects and reformat the file to avoid a "Multiple JSON root elements" error. To do so I am using the following code:

class MyStreamer(TwythonStreamer):
    def on_success(self, data):
        with open('fetched_tweets.json', 'a') as tf:
            json.dump(data, tf)
            tf.write("\n")
            time.sleep(10)

            content = open('fetched_tweets.json', "r").read()
            n = [json.loads(str(item)) for item in content.strip().split('\n')]
            with open('test.json', 'w') as m:
                json.dump(n, m)
            return True

    def on_error(self, status):
        print(status)

My problem: The code gives me the following error: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

What I don't understand: If I run the following code separately, it works fine (it reformats the JSON data dumped into 'fetched_tweets.json' into a valid JSON file, 'test.json'), but it raises the error when I add it to the main script:

content = open('fetched_tweets.json', "r").read()
n = [json.loads(str(item)) for item in content.strip().split('\n')]
with open('test.json', 'w') as m:
    json.dump(n, m)
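A likely explanation for the difference, sketched below under the assumption of CPython's default file buffering: in the main script the file is re-read while the append handle `tf` is still open, so the freshly written tweet may still sit in the write buffer and the second handle sees an empty file. `json.loads('')` then fails with exactly this message. The helper name `append_and_peek` and the tiny record are made up for illustration.

```python
import json
import os
import tempfile


def append_and_peek(path, record):
    """Append one record, reading the file back both before and after
    the append handle is closed."""
    with open(path, "a") as tf:
        json.dump(record, tf)
        tf.write("\n")
        # tf is still open: a small write is normally still buffered,
        # so a second handle sees only what was on disk before.
        with open(path) as rf:
            seen_while_open = rf.read()
    # Leaving the with block closed tf and flushed the buffer.
    with open(path) as rf:
        seen_after_close = rf.read()
    return seen_while_open, seen_after_close


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "fetched_tweets.json")
    early, late = append_and_peek(path, {"id": 1})
    print(repr(early), repr(late))
    try:
        json.loads(early)
    except json.JSONDecodeError as exc:
        print("JSONDecodeError:", exc)
```

Run standalone, the reformatting snippet never hits this case because no append handle is open at that point.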

What I need: I need to run everything in the same script without any errors.

NB: I am using Jupyter Notebook.

EDIT: The data in the file 'fetched_tweets.json' looks like the API JSON response in this link: https://gist.github.com/hrp/900964 . I am using the code below to write every tweet on a single line:

tf.write("\n")

Then using:

content = open('fetched_tweets.json', "r").read()
n = [json.loads(str(item)) for item in content.strip().split('\n')]
with open('test.json', 'w') as m:
    json.dump(n, m)

to reformat the file into a valid JSON file.
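As a side note on the one-tweet-per-line layout, here is a minimal sketch of why splitting on '\n' is safe and how blank lines could be tolerated. `json.dumps` escapes newlines inside strings, so a dumped tweet never spans more than one line. The function name `parse_json_lines` is hypothetical, not part of the script above.

```python
import json


def parse_json_lines(text):
    """Parse newline-delimited JSON, skipping blank lines."""
    records = []
    for line in text.split("\n"):
        line = line.strip()
        if line:  # json.loads("") would raise JSONDecodeError
            records.append(json.loads(line))
    return records


if __name__ == "__main__":
    # A tweet whose text contains real newlines still dumps to one line.
    line = json.dumps({"text": "first\n\nsecond"})
    print("\n" in line)
    print(parse_json_lines(line + "\n\n" + line + "\n"))
```

Skipping empty lines also means an empty or not-yet-written file yields an empty list instead of an exception.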

Data sample from the 'fetched_tweets.json' file before applying content.strip():

{"created_at": "Mon Dec 17 22:38:45 +0000 2018", "id": 1074796067898748929, "id_str": "1074796067898748929", "text": "RT @robreiner: It\u2019s clear. The President is a criminal. He has committed felonies. In the United States of America no one is above the law.\u2026", "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>", "truncated": false, "in_reply_to_status_id": null, "in_reply_to_status_id_str": null, "in_reply_to_user_id": null, "in_reply_to_user_id_str": null, "in_reply_to_screen_name": null, "user": {"id": 817409784777506816, "id_str": "817409784777506816", "name": "UNrealDonaldTrump", "screen_name": "UNreal_Donald_T", "location": "Europe", "url": null, "description": "\ud83c\uddec\ud83c\udde7Old Fernebergian. Follows NASCAR & NFL. Don't make me laugh; it hurts my back\ud83d\ude00. Ex-adman MIPA(retd). Thank gawd for KODI & a VPN.", "translator_type": "none", "protected": false, "verified": false, "followers_count": 367, "friends_count": 103, "listed_count": 12, "favourites_count": 65139, "statuses_count": 51622, "created_at": "Fri Jan 06 16:37:34 +0000 2017", "utc_offset": null, "time_zone": null, "geo_enabled": false, "lang": "en", "contributors_enabled": false, "is_translator": false, "profile_background_color": "000000", "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png", "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png", "profile_background_tile": false, "profile_link_color": "000000", "profile_sidebar_border_color": "000000", "profile_sidebar_fill_color": "000000", "profile_text_color": "000000", "profile_use_background_image": false, "profile_image_url": "http://pbs.twimg.com/profile_images/1011365968805875712/Sdu90pe9_normal.jpg", "profile_image_url_https": "https://pbs.twimg.com/profile_images/1011365968805875712/Sdu90pe9_normal.jpg", "profile_banner_url": "https://pbs.twimg.com/profile_banners/817409784777506816/1494141668", "default_profile": 
false, "default_profile_image": false, "following": null, "follow_request_sent": null, "notifications": null}, "geo": null, "coordinates": null, "place": null, "contributors": null, "retweeted_status": {"created_at": "Mon Dec 17 15:21:01 +0000 2018", "id": 1074685908496961537, "id_str": "1074685908496961537", "text": "It\u2019s clear. The President is a criminal. He has committed felonies. In the United States of America no one is above\u2026 https://t.co/", "source": "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>", "truncated": true, "in_reply_to_status_id": null, "in_reply_to_status_id_str": null, "in_reply_to_user_id": null, "in_reply_to_user_id_str": null, "in_reply_to_screen_name": null, "user": {"id": 738080573365702657, "id_str": "738080573365702657", "name": "Rob Reiner", "screen_name": "robreiner", "location": "California, USA", "url": null, "description": "Filmmaker, actor, producer, husband, and father.", "translator_type": "none", "protected": false, "verified": true, "followers_count": 539787, "friends_count": 277, "listed_count": 2709, "favourites_count": 67957, "statuses_count": 2313, "created_at": "Wed Jun 01 18:51:36 +0000 2016", "utc_offset": null, "time_zone": null, "geo_enabled": false, "lang": "en", "contributors_enabled": false, "is_translator": false, "profile_background_color": "F5F8FA", "profile_background_image_url": "", "profile_background_image_url_https": "", "profile_background_tile": false, "profile_link_color": "1DA1F2", "profile_sidebar_border_color": "C0DEED", "profile_sidebar_fill_color": "DDEEF6", "profile_text_color": "333333", "profile_use_background_image": true, "profile_image_url": "http://pbs.twimg.com/profile_images/740361916883730432/B44FKZvz_normal.jpg", "profile_image_url_https": "https://pbs.twimg.com/profile_images/740361916883730432/B44FKZvz_normal.jpg", "profile_banner_url": "https://pbs.twimg.com/profile_banners/738080573365702657/1517362906", "default_profile": true, 
"default_profile_image": false, "following": null, "follow_request_sent": null, "notifications": null}, "geo": null, "coordinates": null, "place": null, "contributors": null, "is_quote_status": false, "extended_tweet": {"full_text": "It\u2019s clear. The President is a criminal. He has committed felonies. In the United States of America no one is above the law. There is nothing in the Constitution that says a President can\u2019t be indicted. Donald Trump must be indicted.", "display_text_range": [0, 233], "entities": {"hashtags": [], "urls": [], "user_mentions": [], "symbols": []}}, "quote_count": 312, "reply_count": 776, "retweet_count": 7790, "favorite_count": 29009, "entities": {"hashtags": [], "urls": [{"url": "https://t.co", "expanded_url": "https://twitter.com/i/web/status/1074685908496961537", "display_url": "twitter.com/i/web/status/1\u2026", "indices": [117, 140]}], "user_mentions": [], "symbols": []}, "favorited": false, "retweeted": false, "filter_level": "low", "lang": "en"}, "is_quote_status": false, "quote_count": 0, "reply_count": 0, "retweet_count": 0, "favorite_count": 0, "entities": {"hashtags": [], "urls": [], "user_mentions": [{"screen_name": "robreiner", "name": "Rob Reiner", "id": 738080573365702657, "id_str": "738080573365702657", "indices": [3, 13]}], "symbols": []}, "favorited": false, "retweeted": false, "filter_level": "low", "lang": "en", "timestamp_ms": "1545086325989"}
{"created_at": "Mon Dec 17 22:38:46 +0000 2018", "id": 1074796068015992832, "id_str": "1074796068015992832", "text": "RT @EdKrassen: BREAKING:  Donald Trump has won the \"Golden Idiot\" award from the Heute-Show, a late-night satirical German television progr\u2026", "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>", "truncated": false, "in_reply_to_status_id": null, "in_reply_to_status_id_str": null, "in_reply_to_user_id": null, "in_reply_to_user_id_str": null, "in_reply_to_screen_name": null, "user": {"id": 23458866, "id_str": "23458866", "name": "Prudence Cain", "screen_name": "iggasuz", "location": "Colorado", "url": null, "description": "Passionate about reading.  Folks should try it.  If salty language offends you, you might not want to follow.  Retired medical professional.  Vet.", "translator_type": "none", "protected": false, "verified": false, "followers_count": 2030, "friends_count": 4328, "listed_count": 9, "favourites_count": 77824, "statuses_count": 38918, "created_at": "Mon Mar 09 16:56:40 +0000 2009", "utc_offset": null, "time_zone": null, "geo_enabled": true, "lang": "en", "contributors_enabled": false, "is_translator": false, "profile_background_color": "642D8B", "profile_background_image_url": "http://abs.twimg.com/images/themes/theme10/bg.gif", "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme10/bg.gif", "profile_background_tile": true, "profile_link_color": "1B95E0", "profile_sidebar_border_color": "DA65AD", "profile_sidebar_fill_color": "7AC3EE", "profile_text_color": "3D1957", "profile_use_background_image": true, "profile_image_url": "http://pbs.twimg.com/profile_images/863044678832111616/E8oRd-1l_normal.jpg", "profile_image_url_https": "https://pbs.twimg.com/profile_images/863044678832111616/E8oRd-1l_normal.jpg", "profile_banner_url": "https://pbs.twimg.com/profile_banners/23458866/1541786436", "default_profile": false, "default_profile_image": false, "following": null, 
"follow_request_sent": null, "notifications": null}, "geo": null, "coordinates": null, "place": null, "contributors": null, "retweeted_status": {"created_at": "Mon Dec 17 22:20:00 +0000 2018", "id": 1074791346521681920, "id_str": "1074791346521681920", "text": "BREAKING:  Donald Trump has won the \"Golden Idiot\" award from the Heute-Show, a late-night satirical German televis\u2026 https://t.co", "source": "<a href=\"https://about.twitter.com/products/tweetdeck\" rel=\"nofollow\">TweetDeck</a>", "truncated": true, "in_reply_to_status_id": null, "in_reply_to_status_id_str": null, "in_reply_to_user_id": null, "in_reply_to_user_id_str": null, "in_reply_to_screen_name": null, "user": {"id": 132339474, "id_str": "132339474", "name": "Ed Krassenstein", "screen_name": "EdKrassen", "location": "Fort Myers, FL", "url": "http://edkrassenstein.com", "description": "Co-founder of @HillReporter, Author \"How the People Trumped Ronald Plump\", ed@hillreporter.com, edkrassen@protonmail.com - Twin of @Krassenstein", "translator_type": "none", "protected": false, "verified": false, "followers_count": 851760, "friends_count": 659436, "listed_count": 6654, "favourites_count": 34890, "statuses_count": 40251, "created_at": "Tue Apr 13 00:00:13 +0000 2010", "utc_offset": null, "time_zone": null, "geo_enabled": false, "lang": "en", "contributors_enabled": false, "is_translator": false, "profile_background_color": "000000", "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png", "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png", "profile_background_tile": false, "profile_link_color": "229900", "profile_sidebar_border_color": "000000", "profile_sidebar_fill_color": "000000", "profile_text_color": "000000", "profile_use_background_image": false, "profile_image_url": "http://pbs.twimg.com/profile_images/1032612349633486848/t35esAW6_normal.jpg", "profile_image_url_https": 
"https://pbs.twimg.com/profile_images/1032612349633486848/t35esAW6_normal.jpg", "profile_banner_url": "https://pbs.twimg.com/profile_banners/132339474/1537441493", "default_profile": false, "default_profile_image": false, "following": null, "follow_request_sent": null, "notifications": null}, "geo": null, "coordinates": null, "place": null, "contributors": null, "is_quote_status": false, "extended_tweet": {"full_text": "BREAKING:  Donald Trump has won the \"Golden Idiot\" award from the Heute-Show, a late-night satirical German television program, for the 4th year in a row.\n\nCongratulations Mr. Trump.  You are the mockery of the world, again, and again, and again, and again!", "display_text_range": [0, 257], "entities": {"hashtags": [], "urls": [], "user_mentions": [], "symbols": []}}, "quote_count": 55, "reply_count": 77, "retweet_count": 455, "favorite_count": 1724, "entities": {"hashtags": [], "urls": [{"url": "https://t.co", "expanded_url": "https://twitter.com/i/web/status/1074791346521681920", "display_url": "twitter.com/i/web/status/1\u2026", "indices": [117, 140]}], "user_mentions": [], "symbols": []}, "favorited": false, "retweeted": false, "filter_level": "low", "lang": "en"}, "is_quote_status": false, "quote_count": 0, "reply_count": 0, "retweet_count": 0, "favorite_count": 0, "entities": {"hashtags": [], "urls": [], "user_mentions": [{"screen_name": "EdKrassen", "name": "Ed Krassenstein", "id": 132339474, "id_str": "132339474", "indices": [3, 13]}], "symbols": []}, "favorited": false, "retweeted": false, "filter_level": "low", "lang": "en", "timestamp_ms": "1545086326017"}

I was able to modify the script like this:

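import json

from twython import TwythonStreamer


# First version: only print the text of each incoming tweet.
class MyStreamer(TwythonStreamer):
    def on_success(self, data):
        if 'text' in data:
            print(data['text'])

    def on_error(self, status_code, data):
        print(status_code)


# Revised version (replaces the class above): append each tweet, then
# rebuild test.json once the append handle has been closed.
class MyStreamer(TwythonStreamer):
    def on_success(self, data):
        with open('fetched_tweets.json', 'a') as tf:
            json.dump(data, tf)
            tf.write("\n")

        # Outside the with block, so the new line has been flushed to disk.
        with open('fetched_tweets.json', "r") as f:
            contents = f.read()
        data = [json.loads(item) for item in contents.strip().split('\n')]
        with open('test.json', 'w') as m:
            json.dump(data, m)

        return True

    def on_error(self, status_code, data):
        print(status_code)


# APP_KEY, APP_SECRET, OAUTH_TOKEN and OAUTH_TOKEN_SECRET must be defined beforehand.
stream = MyStreamer(APP_KEY, APP_SECRET,
                    OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
tweet = stream.statuses.filter(track='Keyword')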
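To check the append-then-rebuild logic without Twitter credentials, the body of on_success can be pulled into a plain function and called with fake records. This is only a sketch: `append_and_rebuild` and the temporary paths are illustrative names, not part of Twython.

```python
import json
import os
import tempfile


def append_and_rebuild(record, raw_path, array_path):
    """Append one record as a JSON line, then rewrite array_path as a
    single JSON array holding every record seen so far."""
    with open(raw_path, "a") as raw:
        json.dump(record, raw)
        raw.write("\n")
    # The append handle is closed here, so the new line is on disk.
    with open(raw_path) as raw:
        records = [json.loads(line) for line in raw if line.strip()]
    with open(array_path, "w") as out:
        json.dump(records, out)
    return len(records)


if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    raw_path = os.path.join(tmp_dir, "fetched_tweets.json")
    array_path = os.path.join(tmp_dir, "test.json")
    for fake_tweet in ({"id": 1, "text": "a"}, {"id": 2, "text": "b"}):
        print(append_and_rebuild(fake_tweet, raw_path, array_path))
```

Inside the streamer, on_success would then only need to call this function and return True. Note that the whole file is re-read for every tweet, which may become slow for long-running streams.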
