Tweepy - Error 144 when populating a pandas dataframe column with tweet strings

I'm populating some rows in a dataframe using the Twitter ID. The first time I ran the script without the except, I got this error: [{'code': 144, 'message': 'No status found with that ID.'}] I understand that this might be because someone deleted the tweet, or for some other reason. However, I need to keep going!

So I used except: pass, but then it doesn't return anything at all. All the rows are empty. I've been working hard on this, but I don't know how to solve it.

My dataframe:

        tweetID         text                      pageType
index
id1                     My code is not working    http://blablabla.com
id2     451864165416    NaN                       twitter
id3     849849849844    NaN                       twitter

Here is the code that doesn't return anything:

try:
    if (df['pageType'] == 'twitter').any:
        df['text'] = df.tweetID.apply(lambda x: api.get_status(x).text)
except:
    pass

That's it! Thanks a lot!

I'd recommend a boolean index + loc + apply:

# Select only the Twitter rows, then fill their 'text' from the tweet lookup.
mask = df['pageType'] == 'twitter'
df.loc[mask, 'text'] = df.loc[mask, 'tweetID']\
                           .apply(lambda x: api.get_status(x).text)
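
To see what the masked assignment does without calling Twitter, here is a minimal sketch. The `fake_get_status` function and the sample values are assumptions made for illustration only; they stand in for `api.get_status(x).text` and are not part of Tweepy.

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the one in the question (values are illustrative).
df = pd.DataFrame(
    {
        "tweetID": [np.nan, 451864165416, 849849849844],
        "text": ["My code is not working", np.nan, np.nan],
        "pageType": ["http://blablabla.com", "twitter", "twitter"],
    },
    index=["id1", "id2", "id3"],
)

def fake_get_status(tweet_id):
    # Hypothetical stand-in for api.get_status(tweet_id).text
    return f"tweet {int(tweet_id)}"

# Only rows where the mask is True are read and written.
mask = df["pageType"] == "twitter"
df.loc[mask, "text"] = df.loc[mask, "tweetID"].apply(fake_get_status)
```

Rows outside the mask keep their existing text, so the non-Twitter row is left untouched.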

The problem is that your try/except wraps the entire apply call. The first ID that raises aborts the apply, so the assignment to df['text'] never happens, and except: pass then hides the error. A try/except like this normally belongs around each individual call, for example inside a for-loop. Here you can instead create a custom function that catches the error for each invalid tweetID value.

def GetStuff(value):
    # Look up a single tweet; return a marker instead of raising on failure.
    try:
        return api.get_status(value).text
    except Exception:
        return "ERROR"

df['text'] = df.tweetID.apply(GetStuff)
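
The difference between the two placements of try/except can be reproduced offline. The sketch below uses a hypothetical `fake_get_status` that raises for one ID, standing in for the error 144 case; `safe_get_status` and the sample IDs are likewise made up for illustration and do not come from Tweepy.

```python
import pandas as pd

ids = pd.Series([1, 2, 3], index=["id1", "id2", "id3"])

def fake_get_status(tweet_id):
    # Hypothetical stand-in for api.get_status(tweet_id).text;
    # ID 2 plays the role of a deleted tweet.
    if tweet_id == 2:
        raise LookupError("No status found with that ID.")
    return f"tweet {tweet_id}"

# try/except around the whole apply: the first failure aborts everything,
# so no result is ever assigned.
whole = None
try:
    whole = ids.apply(fake_get_status)
except LookupError:
    pass

# try/except inside the function: only the failing row is affected.
def safe_get_status(tweet_id):
    try:
        return fake_get_status(tweet_id)
    except LookupError:
        return "ERROR"

per_row = ids.apply(safe_get_status)
```

Here `whole` is still `None` after the first block, while `per_row` holds a value for every ID, with the marker string in place of the failed lookup.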

To meet the conditions in the comments:

Option 1

def GetStuff(value):
    try:
        return api.get_status(value).text
    except Exception:
        return "ERROR"

# Keep tweetID only where pageType is 'twitter', drop the rest, then look up.
is_twitter = df.pageType == 'twitter'
df['text'] = df.tweetID.where(is_twitter).dropna().apply(GetStuff)

This applies the function only where pageType == 'twitter'. The other rows become NaN, which you can replace with some other text using fillna().
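
As a small illustration of the fillna() step, here is a sketch on made-up data. The placeholder text 'NotTwitter' is an assumption borrowed from the label used in Option 2; any string would do.

```python
import numpy as np
import pandas as pd

# Illustrative result column: one fetched tweet, one skipped row, one failure.
text = pd.Series(["tweet A", np.nan, "ERROR"], index=["id1", "id2", "id3"])

# Replace the NaN left by the skipped row with a readable placeholder.
filled = text.fillna("NotTwitter")
```

Note that fillna() only touches missing values, so rows marked "ERROR" stay distinguishable from rows that were never looked up.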

Option 2

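Build the condition into the GetStuff() function. Since the function now needs pageType as well as tweetID, apply it row-wise.

def GetStuff(row):
    # Only call the API for rows whose pageType is 'twitter'.
    if row['pageType'] == 'twitter':
        try:
            return api.get_status(row['tweetID']).text
        except Exception:
            return "ERROR"
    else:
        return 'NotTwitter'

df['text'] = df.apply(GetStuff, axis=1)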
