Tweepy: storing tweet text in a Python pandas DataFrame

I am following an online tutorial (http://adilmoujahid.com/posts/2014/07/twitter-analytics/) and I am stuck even though I wrote the same Python script. I am not very proficient in Python and am having a hard time understanding the documentation on map (which is used in the tutorial). Right now I am getting "ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series" and cannot figure out a fix. My understanding is that the DataFrame will have 3 columns: one with all the tweets, one for the tweets that mention Facebook, and one for the tweets that mention Microsoft. I also realize that the tutorial is two years old, so maybe some of the syntax is deprecated? Any help is appreciated.

import json 
import pandas as pd 
import re 

tweets_data_path = "Desktop/twit_dat/tweet1.txt"
tweets_data = []

tweets_file = open(tweets_data_path, "r")
for line in tweets_file:
    try:
        tweet = json.loads(line)
        tweets_data.append(tweet) 
    except:
        continue


tweets = pd.DataFrame()


tweets['text'] = map(lambda tweet: tweet['text'], tweets_data)
tweets['Facebook'] = tweets['text'].apply(lambda tweet: word_in_text('Facebook', tweet))
tweets['Microsoft'] = tweets['text'].apply(lambda tweet: word_in_text('Microsoft', tweet))



def word_in_text(word,text):
     if text == None:
        return False
     word = word.lower()
     text = text.lower() 
     match = re.search(word,text)
     if match:
        return True
     else:
        return False

Here is a sample of the data I am using: http://charon.kean.edu/~jonathan/exampledata.txt

Your pandas version may be older than mine. I replicated the code and it works fine in my environment. See if this issue is helpful:
https://github.com/pandas-dev/pandas/issues/5632
(This is more of a comment, but I don't have the privilege to comment.)
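
Another possible explanation, offered as an assumption rather than something confirmed in this thread: the tutorial appears to target Python 2, where map returns a list. In Python 3, map returns a lazy iterator, and some pandas versions cannot convert that into a Series when assigning a column to an empty DataFrame. The sketch below sidesteps this by building an explicit list first, and it also defines word_in_text before it is used. The sample_lines records are made up for illustration and stand in for the contents of tweet1.txt.

```python
import json
import re

import pandas as pd

# Hypothetical sample lines standing in for the contents of tweet1.txt.
sample_lines = [
    '{"text": "I love Facebook"}',
    '{"text": "Microsoft released an update"}',
    'not valid json',                      # skipped by the except clause
    '{"delete": {"status": {"id": 1}}}',   # a record with no "text" field
]


def word_in_text(word, text):
    """Return True if word occurs in text, ignoring case."""
    if text is None:
        return False
    return re.search(re.escape(word), text, re.IGNORECASE) is not None


# Parse each line, skipping anything that is not valid JSON.
tweets_data = []
for line in sample_lines:
    try:
        tweets_data.append(json.loads(line))
    except ValueError:
        continue

# Build a real list (not a lazy map object) and keep only records
# that actually carry a "text" field.
texts = [tweet['text'] for tweet in tweets_data if 'text' in tweet]

tweets = pd.DataFrame({'text': texts})
tweets['Facebook'] = tweets['text'].apply(lambda t: word_in_text('Facebook', t))
tweets['Microsoft'] = tweets['text'].apply(lambda t: word_in_text('Microsoft', t))

print(tweets)
```

If you prefer to keep the tutorial's structure, wrapping the original call as list(map(...)) should have the same effect as the list comprehension here. Note that the Facebook and Microsoft columns hold True/False flags rather than the tweet text itself.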
