
How to save results to CSV using a Python scraper?

I found this Python code that scrapes Twitter using custom search queries:

https://github.com/tomkdickinson/Twitter-Search-API-Python/blob/master/TwitterScraper.py

I want to store the results from this code in a CSV file.

I tried adding a CSV writer at around line 245, inside the for loop that prints the tweets matching my search query, but the resulting CSV file is blank:

def save_tweets(self, tweets):
    """
    Just prints out tweets
    :return: True always
    """
    for tweet in tweets:
        # Lets add a counter so we only collect a max number of tweets
        self.counter += 1
        if tweet['created_at'] is not None:
            t = datetime.datetime.fromtimestamp((tweet['created_at']/1000))
            fmt = "%Y-%m-%d %H:%M:%S"
            myCsvRow = log.info("%i [%s] - %s" % (self.counter, t.strftime(fmt), tweet['text']))
            fd = open('document.csv','a')
            fd.write(myCsvRow)
            fd.close()

    return True

Also, there is a comment in the code at around line 170 that says:

@abstractmethod
def save_tweets(self, tweets):
    """
    An abstract method that's called with a list of tweets.
    When implementing this class, you can do whatever you want with these tweets.
    """

How can I use this class to save the tweets?

Your problem appears to be the line:

myCsvRow = log.info("%i [%s] - %s" % (self.counter, t.strftime(fmt), tweet['text']))

Looking at the code on the GitHub page you're using, I can see that log is a Python logger. The purpose of log.info is to write the string it is given somewhere (for example the console, a file, or any combination of these or other places). It does not return the formatted string; it returns None, so myCsvRow never holds the text you want to write.
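To check this behaviour yourself, here is a minimal sketch. The logger name "demo" and the sample values are made up for illustration and are not taken from the scraper:

```python
import logging

# Basic console logging so the message is visibly emitted.
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("demo")  # hypothetical logger name

# Same formatting pattern as in the question, with made-up values.
message = "%i [%s] - %s" % (1, "2017-01-01 00:00:00", "some tweet text")

# log.info emits the message but hands nothing back to the caller.
result = log.info(message)

print(result is None)  # True
```

The formatted text lives in message, not in the return value of log.info, so message is what should be written to the file.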

What you want is more likely:

myCsvRow = "%i [%s] - %s" % (self.counter, t.strftime(fmt), tweet['text'])

A couple of notes on that, though:

(1) You are not putting commas between the entries, which is what a CSV (comma-separated values) file normally expects, and

(2) It's actually kind of risky to write out a CSV line by hand when one of your fields is a text field that could contain commas. If you naively write out the text as-is, a comma in the tweet itself would make whatever is reading the CSV think that the row has extra fields. Luckily, Python comes with a csv module that handles the quoting for you and helps you avoid these kinds of problems.
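As a rough sketch of what that could look like, the function below mirrors the loop from the question but writes each tweet with csv.writer. The function name append_tweets_to_csv and its parameters are hypothetical; it assumes each tweet is a dict with a 'created_at' timestamp in milliseconds and a 'text' field, as in the question's code:

```python
import csv
import datetime


def append_tweets_to_csv(tweets, path="document.csv", start_count=0):
    """Append one CSV row per dated tweet; return the updated counter.

    Hypothetical helper: assumes tweet['created_at'] is in milliseconds
    (or None) and tweet['text'] is a string, as in the question.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    counter = start_count
    # newline="" lets the csv module manage line endings itself.
    with open(path, "a", newline="", encoding="utf-8") as fd:
        writer = csv.writer(fd)
        for tweet in tweets:
            counter += 1
            if tweet["created_at"] is not None:
                t = datetime.datetime.fromtimestamp(tweet["created_at"] / 1000)
                # Fields containing commas, quotes or newlines are
                # quoted automatically by the writer.
                writer.writerow([counter, t.strftime(fmt), tweet["text"]])
    return counter
```

Inside save_tweets, the body of the for loop could be replaced by a call such as self.counter = append_tweets_to_csv(tweets, 'document.csv', self.counter). Opening the file once per batch, rather than once per tweet, also avoids repeated open and close calls.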
