Python: saving a file to CSV

Question

I have the following code for Twitter tweets. It is supposed to process the data and then save it to a new file.

Here is the code:

#import regex
import re

#start process_tweet
def processTweet(tweet):
    # process the tweets

    #Convert to lower case
    tweet = tweet.lower()
    #Convert www.* or https?://* to URL
    tweet = re.sub(r'((www\.[^\s]+)|(https?://[^\s]+))','URL',tweet)
    #Convert @username to AT_USER
    tweet = re.sub('@[^\s]+','AT_USER',tweet)
    #Remove additional white spaces
    tweet = re.sub('[\s]+', ' ', tweet)
    #Replace #word with word
    tweet = re.sub(r'#([^\s]+)', r'\1', tweet)
    #trim
    tweet = tweet.strip('\'"')
    return tweet
#end

#Read the tweets one by one and process it
input = open('withoutEmptylines.csv', 'rb')
output = open('editedTweets.csv','wb')

line = input.readline()

while line:
    processedTweet = processTweet(line)
    print (processedTweet)
    output.write(processedTweet)
    line = input.readline()

input.close()
output.close()

The data in my input file looks like this, with each tweet on its own line:

She wants to ride my BMW the go for a ride in my BMW lol http://t.co/FeoNg48AQZ
BMW Sees U.S. As Top Market For 2015 i8 http://t.co/kkFyiBDcaP

My function works well, but I am not happy with the output, which looks like this:

she wants to ride my bmw the go for a ride in my bmw lol URL rt AT_USER Ðun bmw es mucho? yo: bmw. -AT_USER veeergaaa!. hahahahahahahahaha nos hiciste la noche caray!

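So it puts everything on a single line instead of putting each tweet on its own line, as in the input file.

Does anyone have an idea how to get each tweet on its own line?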

Answer 1

With a sample file like this:

tweet number one
tweet number two
tweet number three

this code:

# Each line keeps its trailing newline, and print adds one more
with open('tweets.txt') as file:
    for line in file:
        print(line)

produces the following output:

tweet number one

tweet number two

tweet number three
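The blank lines in that output appear because each line read from the file still ends with its own newline, and printing adds a second one. A minimal sketch that makes this visible, assuming Python 3 and using `io.StringIO` as a stand-in for the sample file (the name `fake_file` is illustrative only):

```python
import io

# io.StringIO stands in for open('tweets.txt') so the sketch is self-contained
fake_file = io.StringIO("tweet number one\ntweet number two\ntweet number three\n")

lines = list(fake_file)
for line in lines:
    # repr() makes the trailing '\n' on each line visible
    print(repr(line))
```

Every element of `lines` ends in `'\n'`, which confirms that the line endings survive the read.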

Python is reading the line endings just fine, but your script is replacing them through its regex substitution.

This regex substitution:

tweet = re.sub('[\s]+', ' ', tweet)

turns every run of whitespace characters (including tabs and newlines) into a single space.
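A short sketch of that effect on a single line, assuming Python 3 (the sample string is made up for illustration):

```python
import re

# A line as read from a file: it still carries its trailing newline
line = "tweet  number\tone\n"

# \s matches spaces, tabs and newlines, so the final "\n" is replaced as well
collapsed = re.sub(r'[\s]+', ' ', line)

print(repr(collapsed))  # 'tweet number one '
```

The newline at the end has become a trailing space, so consecutive tweets written to the output file run together on one line.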

Either append a line ending to the tweet before writing it out, or change the regex so that it does not replace line endings, like this:

tweet = re.sub('[ ]+', ' ', tweet)

Edit: I had left my test substitution command in there. The suggestion has been fixed.
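The first option, appending a line ending before writing, could look roughly like the sketch below. It assumes Python 3 and UTF-8 text files, which the question does not state. The helper names `process_lines` and `process_file` are hypothetical, and the extra `.strip()` that drops the leftover trailing space is an addition not present in the original function.

```python
import re

def process_tweet(tweet):
    """Normalise one tweet; the result never contains a newline."""
    tweet = tweet.lower()
    tweet = re.sub(r'((www\.[^\s]+)|(https?://[^\s]+))', 'URL', tweet)
    tweet = re.sub(r'@[^\s]+', 'AT_USER', tweet)
    tweet = re.sub(r'\s+', ' ', tweet)          # also swallows the trailing "\n"
    tweet = re.sub(r'#([^\s]+)', r'\1', tweet)
    return tweet.strip().strip('\'"')           # .strip() is an added step

def process_lines(lines):
    """Return the processed tweets, each terminated by exactly one newline."""
    return [process_tweet(line) + '\n' for line in lines]

def process_file(in_path, out_path):
    # Text mode with UTF-8 is an assumption; adjust to the real file encoding
    with open(in_path, encoding='utf-8') as src, \
         open(out_path, 'w', encoding='utf-8') as dst:
        dst.writelines(process_lines(src))

# The two sample tweets from the question
sample = [
    "She wants to ride my BMW the go for a ride in my BMW lol http://t.co/FeoNg48AQZ\n",
    "BMW Sees U.S. As Top Market For 2015 i8 http://t.co/kkFyiBDcaP\n",
]
result = process_lines(sample)
print(''.join(result), end='')
```

Calling `process_file('withoutEmptylines.csv', 'editedTweets.csv')` would then write one processed tweet per line. Re-adding the newline explicitly keeps the whitespace-collapsing regex unchanged, at the cost of one extra step per tweet.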
