I am currently having trouble writing a list of tuples, each holding a list and a label, to CSV. If the list has more than one element, it is for some reason converted to a string.
def storePPTrainingData(ppTrainingData,tweetDataFile):
    import csv
    with open(tweetDataFile,'wb') as csvfile:
        linewriter=csv.writer(csvfile,delimiter=',',quotechar="\"")
        for tweet in ppTrainingData:
            try:
                linewriter.writerow([tweet[0],tweet[1]])
            except Exception,e:
                print e
See ppTrainingData:
ppTrainingData[:1] = [(['bummer', 'got', 'david', 'third', 'day'], 0)]
When written to CSV:
"['bummer', 'got', 'david', 'third', 'day']",0
Any pointers would be great, as I would like to read the list and its label back into my program. ppTrainingData is a list of 20k processed tuples.
In your output CSV file, 0 is also a string. That is what a CSV is: a text file. When reading one, Python, pandas or any other framework might try to convert what it sees into types. In your case 0 might get converted, but the list will not. I used pandas for convenience:
>>> import pandas as pd
>>> tweet = [(['bummer', 'got', 'david', 'third', 'day'], 0)]
>>> df = pd.DataFrame(tweet)
>>> df.to_csv("j.csv")
>>> df = pd.read_csv("j.csv")
>>> df['0'].values[0] # this is just because pandas returns arrays
"['bummer', 'got', 'david', 'third', 'day']" # a string!
>>> lst = eval(df['0'].values[0])
>>> lst, type(lst)
(['bummer', 'got', 'david', 'third', 'day'], <class 'list'>)
When reading, you can try using eval or some other method (ast.literal_eval is a safer choice, since it only parses Python literals), but you cannot avoid writing strings. You may be able to avoid the quotation marks, but that seems like a big hassle.
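As a rough illustration of the point above, here is a minimal sketch of the same round trip using only the standard library, assuming Python 3. The names pp_training_data, buffer and restored are made up for this example, and an in-memory io.StringIO stands in for a real file:

```python
import ast
import csv
import io

pp_training_data = [(['bummer', 'got', 'david', 'third', 'day'], 0)]

# Write: csv.writer stringifies the list, so it lands in a single quoted field.
buffer = io.StringIO()
writer = csv.writer(buffer)
for words, label in pp_training_data:
    writer.writerow([words, label])

# Read: every field comes back as a string, so each one is parsed explicitly.
buffer.seek(0)
restored = [
    (ast.literal_eval(words_field), int(label_field))
    for words_field, label_field in csv.reader(buffer)
]
```

ast.literal_eval only accepts Python literals, so unlike eval it will not execute arbitrary code found in the file.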
Alternatively, you might consider unnesting the sequence:
>>> tweet[0][0] + [tweet[0][1]] # or something similar
['bummer', 'got', 'david', 'third', 'day', 0]
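And then write that to a CSV. When reading it back, you can put everything but the last element into one list and the last element into another variable, with some tuple unpacking magic (values read from a CSV come back as strings, so the label would still need converting, for example with int):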
>>> lst
['bummer', 'got', 'david', 'third', 'day', 0]
>>> *new, zero = lst
>>> new
['bummer', 'got', 'david', 'third', 'day']
>>> zero
0
>>> res = (new, zero)
>>> res
(['bummer', 'got', 'david', 'third', 'day'], 0)
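Applied to rows that actually come from a CSV, the unpacking could look roughly like this. It is a sketch that assumes Python 3 and one tweet per row with the label in the last column; csv_text and restored are illustrative names, not part of the original code:

```python
import csv
import io

# Hypothetical CSV content: one tweet per row, words first, label last.
csv_text = "bummer,got,david,third,day,0\n"

restored = []
for row in csv.reader(io.StringIO(csv_text)):
    *words, label = row                   # all fields but the last are words
    restored.append((words, int(label)))  # fields are strings, so convert the label
```

Because the label is always the last field, rows with different numbers of words can be read the same way.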
I am not sure what you want to write into the CSV file. I would do something like:
for tweet in ppTrainingData:
    # tweet is something like (['bummer', 'got', 'david', 'third', 'day'], 0)
    words, number = tweet
    # words is something like ['bummer', 'got', 'david', 'third', 'day']
    linewriter.writerow(words + [number])
    # we have written 6 columns to the csv file: bummer,got,david,third,day,0
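For reference, a possible Python 3 version of the whole writing function, using the flattened row layout. This is only a sketch: the snake_case names and the temporary file path are assumptions for illustration, and in Python 3 the file is opened in text mode with newline='' rather than with the Python 2 'wb' mode:

```python
import csv
import os
import tempfile

def store_pp_training_data(pp_training_data, path):
    """Write each (words, label) tuple as one flat CSV row."""
    # Python 3: text mode with newline='' replaces the Python 2 'wb' mode.
    with open(path, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        for words, label in pp_training_data:
            writer.writerow(words + [label])

# Illustrative call writing to a throwaway temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'tweets.csv')
store_pp_training_data([(['bummer', 'got', 'david', 'third', 'day'], 0)], path)
```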