
Python: Converting a tuple with a list to CSV

I am currently having trouble writing tuples that contain a list to a CSV file. If the list has more than one element, for some reason it is written out as a string.

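def storePPTrainingData(ppTrainingData, tweetDataFile):
    import csv
    # Python 3: open in text mode with newline='' as the csv module expects
    with open(tweetDataFile, 'w', newline='') as csvfile:
        linewriter = csv.writer(csvfile, delimiter=',', quotechar='"')
        for tweet in ppTrainingData:
            try:
                # tweet[0] is the word list, tweet[1] is the label
                linewriter.writerow([tweet[0], tweet[1]])
            except Exception as e:
                print(e)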

Here is a sample of ppTrainingData:

ppTrainingData[:1] = [(['bummer', 'got', 'david', 'third', 'day'], 0)]

When written to CSV:

"['bummer', 'got', 'david', 'third', 'day']",0

Any pointers would be great, as I would like to feed the list and its label back into my program. ppTrainingData is a list of 20k processed tuples.

In your output CSV file, the 0 is also a string. That is what a CSV is: a text file. When reading one, Python, pandas, or any other framework may try to convert what it sees into types. In your case the 0 might get converted back to a number, but the list will not. I used pandas for convenience:

>>> import pandas as pd
>>> tweet = [(['bummer', 'got', 'david', 'third', 'day'], 0)]
>>> df = pd.DataFrame(tweet)
>>> df.to_csv("j.csv")
>>> df = pd.read_csv("j.csv")
>>> df['0'].values[0] # this is just because pandas returns arrays
"['bummer', 'got', 'david', 'third', 'day']" # a string!
>>> lst = eval(df['0'].values[0])
>>> lst, type(lst)
(['bummer', 'got', 'david', 'third', 'day'], <class 'list'>)

When reading, you can use eval or some other method to rebuild the list, but you cannot avoid writing strings. You may be able to get rid of the quotation marks, but that seems like a big hassle.
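As for "some other method": eval will execute whatever expression is in the file, so for data you did not write yourself ast.literal_eval is usually the safer choice, since it only parses Python literals. Below is a minimal sketch of the round trip using just the csv module. The in-memory buffer and the names rows and restored are my own stand-ins for your file and ppTrainingData, not something from your code.

```python
import ast
import csv
import io

# One row in the same shape as ppTrainingData: (list of words, label).
rows = [(['bummer', 'got', 'david', 'third', 'day'], 0)]

# Write to an in-memory buffer (a stand-in for the real file).
# csv.writer stores str(words) in the first column.
buffer = io.StringIO()
writer = csv.writer(buffer)
for words, label in rows:
    writer.writerow([words, label])

# Read back: both fields arrive as strings and must be converted explicitly.
buffer.seek(0)
restored = []
for words_text, label_text in csv.reader(buffer):
    words = ast.literal_eval(words_text)  # parses literals only, runs no code
    restored.append((words, int(label_text)))

print(restored)
```

This keeps your two-column layout as it is and moves all the conversion work to the reading side.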

Alternatively, you might consider unnesting the sequence:

>>> tweet[0][0] + [tweet[0][1]] # or something similar
['bummer', 'got', 'david', 'third', 'day', 0]

And then write that to a CSV. When reading it back, you can put everything but the last element into one list and the last element into another variable, with some tuple-unpacking magic:

>>> lst
['bummer', 'got', 'david', 'third', 'day', 0]
>>> *new, zero = lst
>>> new
['bummer', 'got', 'david', 'third', 'day']
>>> zero
0
>>> res = (new, zero)
>>> res
(['bummer', 'got', 'david', 'third', 'day'], 0)
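One caveat: in the session above the 0 is still an int only because lst never went through a file. After a real round trip every field comes back as a string, so the label needs an explicit int(). Here is a rough sketch of the whole thing with the csv module. The in-memory buffer and the variable names are assumptions on my part, used in place of your actual file.

```python
import csv
import io

rows = [(['bummer', 'got', 'david', 'third', 'day'], 0)]

# Write each tuple as one flat row: one word per column, label last.
buffer = io.StringIO()
writer = csv.writer(buffer)
for words, label in rows:
    writer.writerow(words + [label])

# Read back: star-unpacking splits off the last column as the label.
buffer.seek(0)
restored = []
for row in csv.reader(buffer):
    *words, label_text = row
    restored.append((words, int(label_text)))  # label was read as a string

print(restored)
```

Rows will have different lengths when tweets have different word counts. The csv module handles that fine, but tools that expect a fixed number of columns may not.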

I am not sure what you want to write to the CSV file. I would do something like:

for tweet in ppTrainingData:
    # tweet is something like (['bummer', 'got', 'david', 'third', 'day'], 0)
    words, number = tweet
    # words is something like ['bummer', 'got', 'david', 'third', 'day']
    linewriter.writerow(words + [number])
    # we have written 6 columns to the csv file: bummer,got,david,third,day,0
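If you would rather keep exactly two columns (text and label), another option is to join the words with a space when writing and split them again when reading. This is only a sketch, and it assumes that no single token contains a space, which I cannot tell from your data. The second sample row and the in-memory buffer are made up for illustration.

```python
import csv
import io

rows = [(['bummer', 'got', 'david', 'third', 'day'], 0),
        (['hello'], 1)]  # second row is a made-up example

# Write two columns per row: space-joined words, then the label.
buffer = io.StringIO()
writer = csv.writer(buffer)
for words, label in rows:
    writer.writerow([' '.join(words), label])

# Read back: split the text column on whitespace, convert the label to int.
buffer.seek(0)
restored = [(text.split(), int(label)) for text, label in csv.reader(buffer)]

print(restored)
```

Every row then has the same number of columns, which makes the file easier to load with pandas or a spreadsheet.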
