
Write lists to a pandas DataFrame and then to CSV, read the DataFrame back from CSV, and convert it to lists again without getting strings

Originally I had a list of lists, and each inner list contains tuples of strings (from some computations). I want to save them for later, so I don't have to redo the computations and can just read the CSV.

 L = [l1,l2,...]
 l1 = [('a','b'), ('c','d'),...]
 l2 = [('e','f'), ('g','h'),...]...

I converted it to a pandas DataFrame:

 import pandas as pd
 df = pd.DataFrame(L)
 df.to_csv('MyLists.csv', sep=";")

So each list l is saved as a row in the CSV. Some time later I want to use the lists saved in the CSV again, so I imported pandas again and did:

readdf = pd.read_csv('MyLists.csv', delimiter = ";")
newList = readdf.values.tolist()

The problem is that every tuple is now a string itself, i.e. every list in newList looks as follows:

l1 = ["('a','b')", "('c','d')",...]

When I look at the CSV in a text editor, it looks correct, something like:

('a','b');('c','d');... 

I tried to read it directly with:

import csv

newList = []
with open('MyLists.csv') as f:    
    reader = csv.reader(f, delimiter=";")
    for row in reader:
        newList.append(row)

But the problem is the same. So how can I get rid of the extra " ' "?

I think you need to convert the strings to tuples, because all data in a CSV file are strings:

import ast
import pandas as pd

l1 = [('a','b'), ('c','d')]
l2 = [('e','f'), ('g','h')]
L = [l1,l2]

df = pd.DataFrame(L)
print (df)
        0       1
0  (a, b)  (c, d)
1  (e, f)  (g, h)

df.to_csv('MyLists.csv', sep=";")

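# index_col=0 keeps the saved index out of the data columns
readdf = pd.read_csv('MyLists.csv', delimiter = ";", index_col=0)
# literal_eval parses each cell, e.g. "('a', 'b')", back into a tuple
# (on pandas >= 2.1, DataFrame.map is the newer name for applymap)
newList = readdf.applymap(ast.literal_eval).values.tolist()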
print (newList)
[[('a', 'b'), ('c', 'd')], [('e', 'f'), ('g', 'h')]]
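
The same idea should also work for the plain csv.reader attempt from the question, since the reader returns every cell as a string too. The following is only a minimal sketch: the scratch file name and the variable names path and restored are assumptions made for illustration, and the file is written with csv.writer (no header or index column) rather than with to_csv.

```python
import ast
import csv
import os
import tempfile

L = [[('a', 'b'), ('c', 'd')], [('e', 'f'), ('g', 'h')]]

# Hypothetical scratch file, so the sketch does not overwrite MyLists.csv
path = os.path.join(tempfile.mkdtemp(), 'tuples.csv')

# csv.writer stores each tuple as its string form, e.g. ('a', 'b')
with open(path, 'w', newline='') as f:
    csv.writer(f, delimiter=';').writerows(L)

# csv.reader yields strings; ast.literal_eval parses each one back into a tuple
with open(path, newline='') as f:
    reader = csv.reader(f, delimiter=';')
    restored = [[ast.literal_eval(cell) for cell in row] for row in reader]

print(restored)
```

ast.literal_eval only evaluates Python literals (strings, numbers, tuples, lists and so on), so it is a safer choice than eval for text read from a file.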

But I think it is better to use pickle to save your data, with to_pickle / read_pickle:

df.to_pickle('MyLists.pkl')
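
A full round trip might look like the sketch below. It assumes the same example data as above; the temporary directory and the names pkl_path, restored_df and new_list are made up for illustration. Because pickle stores the Python objects themselves, the tuples should come back as tuples without any parsing step.

```python
import os
import tempfile

import pandas as pd

L = [[('a', 'b'), ('c', 'd')], [('e', 'f'), ('g', 'h')]]
df = pd.DataFrame(L)

# Hypothetical location for the pickle file
pkl_path = os.path.join(tempfile.mkdtemp(), 'MyLists.pkl')
df.to_pickle(pkl_path)

# read_pickle restores the DataFrame with the original cell objects
restored_df = pd.read_pickle(pkl_path)
new_list = restored_df.values.tolist()
print(new_list)
```

Note that pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.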
