Save and read list of dataframes
I have a list of dataframes (each dataframe has one timeline, always starting with 0 and ending at a different value), which I would like to save as a .csv file.
I want to be able to read the .csv file back in its original format, as a list of dataframes.
Since I could not figure out how to save a list of dataframes, I concatenated the list and saved everything as one dataframe: pd.concat(data).to_csv(csvfile)
For reading the .csv I tried this:
df = pd.read_csv(csvfile)
# This will give the location of all zeros
zero_indices = list(df.loc[df['Unnamed: 0'] == 0].index)
# Append the number of rows to this to get the last dataframe
zero_indices.append(len(df))
# Get the ranges - tuples of consecutive entries in the above list
zero_ranges = [(zero_indices[i], zero_indices[i+1]) for i in range(len(zero_indices) - 1)]
# Extract the dataframes into a list
X_test = [df.loc[x[0]:x[1] - 1] for x in zero_ranges]
The problem I have is that each dataframe in the final list keeps the running row index from the combined file, but what I actually want is for the column "Unnamed: 0" to be set as the index of each dataframe.
I am not entirely sure how you wanted to approach this, but this is what I understood from your problem statement. Let me know if it's what you wanted:
We have two dataframes:
>>> import pandas as pd
>>> ee = {"Unnamed : 0" : [0,1,2,3,4,5,6,7,8],"price" : [43,43,14,6,4,2,6,4,2], "time" : [3,4,5,2,5,6,6,3,4], "hour" : [1,1,1,5,4,3,4,5,4]}
>>> one = pd.DataFrame.from_dict(ee)
>>> dd = {"Unnamed : 0" : [0,1,2,3,4,5],"price" : [23,4,32,4,3,234], "time" : [3,2,4,3,2,4], "hour" : [3,4,3,2,4,4]}
>>> two = pd.DataFrame.from_dict(dd)
Which look like this:
print(one)
   Unnamed : 0  price  time  hour
0            0     43     3     1
1            1     43     4     1
2            2     14     5     1
3            3      6     2     5
4            4      4     5     4
5            5      2     6     3
6            6      6     6     4
7            7      4     3     5
8            8      2     4     4
print(two)
   Unnamed : 0  price  time  hour
0            0     23     3     3
1            1      4     2     4
2            2     32     4     3
3            3      4     3     2
4            4      3     2     4
5            5    234     4     4
Now combine these two dataframes into a list:
list_dfs = [one,two]
print(list_dfs)
[   Unnamed : 0  price  time  hour
0            0     43     3     1
1            1     43     4     1
2            2     14     5     1
3            3      6     2     5
4            4      4     5     4
5            5      2     6     3
6            6      6     6     4
7            7      4     3     5
8            8      2     4     4,
   Unnamed : 0  price  time  hour
0            0     23     3     3
1            1      4     2     4
2            2     32     4     3
3            3      4     3     2
4            4      3     2     4
5            5    234     4     4]
Using the DataFrame's set_index() function:
list_dfs_index = list(map(lambda x : x.set_index("Unnamed : 0"), list_dfs))
print(list_dfs_index)
[             price  time  hour
Unnamed : 0
0               43     3     1
1               43     4     1
2               14     5     1
3                6     2     5
4                4     5     4
5                2     6     3
6                6     6     4
7                4     3     5
8                2     4     4,
             price  time  hour
Unnamed : 0
0               23     3     3
1                4     2     4
2               32     4     3
3                4     3     2
4                3     2     4
5              234     4     4]
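To connect this back to the question, here is a minimal sketch of the full read-and-split step with set_index() applied at the end. The CSV content (csv_text) is made up for illustration and stands in for the file written by pd.concat(data).to_csv(csvfile). Note that read_csv names the unlabeled first column "Unnamed: 0" (no space before the colon), unlike the hand-built example above.

```python
import io
import pandas as pd

# Hypothetical stand-in for the saved file: two timelines,
# each restarting at 0 in the first (unlabeled) column.
csv_text = """,price,time,hour
0,43,3,1
1,43,4,1
2,14,5,1
0,23,3,3
1,4,2,4
"""

df = pd.read_csv(io.StringIO(csv_text))

# Row positions where a timeline restarts at 0, plus the total row count
# so the last dataframe has an end boundary (same idea as in the question).
zero_indices = list(df.loc[df["Unnamed: 0"] == 0].index)
zero_indices.append(len(df))

# Slice the combined dataframe between consecutive boundaries.
frames = [df.iloc[start:stop] for start, stop in zip(zero_indices, zero_indices[1:])]

# Make the "Unnamed: 0" column the index of every dataframe in the list.
frames = [frame.set_index("Unnamed: 0") for frame in frames]
```

This assumes that 0 appears in the first column only at the start of each timeline.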
Alternatively, you can use the same set_index function to set 'Unnamed: 0' as the index before putting the dataframes into a list.
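One way to set the index before splitting, sketched here under the assumption that the file was written with pd.concat(data).to_csv(csvfile), is to pass index_col=0 to read_csv and then group the rows by a running count of zeros in the index. The csv_text sample and the group_ids name are illustrative only.

```python
import io
import pandas as pd

# Hypothetical stand-in for the saved file: two timelines,
# each restarting at 0 in the first (unlabeled) column.
csv_text = """,price,time,hour
0,43,3,1
1,43,4,1
2,14,5,1
0,23,3,3
1,4,2,4
"""

# index_col=0 turns the first CSV column into the index right away,
# so no "Unnamed: 0" column is created.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Every 0 in the index starts a new timeline, so the running count of
# zeros labels each row with the dataframe it belongs to (1, 1, 1, 2, 2).
group_ids = (df.index == 0).cumsum()

# One dataframe per label, in file order, each keeping its own timeline index.
frames = [frame for _, frame in df.groupby(group_ids)]
```

As with the approach in the question, this only works if 0 occurs exactly once per timeline, at its start.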