
Save and read list of dataframes

I have a list of dataframes (each dataframe has one timeline, always starting at 0 and ending at a different value), which I would like to save as a .csv file.

I want to be able to read the .csv file back in its original format, as a list of dataframes.

Since I could not figure out how to save a list of dataframes, I concatenated the list and saved everything as one dataframe:

pd.concat(data).to_csv(csvfile)

For reading the .csv I tried this:

df = pd.read_csv(csvfile)

This gives the location of all zeros:

zero_indices = list(df.loc[df['Unnamed: 0'] == 0].index)

Append the number of rows to this to get the last dataframe:

zero_indices.append(len(df))

Get the ranges, i.e. tuples of consecutive entries in the above list:

zero_ranges = [(zero_indices[i], zero_indices[i+1]) for i in range(len(zero_indices) - 1)]

Extract the dataframes into a list:

X_test = [df.loc[x[0]:x[1] - 1] for x in zero_ranges]

The problem is that each dataframe in the final list keeps the running index of the concatenated dataframe, whereas what I actually want is for the column "Unnamed: 0" to be set as the index of each dataframe.

I am not entirely sure how you wanted to approach this, but this is what I understood from your problem statement. Let me know if it's what you wanted:

We have two dataframes:

>>> import pandas as pd
>>> ee = {"Unnamed : 0" : [0,1,2,3,4,5,6,7,8],"price" : [43,43,14,6,4,2,6,4,2], "time" : [3,4,5,2,5,6,6,3,4], "hour" : [1,1,1,5,4,3,4,5,4]}
>>> one = pd.DataFrame.from_dict(ee)
>>> dd = {"Unnamed : 0" : [0,1,2,3,4,5],"price" : [23,4,32,4,3,234], "time" : [3,2,4,3,2,4], "hour" : [3,4,3,2,4,4]}
>>> two = pd.DataFrame.from_dict(dd)

Which looks like this:看起来像这样:

print(one)
       Unnamed : 0  price  time  hour
    0            0     43     3     1
    1            1     43     4     1
    2            2     14     5     1
    3            3      6     2     5
    4            4      4     5     4
    5            5      2     6     3
    6            6      6     6     4
    7            7      4     3     5
    8            8      2     4     4

print(two)
         Unnamed : 0  price  time  hour
      0            0     23     3     3
      1            1      4     2     4
      2            2     32     4     3
      3            3      4     3     2
      4            4      3     2     4
      5            5    234     4     4

Now combine these two dataframes into a list:

list_dfs = [one,two]
print(list_dfs)

[        Unnamed : 0  price  time  hour
     0            0     43     3     1
     1            1     43     4     1
     2            2     14     5     1
     3            3      6     2     5
     4            4      4     5     4
     5            5      2     6     3
     6            6      6     6     4
     7            7      4     3     5
     8            8      2     4     4,    
        Unnamed : 0  price  time  hour
     0            0     23     3     3
     1            1      4     2     4
     2            2     32     4     3
     3            3      4     3     2
     4            4      3     2     4
     5            5    234     4     4]

Using the DataFrame method set_index():

list_dfs_index = list(map(lambda x : x.set_index("Unnamed : 0"), list_dfs))
print(list_dfs_index)

[                price  time  hour
 Unnamed : 0
    0               43     3     1
    1               43     4     1
    2               14     5     1
    3                6     2     5
    4                4     5     4
    5                2     6     3
    6                6     6     4
    7                4     3     5
    8                2     4     4,              
                 price  time  hour
 Unnamed : 0
    0               23     3     3
    1                4     2     4
    2               32     4     3
    3                4     3     2
    4                3     2     4
    5              234     4     4]
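To connect this back to the CSV round trip in the question, the same set_index() call can be applied to each slice after splitting. The following is a minimal sketch, assuming the file was written with pd.concat(data).to_csv(...) and that every timeline starts at 0. The helper name split_on_zero, the sample data, and the in-memory buffer standing in for csvfile are illustrative only. Note that read_csv names the saved index column 'Unnamed: 0' (no space before the colon).

```python
import io

import pandas as pd


def split_on_zero(df, col="Unnamed: 0"):
    """Split a concatenated frame wherever `col` restarts at 0, using `col` as the index."""
    # Row labels where a new timeline begins (df has a default 0..n-1 index after read_csv).
    starts = list(df.loc[df[col] == 0].index)
    # Add the total row count so the last frame is included.
    bounds = starts + [len(df)]
    return [
        # Slice by position, then move the saved timeline column into the index.
        df.iloc[lo:hi].set_index(col).rename_axis(None)
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]


# Hypothetical sample data; each frame has a default index starting at 0.
data = [
    pd.DataFrame({"price": [43, 43, 14]}),
    pd.DataFrame({"price": [23, 4]}),
]

buffer = io.StringIO()  # stands in for the csvfile path
pd.concat(data).to_csv(buffer)
buffer.seek(0)

frames = split_on_zero(pd.read_csv(buffer))
for frame in frames:
    print(frame)
```

This relies on 0 appearing only at the start of each timeline; a timeline that contains 0 elsewhere would be split in the wrong place.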

Alternatively, you can use the same set_index function to set 'Unnamed: 0' as the index before putting the dataframes into a list.
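Another option, which is not part of the original question or answer, is to label each dataframe when concatenating so that the split no longer depends on finding zeros. The sketch below assumes the keys argument of pd.concat is acceptable for your data; the sample frames and the in-memory buffer are hypothetical stand-ins.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the list of dataframes.
data = [
    pd.DataFrame({"price": [43, 43, 14], "time": [3, 4, 5]}),
    pd.DataFrame({"price": [23, 4], "time": [3, 2]}),
]

buffer = io.StringIO()  # stands in for the csvfile path
# keys= adds an outer index level recording which frame each row came from.
pd.concat(data, keys=range(len(data))).to_csv(buffer)
buffer.seek(0)

# Read both index levels back: level 0 is the frame number, level 1 the timeline.
combined = pd.read_csv(buffer, index_col=[0, 1])

# Group on the outer level and drop it to recover one dataframe per original frame.
restored = [
    frame.droplevel(0).rename_axis(None)
    for _, frame in combined.groupby(level=0)
]

for frame in restored:
    print(frame)
```

Because the frame number is stored explicitly in the file, this also works when a timeline does not start at 0.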


 