Summing data-frame columns from different data-frames

I have a number of time-series .csv files that I am reading into a dataframe (df). I would like to create another data-frame that holds the sum of all of these dataframes.

Examples of the dataframes are as follows. Example df 1:

         date BBG.XASX.ABP.S_price BBG.XASX.ABP.S_pos BBG.XASX.ABP.S_trade \ 
0  2017-09-11            2.8303586                0.0                  0.0   
1  2017-09-12            2.8135189                0.0                  0.0   
2  2017-09-13            2.7829274            86614.0              86614.0   
3  2017-09-14            2.7928042            86614.0                  0.0   
4  2017-09-15            2.8120383            86614.0                  0.0   

  BBG.XASX.ABP.S_cost BBG.XASX.ABP.S_pnl_pre_cost 
0                -0.0                         0.0   
1                -0.0                         0.0    
2    -32.540463966186                         0.0   
3                -0.0           855.4691551999713             
4                -0.0           1665.942337400047  

Example df 2:

        date BBG.XASX.AHG.S_price BBG.XASX.AHG.S_pos BBG.XASX.AHG.S_trade  \
0  2017-09-11            2.6068676                0.0                  0.0   
1  2017-09-12            2.6044785            76439.0              76439.0   
2  2017-09-13   2.6024171000000003            76439.0                  0.0   
3  2017-09-14            2.6139929            76439.0                  0.0   
4  2017-09-15            2.6602836            76439.0                  0.0   

   BBG.XASX.AHG.S_cost BBG.XASX.AHG.S_pnl_pre_cost 
0                 -0.0                         0.0   
1  -26.876303828302497                         0.0   
2                 -0.0          -157.5713545999606   
3                 -0.0           884.8425761999679   
4                 -0.0           3538.414817300014  

Example df 3:

  date BBG.XASX.AGL.S_price BBG.XASX.AGL.S_pos BBG.XASX.AGL.S_trade  \
0  2017-09-18           18.8195983                0.0                  0.0   
1  2017-09-19           18.5104704            40613.0              40613.0   
2  2017-09-20           18.2010515            40613.0                  0.0   
3  2017-09-21           18.2217768            40613.0                  0.0   
4  2017-09-22            17.840112            40613.0                  0.0   

  BBG.XASX.AGL.S_cost BBG.XASX.AGL.S_pnl_pre_cost 
0                -0.0                         0.0                          
1   -101.488374137952                         0.0    
2                -0.0          -12566.42978570005   
3                -0.0           841.7166089001112    
4                -0.0         -15500.552522399928

Adding the example dataframes together, the code would return the following output:

date                 1       2      3              4               5               6
11/09/2017   5.4372262       0      0              0               0               0
12/09/2017   5.4179974   76439  76439              2    -26.87630383               0
13/09/2017   5.3853445  163053  86614              4    -32.54046397    -157.5713546
14/09/2017   5.4067971  163053      0              6               0     1740.311731
15/09/2017   5.4723219  163053      0              8               0     5204.357155
18/09/2017  18.8195983       0      0              0               0               0
19/09/2017  18.5104704   40613  40613   -101.4883741               0               0
20/09/2017  18.2010515   40613      0              0    -12566.42979               0
21/09/2017  18.2217768   40613      0              0     841.7166089               0
22/09/2017   17.840112   40613      0              0    -15500.55252               0

All of the data-frames have the same number of columns in the same order. Note that the dates in the individual df's can differ, and I would like the output to show the totals for each individual day.

The code that generates all the individual df data-frames is:

for subdirname in glob.iglob('C:/Users/stacey/WorkDocs/tradeopt/'+filename+'//BBG*/tradeopt.is-pnl*.lzma', recursive=True):
    df = pd.DataFrame(numpy.zeros((0,27)))

    out = []
    with lzma.open(subdirname, mode='rt') as file:
        print(subdirname)
        for line in file:
            items = line.split(",")
            out.append(items)
            if len(out) > 0:
                a = pd.DataFrame(out[1:], columns=out[0])    

How can I add together the individual df's into a sumdf?

The idea is to convert the date column to a DatetimeIndex and split the column names by '.' into a MultiIndex:

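import glob
import lzma
import pandas as pd

dfs = []
# filename is defined elsewhere in the asker's script
for subdirname in glob.iglob('C:/Users/stacey/WorkDocs/tradeopt/'+filename+'//BBG*/tradeopt.is-pnl*.lzma', recursive=True):
    out = []
    with lzma.open(subdirname, mode='rt') as file:
        print(subdirname)
        for line in file:
            items = line.strip().split(",")
            out.append(items)
    if len(out) > 0:
        a = pd.DataFrame(out[1:], columns=out[0]).set_index('date')
        a.index = pd.to_datetime(a.index)
        # values were read as strings; convert them so they can be summed
        a = a.apply(pd.to_numeric, errors='coerce')
        # assumes names like BBG.XASX.ABP.S_price: split on the last '.'
        # into a MultiIndex of (field, ticker), e.g. ('S_price', 'BBG.XASX.ABP')
        a.columns = a.columns.str.rsplit('.', n=1, expand=True).swaplevel()
        dfs.append(a)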
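
As an aside, the manual line splitting in the loop above could probably be left to pandas. The following is only a sketch, assuming each file is plain comma-separated text with a header row that contains a date column. The helper name read_lzma_csv and the sample data are made up for illustration and are not part of the original answer.

```python
import lzma
import os
import tempfile

import pandas as pd


def read_lzma_csv(path):
    """Read one compressed CSV into a frame indexed by parsed dates."""
    with lzma.open(path, mode="rt") as handle:
        # read_csv infers numeric dtypes, so values arrive as floats, not strings.
        return pd.read_csv(handle, index_col="date", parse_dates=["date"])


# Tiny made-up file so the sketch can run without the original data.
sample = (
    "date,BBG.XASX.ABP.S_price,BBG.XASX.ABP.S_pos\n"
    "2017-09-11,2.5,0.0\n"
    "2017-09-12,2.75,10.0\n"
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.lzma")
    with lzma.open(path, mode="wt") as out:
        out.write(sample)
    frame = read_lzma_csv(path)

print(frame.dtypes)
```

Reading this way yields a DatetimeIndex and numeric columns directly, which matters because strings produced by split(",") cannot be summed as numbers.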

Then use concat and sum by column names:

# sum(level=0, axis=1) was removed in pandas 2.0; group the transposed frame instead
df = pd.concat(dfs, axis=1).T.groupby(level=0).sum().T
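
To make the idea concrete, here is a small self-contained sketch on made-up data. It assumes column names follow the pattern in the question (for example BBG.XASX.ABP.S_price), so that the text after the last '.' identifies the field. The function name sum_by_field and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical miniature versions of the per-instrument frames (made-up values).
df1 = pd.DataFrame(
    {"BBG.XASX.ABP.S_price": [2.0, 3.0], "BBG.XASX.ABP.S_pos": [0.0, 10.0]},
    index=pd.to_datetime(["2017-09-11", "2017-09-12"]),
)
df2 = pd.DataFrame(
    {"BBG.XASX.AHG.S_price": [1.0, 4.0], "BBG.XASX.AHG.S_pos": [5.0, 5.0]},
    index=pd.to_datetime(["2017-09-12", "2017-09-13"]),
)


def sum_by_field(frames):
    """Sum same-named fields across frames, aligning rows on the date index."""
    prepared = []
    for frame in frames:
        frame = frame.copy()
        # Split on the last '.' into (ticker, field), then put the field first.
        frame.columns = frame.columns.str.rsplit(".", n=1, expand=True).swaplevel()
        prepared.append(frame)
    # Column-wise concat aligns on dates; dates missing from a frame become NaN.
    combined = pd.concat(prepared, axis=1)
    # Group columns by field (level 0) and add them up; NaN is skipped.
    return combined.T.groupby(level=0).sum().T


total = sum_by_field([df1, df2])
print(total)
```

In this sample, 2017-09-12 appears in both frames, so its S_price total is 3.0 + 1.0 = 4.0, while the other two dates carry the value from whichever frame contains them.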
