How to divide two dataframes of different shape with Pandas?

I have two dataframes with the same index but different shapes, and I am unable to divide the columns of df1 by the column of df2.

The expected result is df1 / df2.

df1.head()
                           volume  volume        volume         volume  \
timestamp                                                                
2016-07-24 00:00:00+00:00     NaN     NaN           NaN            NaN   
2016-07-25 00:00:00+00:00     NaN     NaN           NaN            NaN   
2016-07-26 00:00:00+00:00     NaN     NaN           NaN  102720.829507   
2016-07-27 00:00:00+00:00     NaN     NaN  3.729644e+05  398346.509801   
2016-07-28 00:00:00+00:00     NaN     NaN  1.326648e+06  244165.794698   

                           volume        volume  volume        volume  
timestamp                                                              
2016-07-24 00:00:00+00:00     NaN           NaN     NaN  1.734943e+07  
2016-07-25 00:00:00+00:00     NaN           NaN     NaN  1.365341e+07  
2016-07-26 00:00:00+00:00     NaN           NaN     NaN  5.199938e+07  
2016-07-27 00:00:00+00:00     NaN  2.471076e+06     NaN  2.558753e+07  
2016-07-28 00:00:00+00:00     NaN  1.642990e+06     NaN  3.118785e+06

df2.head()

timestamp
2016-07-24 00:00:00+00:00    1.734943e+07
2016-07-25 00:00:00+00:00    1.365341e+07
2016-07-26 00:00:00+00:00    5.210210e+07
2016-07-27 00:00:00+00:00    2.882991e+07
2016-07-28 00:00:00+00:00    6.332589e+06
Freq: D, dtype: float64

df1.shape
Out[2126]: (723, 8)

df2.shape
Out[2127]: (723,)

df1.divide(df2, axis= 'index')
ValueError: operands could not be broadcast together with shapes (5784,) (723,) 

The two dataframes have different structures but the same index.

type(df1)
Out[2143]: pandas.core.frame.DataFrame

type(df2)
Out[2144]: pandas.core.series.Series

I read that I need to reshape one of the dataframes, so I tried the following:

df1.divide(df2.reshape(723,1), axis= 'index')

But it returns an error:

ValueError: Unable to coerce to DataFrame, shape must be (723, 8): given (723, 1)

When I convert df2 with pd.DataFrame(df2), it raises an error:

TypeError: '<' not supported between instances of 'str' and 'int' 

What am I missing, and what should I do?

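When using the divide (or div) function, you should select the corresponding columns from each dataframe. Select the divisor as a single column (a Series) so that it is broadcast across every column of df1 along the index.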

df1[['column_1','column_2']].divide(df2['column_1'], axis='index')

df1[['column_1','column_2']].div(df2['column_1'], axis='index')
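
As a minimal sketch of this idea on data shaped like the question's, the example below divides a DataFrame by a Series that shares its index. The values are synthetic and the column names `volume_a` and `volume_b` are hypothetical; they were chosen to be unique, which the question's columns are not.

```python
import numpy as np
import pandas as pd

# Synthetic daily index, similar in kind to the question's timestamps.
idx = pd.date_range("2016-07-24", periods=5, freq="D", tz="UTC")

# Hypothetical, uniquely named columns (assumption: not the asker's real data).
df1 = pd.DataFrame(
    {
        "volume_a": [np.nan, np.nan, 2.0, 4.0, 6.0],
        "volume_b": [10.0, 20.0, 30.0, 40.0, 50.0],
    },
    index=idx,
)

# The divisor is a Series with the same index as df1.
df2 = pd.Series([10.0, 20.0, 2.0, 4.0, 5.0], index=idx)

# axis="index" matches the Series to df1's rows, so every column of df1
# is divided by the same per-row value.
result = df1.div(df2, axis="index")
print(result)
```

Note that if the divisor is passed as a one-column DataFrame (double brackets), pandas also aligns on column labels, and any column without a matching label comes back as NaN.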

Try this approach. I used a simple example, but let me know if it doesn't work.

import pandas as pd
import numpy as np
from IPython.display import display, HTML

CSS = """
.output {
    flex-direction: row;
}
"""

HTML('<style>{}</style>'.format(CSS))


data1 = {"a":[1.,7.,12.],
         "b":[4.,8.,3.],
         "c":[5.,45.,67.]}
data2 = {"a":[3.],
         "b":[2.],
         "c":[8.]}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2) 
df2 = df2.T
df2 = df2.reset_index()
del df2['index']
display(df1)
display(df2)
display(df1.iloc[:,0:].truediv(df2[0], axis=0)) # this is the part you need: each row of df1 is divided by the matching value in df2


      a    b     c
0   1.0  4.0   5.0
1   7.0  8.0  45.0
2  12.0  3.0  67.0

     0
0  3.0
1  2.0
2  8.0

          a         b          c
0  0.333333  1.333333   1.666667
1  3.500000  4.000000  22.500000
2  1.500000  0.375000   8.375000
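
In the question, every column of df1 is named `volume`. If label-based division keeps failing in that situation, one possible workaround is to divide the underlying NumPy arrays and then rebuild the DataFrame. This is a sketch under that assumption, not a confirmed diagnosis of the original error. The data below is synthetic.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-07-24", periods=3, freq="D", tz="UTC")

# Duplicate column labels, as in the question (values are made up).
df1 = pd.DataFrame(
    [[1.0, 4.0], [7.0, 8.0], [12.0, 3.0]],
    index=idx,
    columns=["volume", "volume"],
)
df2 = pd.Series([2.0, 4.0, 3.0], index=idx)

# Positional division ignores labels, so check that the rows line up first.
assert df1.index.equals(df2.index)

# df2.to_numpy()[:, None] has shape (n, 1) and broadcasts across the columns.
ratio = pd.DataFrame(
    df1.to_numpy() / df2.to_numpy()[:, None],
    index=df1.index,
    columns=df1.columns,
)
print(ratio)
```

Because this bypasses index alignment entirely, it is only safe when both objects are known to share the same row order, which is why the index check comes first.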
