
Indexing pandas dataframes inside pandas dataframes with python

I have a series of dataframes inside a dataframe.

The top-level dataframe is structured like this:

    24hr   48hr   72hr
D1  x      x      x
D2  x      x      x 
D3  x      x      x

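In each case, x is a dataframe created using pandas.read_excel().

Each x dataframe has a column titled 'Average Vessels Length', and that column has three entries (i.e. rows, indices).

What I want to return is the mean of the 'Average Vessels Length' column. I'm also interested in how to return a specific cell in that column. I know there is a .mean method for pandas dataframes, but I can't figure out the indexing syntax to use it.

Below is an example: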

import pandas as pd

a = {'Image name' : ['Image 1', 'Image 2', 'Image 3'], 'threshold' : [20, 25, 30], 'Average Vessels Length' : [14.2, 22.6, 15.7] }
b = pd.DataFrame(a, columns=['Image name', 'threshold', 'Average Vessels Length'])

c = pd.DataFrame(index=['D1','D2','D3'], columns=['24hr','48hr','72hr'])
c['24hr']['D1'] = a
c['48hr']['D1'] = a
c['72hr']['D1'] = a
c['24hr']['D2'] = a
c['48hr']['D2'] = a
c['72hr']['D2'] = a
c['24hr']['D3'] = a
c['48hr']['D3'] = a
c['72hr']['D3'] = a

This returns the mean of the values in the 'Average Vessels Length' column:

print(b['Average Vessels Length'].mean())

This returns all the values in 24hr, D1, 'Average Vessels Length':

print(c['24hr']['D1']['Average Vessels Length'])

This does not work:

print(c['24hr']['D1']['Average Vessels Length'].mean())

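And I can't figure out how to access any specific value in c['24hr']['D1']['Average Vessels Length'].

Ultimately, I want to take the mean from each column of Dx['Average Vessels Length'].mean() and divide it by the corresponding D1['Average Vessels Length'].mean().

Any help would be greatly appreciated.

I'm assuming that, since you say each element of the big dataframe is itself a dataframe, the example data should be: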

import pandas as pd

a = {'Image name' : ['Image 1', 'Image 2', 'Image 3'], 'threshold' : [20, 25, 30], 'Average Vessels Length' : [14.2, 22.6, 15.7] }
b = pd.DataFrame(a, columns=['Image name', 'threshold', 'Average Vessels Length'])

c = pd.DataFrame(index=['D1','D2','D3'], columns=['24hr','48hr','72hr'])
c['24hr']['D1'] = b
c['48hr']['D1'] = b
c['72hr']['D1'] = b
c['24hr']['D2'] = b
c['48hr']['D2'] = b
c['72hr']['D2'] = b
c['24hr']['D3'] = b
c['48hr']['D3'] = b
c['72hr']['D3'] = b
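The chained form `c['24hr']['D1'] = b` depends on the intermediate column being a view of `c`, and newer pandas releases with copy-on-write may not write through it. Here is a minimal sketch of one alternative, assuming only NumPy and pandas: fill a NumPy object array cell by cell and wrap it in the outer frame. The names `grid`, `nested`, `inner` and `d1_24_mean` are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

a = {'Image name': ['Image 1', 'Image 2', 'Image 3'],
     'threshold': [20, 25, 30],
     'Average Vessels Length': [14.2, 22.6, 15.7]}
b = pd.DataFrame(a, columns=['Image name', 'threshold', 'Average Vessels Length'])

rows = ['D1', 'D2', 'D3']
cols = ['24hr', '48hr', '72hr']

# An object array holds one arbitrary Python object per element,
# so each element can be an inner DataFrame.
grid = np.empty((len(rows), len(cols)), dtype=object)
for i in range(len(rows)):
    for j in range(len(cols)):
        grid[i, j] = b.copy()  # every cell gets its own copy of b

nested = pd.DataFrame(grid, index=rows, columns=cols)

# Because the cell now holds a DataFrame (not the dict `a`),
# column selection returns a Series and .mean() is available.
inner = nested.loc['D1', '24hr']
d1_24_mean = inner['Average Vessels Length'].mean()
```

In the question's version the cell held the dict `a`, so `['Average Vessels Length']` returned a plain list, which has no `.mean()` method.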

To get the mean of each cell, you can use applymap, which maps a function over every cell of a DataFrame:

cell_means = c.applymap(lambda e: e['Average Vessels Length'].mean())
cell_means
Out[14]: 
    24hr  48hr  72hr
D1  17.5  17.5  17.5
D2  17.5  17.5  17.5
D3  17.5  17.5  17.5
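`DataFrame.applymap` was deprecated in pandas 2.1 in favor of `DataFrame.map`, so the call above may warn or be unavailable depending on the installed version. The sketch below picks whichever method exists. The helper name `map_cells` and the array-based construction of `c` are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

b = pd.DataFrame({'Image name': ['Image 1', 'Image 2', 'Image 3'],
                  'threshold': [20, 25, 30],
                  'Average Vessels Length': [14.2, 22.6, 15.7]})

# Build the outer frame with one inner DataFrame per cell.
grid = np.empty((3, 3), dtype=object)
for i in range(3):
    for j in range(3):
        grid[i, j] = b.copy()
c = pd.DataFrame(grid, index=['D1', 'D2', 'D3'], columns=['24hr', '48hr', '72hr'])


def map_cells(frame, func):
    """Apply func to every cell, preferring DataFrame.map when it exists."""
    if hasattr(pd.DataFrame, 'map'):
        return frame.map(func)       # pandas 2.1 and later
    return frame.applymap(func)      # older pandas


cell_means = map_cells(c, lambda e: e['Average Vessels Length'].mean())
```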

Once you have these, you can get the column means and so on, and then go on to normalize by the mean:

col_means = cell_means.mean(axis=0)
col_means
Out[11]: 
24hr    17.5
48hr    17.5
72hr    17.5
dtype: float64
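The question also asks how to read a single value and how to divide each mean by the corresponding D1 mean. A sketch of both follows. It assumes `cell_means` has the shape printed above, and rebuilds it here from those printed values so the block runs on its own. The names `normalized`, `inner`, `first_value` and `second_value` are illustrative.

```python
import pandas as pd

# Stand-in for the DataFrame stored in one cell, e.g. c.loc['D1', '24hr'].
inner = pd.DataFrame({'Image name': ['Image 1', 'Image 2', 'Image 3'],
                      'threshold': [20, 25, 30],
                      'Average Vessels Length': [14.2, 22.6, 15.7]})

# Cell means as printed above (every cell holds the same sample data).
cell_means = pd.DataFrame(17.5,
                          index=['D1', 'D2', 'D3'],
                          columns=['24hr', '48hr', '72hr'])

# Divide every row by the D1 row; the D1 Series is aligned on column labels.
normalized = cell_means.div(cell_means.loc['D1'], axis='columns')

# A specific entry of the inner column: by position with .iloc,
# or by row label and column name with .loc.
first_value = inner['Average Vessels Length'].iloc[0]
second_value = inner.loc[1, 'Average Vessels Length']
```

With the sample data every cell is identical, so every ratio is 1.0. With real measurements the D1 row becomes 1.0 and the D2 and D3 rows are expressed relative to it.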
