
How to find the average of a list of elements embedded in a Pandas data frame column

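I'm cleaning a data frame in which one particular column contains values that are lists. I'm trying to find the average of each list and overwrite the existing column with an int, while preserving the index. I can convert these values into a list successfully and efficiently, but I lose the index values in the process. The code I wrote below is too memory-intensive to execute. Is there simpler code that would work?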

Data: https://docs.google.com/spreadsheets/d/1Od7AhXn9OwLO-SryT--erqOQl_NNAGNuY4QPSJBbI18/edit?usp=sharing

def Average(lst):
    # Convert every element to int, then average.
    # Works for single-element and multi-element lists alike.
    values = [int(v) for v in lst]
    return sum(values) / len(values)

hello = hello[hello.apply([lambda x: mean(x) for x in hello])]

This is the loop I used to convert the values into a list:

df_list1 = []

for x in hello:
    sum1 = 0
    average = 0
    if len(x) == 1:
        for obj in x:
            df_list1.append(int(obj))

    if len(x) > 1:
        for year in x:
            sum1 += int(year)
        average = sum1 / len(x)
        df_list1.append(int(average))
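The loop collects plain Python ints, so the row labels of `hello` are not carried over. One possible way to keep them is to wrap the result in a pandas Series that reuses the original index. This is only a minimal sketch: the sample values and the index labels `a`, `b`, `c` are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical stand-in for the asker's column: lists of year strings
hello = pd.Series(
    [["1990"], ["2000", "2010"], ["1995", "1997", "1999"]],
    index=["a", "b", "c"],
)

df_list1 = []
for x in hello:
    values = [int(v) for v in x]
    # Truncate to int, as the original loop does
    df_list1.append(int(sum(values) / len(values)))

# Reattach the original index so the row labels survive
averaged = pd.Series(df_list1, index=hello.index)
print(averaged)
```

Because the list is built in the same order as `hello`, passing `index=hello.index` lines each average up with its original label.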

Use apply with np.mean

import numpy as np
import pandas as pd

df = pd.DataFrame(data={'listcol': [np.random.randint(1, 10, 5) for _ in range(3)]}, index=['a', 'b', 'c'])

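# replace missing cells with empty lists; np.mean returns NaN on an empty list
# (fillna does not accept a list as the fill value, so use apply instead)
df['listcol'] = df['listcol'].apply(lambda x: x if isinstance(x, (list, np.ndarray)) else [])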

# can use this if all elements in lists are numeric
df['listcol'] = df['listcol'].apply(lambda x: np.mean(x))

# use this instead if list has numbers stored as strings
df['listcol'] = df['listcol'].apply(lambda x: np.mean([int(i) for i in x])) 

Output

>>>df
   listcol
a      5.0
b      5.2
c      4.4
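The steps above can also be folded into one helper, which makes the behaviour for empty lists and missing cells explicit. The following is a sketch under assumptions: the function name `list_mean` and the sample data are hypothetical, and it returns NaN for empty or missing cells rather than raising.

```python
import numpy as np
import pandas as pd

def list_mean(values):
    """Mean of a list of numbers or numeric strings; NaN if empty or missing."""
    # Missing cells (None/NaN) are scalars, not sequences
    if not isinstance(values, (list, tuple, np.ndarray)) or len(values) == 0:
        return np.nan
    return float(np.mean([float(v) for v in values]))

# Made-up data covering numeric lists, string lists, an empty list, and a missing cell
df = pd.DataFrame(
    {"listcol": [[1, 2, 3], ["4", "6"], [], None]},
    index=["a", "b", "c", "d"],
)

# apply works element by element, so the index is left untouched
df["listcol"] = df["listcol"].apply(list_mean)
print(df)
```

Since `apply` returns a Series aligned with the original one, assigning it back to the column keeps every row label in place.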

