How to find the average of lists of elements embedded in a Pandas DataFrame column

Question

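I am cleaning a DataFrame, and one particular column contains values that are lists. I am trying to find the average of each list and update the existing column with an int, while preserving the index. I can successfully and efficiently convert these values into a list, but I lose the index values in the process. The code I wrote below is too memory-intensive to run. Is there simpler code that would work?

Data: https://docs.google.com/spreadsheets/d/1Od7AhXn9OwLO-SryT--erqOQl_NNAGNuY4QPSJBbI18/edit?usp=sharing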

def Average(lst):
    sum1 = 0
    average = 0
    if len(x) == 1:
        for obj in x:
            sum1 = int(obj)

    if len(x)>1:
        for year in x:
            sum1 += int(year)
        average = sum1/len(x)

    return mean(average) 

hello = hello[hello.apply([lambda x: mean(x) for x in hello])]

This is the loop I used to convert the values into a list:

df_list1 = []

for x in hello:
        sum1 = 0
        average = 0
        if len(x) == 1:
            for obj in x:
                df_list1.append(int(obj))

        if len(x)>1:
            for year in x:
                sum1 += int(year)
                average = sum1/len(x)
            df_list1.append(int(average))

Answer 1

Use apply and np.mean.

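import numpy as np
import pandas as pd

df = pd.DataFrame(data={'listcol': [np.random.randint(1, 10, 5) for _ in range(3)]}, index=['a', 'b', 'c'])

# fillna() does not accept a list as the fill value, so swap missing cells for an empty list with apply;
# np.mean returns NaN (with a RuntimeWarning) on an empty list
df['listcol'] = df['listcol'].apply(lambda x: x if isinstance(x, (list, np.ndarray)) else [])

# use this if all elements in the lists are numeric
df['listcol'] = df['listcol'].apply(lambda x: np.mean(x))

# use this instead (not in addition) if the lists hold numbers stored as strings
# df['listcol'] = df['listcol'].apply(lambda x: np.mean([int(i) for i in x]))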

Output

>>> df
   listcol
a      5.0
b      5.2
c      4.4
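
For the case described in the question (numbers stored as strings, possibly empty or missing cells, and an integer result wanted), a minimal sketch could look like the following. The sample data, the helper name `list_mean`, and the choice of the nullable `Int64` dtype are assumptions made for illustration and are not taken from the original spreadsheet. Because `Series.apply` returns a Series aligned to the original index, the index labels are kept.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: lists of year strings, one empty list, one missing cell
df = pd.DataFrame(
    {"listcol": [["2001", "2003"], ["1999"], [], np.nan]},
    index=["a", "b", "c", "d"],
)

def list_mean(values):
    """Return the mean of a list-like of numeric strings, or NaN if empty or missing."""
    if not isinstance(values, (list, tuple, np.ndarray)) or len(values) == 0:
        return np.nan
    return float(np.mean([int(v) for v in values]))

# apply works element by element and keeps the original index
means = df["listcol"].apply(list_mean)

# Optional: round and convert to a nullable integer dtype so missing values stay as <NA>
as_int = means.round().astype("Int64")

df["listcol"] = as_int
print(df)
```

Note that `round()` rounds to the nearest integer, whereas the `int(average)` call in the question truncates, so the two can differ when the mean is not a whole number.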

Solution 1
0 2020-04-22 23:37:51
