How to find the average of a list of elements embedded in a Pandas data frame column
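I am cleaning a data frame, and one particular column contains values that are lists. I am trying to find the average of each list and update the existing column with an int while preserving the index. I can successfully and efficiently convert the values into a list, but I lose the index values in the process. The code I wrote below is too memory-intensive to run. Is there simpler code that would work?
Data: https://docs.google.com/spreadsheets/d/1Od7AhXn9OwLO-SryT--erqOQl_NNAGNuY4QPSJBbI18/edit?usp=sharing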
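def Average(lst):
    # Sum the elements as ints and divide by the list length
    sum1 = 0
    average = 0
    if len(lst) == 1:
        for obj in lst:
            sum1 = int(obj)
        average = sum1
    if len(lst) > 1:
        for year in lst:
            sum1 += int(year)
        average = sum1/len(lst)
    return average

# apply the function to every list in the column (assumes hello is a Series); Series.apply keeps the index
hello = hello.apply(Average)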
This is the loop I used to convert the values into a list:
df_list1 = []
for x in hello:
    sum1 = 0
    average = 0
    if len(x) == 1:
        for obj in x:
            df_list1.append(int(obj))
    if len(x) > 1:
        for year in x:
            sum1 += int(year)
        average = sum1/len(x)
        df_list1.append(int(average))
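The loop above produces a plain Python list, which is why the index is lost. One possible way to keep it, sketched here under assumptions, is to wrap the result in a new Series that reuses the original index. The `hello` Series below is a made-up stand-in for the real column, and the sketch assumes every list is non-empty (the loop above silently skips empty lists, which would leave the result shorter than the index).

```python
import pandas as pd

# Hypothetical stand-in for the real `hello` column: lists of numeric strings
hello = pd.Series([["2000"], ["2001", "2004"]], index=[10, 20])

# Same arithmetic as the loop: integer-truncated mean of each list
df_list1 = [int(sum(int(v) for v in x) / len(x)) for x in hello]

# Reattach the original index by building a Series from the list
result = pd.Series(df_list1, index=hello.index)
print(result)
```

Because the list is built by iterating `hello` in order, its positions line up with `hello.index` as long as no row is skipped.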
Use apply and np.mean.
import numpy as np
import pandas as pd

df = pd.DataFrame(data={'listcol': [np.random.randint(1, 10, 5) for _ in range(3)]}, index=['a', 'b', 'c'])
# fillna does not accept a list as the fill value, so replace missing entries with an empty list via apply
# (np.mean returns NaN for an empty list)
df['listcol'] = df['listcol'].apply(lambda x: x if isinstance(x, (list, np.ndarray)) else [])
# can use this if all elements in lists are numeric
df['listcol'] = df['listcol'].apply(lambda x: np.mean(x))
# alternative to the line above: use this instead if the lists hold numbers stored as strings
# df['listcol'] = df['listcol'].apply(lambda x: np.mean([int(i) for i in x]))
Output
>>> df
   listcol
a      5.0
b      5.2
c      4.4
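Since the example above uses random integers, its output changes from run to run. As a reproducible check, here is a minimal sketch with fixed data. The Series name `years`, its values, and the helper `list_mean` are made up for illustration and are not part of the original question or answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: lists of year strings with a non-default index
years = pd.Series(
    [["2000"], ["2001", "2003"], ["1990", "1995", "2000"]],
    index=["a", "b", "c"],
)

def list_mean(values):
    """Mean of a list of numeric strings; NaN for an empty list."""
    nums = [int(v) for v in values]
    return float(np.mean(nums)) if nums else float("nan")

# Series.apply returns a new Series that carries the same index
averages = years.apply(list_mean)
print(averages)
```

The result keeps the labels `a`, `b`, `c`, so it can be assigned straight back to the column without realigning anything.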