I have a list of lists:
list_of_lists = []
list_1 = [-1, 0.67, 0.23, 0.11]
list_2 = [-1]
list_3 = [0.54, 0.24, -1]
list_4 = [0.2, 0.85, 0.8, 0.1, 0.9]
list_of_lists.append(list_1)
list_of_lists.append(list_2)
list_of_lists.append(list_3)
list_of_lists.append(list_4)
The position is meaningful. I want to return a list that contains the mean per position, excluding -1. That is, I want:
[(0.54+0.2)/2, (0.67+0.24+0.85)/3, (0.23+0.8)/2, (0.11+0.1)/2, 0.9/1]
which is actually:
[0.37, 0.5866666666666667, 0.515, 0.10500000000000001, 0.9]
How can I do this in a Pythonic way?
EDIT:
I am working with Python 2.7, and I am not looking for the mean of each list; instead, I'm looking for the mean of 'all list elements at position 0 excluding -1', and the mean of 'all list elements at position 1 excluding -1', etc.
The reason I had:
[(0.54+0.2)/2, (0.67+0.24+0.85)/3, (0.23+0.8)/2, (0.11+0.1)/2, 0.9/1]
is that values in position 0 are -1, -1, 0.54, and 0.2, and I want to exclude -1; position 1 has 0.67, 0.24, and 0.85; position 2 has 0.23, -1, and 0.8, etc.
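A solution without third-party libraries (this needs Python 3.4+; on Python 2.7, itertools.izip_longest takes the place of zip_longest, and the statistics module is not in the standard library):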
from itertools import zip_longest
from statistics import mean
def f(lst):
    return [mean(x for x in t if x != -1) for t in zip_longest(*lst, fillvalue=-1)]
>>> f(list_of_lists)
[0.37, 0.5866666666666667, 0.515, 0.10500000000000001, 0.9]
It uses itertools.zip_longest with fillvalue set to -1 to "transpose" the list and set missing values to -1 (these are ignored at the next step). Then a generator expression and statistics.mean are used to filter out the -1s and get the average.
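Since the question mentions Python 2.7, here is a rough sketch of the same idea that should run on both 2.7 and 3.x. It swaps statistics.mean for a plain sum/len, so the last digits may differ slightly from the output above. The names positional_means, SENTINEL and the empty parameter are made up for illustration; returning None for a position that has no valid values is an assumption, not something the question specifies.

```python
try:
    from itertools import zip_longest  # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest  # Python 2.7

SENTINEL = -1  # value that marks "no data", as in the question


def positional_means(lists, empty=None):
    """Mean per position, ignoring SENTINEL and missing entries."""
    means = []
    # Pad short lists with SENTINEL so every column has one entry per list.
    for column in zip_longest(*lists, fillvalue=SENTINEL):
        values = [x for x in column if x != SENTINEL]
        # float() guards against integer division on Python 2.7;
        # a column with no valid values yields `empty` (assumed behaviour).
        means.append(sum(values) / float(len(values)) if values else empty)
    return means


list_of_lists = [
    [-1, 0.67, 0.23, 0.11],
    [-1],
    [0.54, 0.24, -1],
    [0.2, 0.85, 0.8, 0.1, 0.9],
]
print(positional_means(list_of_lists))
```

Note that statistics.mean raises StatisticsError when given no data, so the one-liner above fails if some position contains only -1; the explicit check here avoids that.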
Here is a vectorised numpy-based solution.
import numpy as np
a = [[-1, 0.67, 0.23, 0.11],
     [-1],
     [0.54, 0.24, -1],
     [0.2, 0.85, 0.8, 0.1, 0.9]]
# first create a non-jagged numpy array, padded with -1
b = -np.ones([len(a), max(map(len, a))])
for i, j in enumerate(a):
    b[i][0:len(j)] = j
# count -1 entries per column (for use later)
neg_count = [np.sum(b[:, i]==-1) for i in range(b.shape[1])]
# set -1 entries to 0 so they do not contribute to the column sums
b[b==-1] = 0
# calculate means; a column with no valid entries gets 0
means = [np.sum(b[:, i])/(b.shape[0]-neg_count[i])
         if (b.shape[0]-neg_count[i]) != 0 else 0
         for i in range(b.shape[1])]
# [0.37,
# 0.58666666666666667,
# 0.51500000000000001,
# 0.10500000000000001,
# 0.90000000000000002]
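The per-column list comprehensions above can probably be replaced by whole-array operations. The following is only a sketch of that variant, not the original answer's code: the names width, valid, counts and totals are illustrative, and it returns NaN rather than 0 for a column with no valid entries, which is a deliberate (assumed) difference.

```python
import numpy as np

a = [[-1, 0.67, 0.23, 0.11],
     [-1],
     [0.54, 0.24, -1],
     [0.2, 0.85, 0.8, 0.1, 0.9]]

# Pad the jagged rows with -1 so they form a rectangular array.
width = max(len(row) for row in a)
b = np.full((len(a), width), -1.0)
for i, row in enumerate(a):
    b[i, :len(row)] = row

valid = b != -1                               # True where a real value is present
counts = valid.sum(axis=0)                    # number of real values per column
totals = np.where(valid, b, 0.0).sum(axis=0)  # column sums, with -1 counted as 0

# Columns with no real values keep NaN instead of dividing by zero.
means = np.full(width, np.nan)
np.divide(totals, counts, out=means, where=counts > 0)
print(means.tolist())
```

The padding loop is still a Python loop over rows; only the counting, summing and dividing are done on whole arrays.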
You can use the pandas module for this. The code would look like this:
import numpy as np
import pandas as pd
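# pad the shorter lists with np.nan so all four have the same length
list_1 = [-1, 0.67, 0.23, 0.11, np.nan]
list_2 = [-1, np.nan, np.nan, np.nan, np.nan]
list_3 = [0.54, 0.24, -1, np.nan, np.nan]
list_4 = [0.2, 0.85, 0.8, 0.1, 0.9]
# each list becomes a column, so each row holds one position
df = pd.DataFrame({"list_1": list_1, "list_2": list_2, "list_3": list_3, "list_4": list_4})
# treat -1 as missing
df = df.replace(-1, np.nan)
# mean across each row; NaN values are skipped by default
print(list(df.mean(axis=1)))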
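The manual np.nan padding can likely be avoided, because building a DataFrame from a list of uneven rows fills the missing cells with NaN. A minimal sketch under that assumption, with each list as a row (so the mean is taken down the columns, axis=0) and positional_means_pd as a made-up helper name:

```python
import numpy as np
import pandas as pd

list_of_lists = [
    [-1, 0.67, 0.23, 0.11],
    [-1],
    [0.54, 0.24, -1],
    [0.2, 0.85, 0.8, 0.1, 0.9],
]


def positional_means_pd(lists):
    # One row per list; shorter rows are padded with NaN by the constructor.
    df = pd.DataFrame(lists)
    # Treat -1 as missing, then average each column (NaN is skipped by default).
    return df.replace(-1, np.nan).mean(axis=0).tolist()


print(positional_means_pd(list_of_lists))
```

A position whose values are all -1 would come back as NaN here.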