
How do I use enumerate in Python to compute STD of a list?

I'm trying to compute the standard deviation of a list vr. The list has 32 entries, each of which is an array of size 3980. Each array holds one value per height level (3980 heights).

First I split the data into 15-minute chunks, where the observation times are given in raytimes. raytimes is also a list of size 32 (it contains just the time of each observation in vr).

I want the standard deviation computed at each height level, so that I end up with one final array of size 3980. That part works in my code. However, my code does not produce the correct standard deviation values when I test it. That is to say, the values written to w1sd, w2sd, etc. are not correct, even though the array has the correct size (3980 elements). I assume I am mixing up the indices when computing the standard deviation.

Below are example values from the dataset. All data should fall into w1 and w1sd, as the raytimes provided in this example are all within 15 minutes (< 0.25). I want the standard deviation across the first elements of each row of vr, that is, of 2.0, 3.1 and 2.1, then across the second elements, 3.1, 4.1 and nan, and so on. The result for w1sd should be [0.497, 0.499, 1.0, 7.5], but the code below instead gives w1sd = [0.497, 0.77, 1.31, 5.301]. Is something wrong with nanstd, or with my indexing?

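from numpy import nan, nanstd

vr = [
    [2.0, 3.1, 4.1, nan],
    [3.1, 4.1, nan, 5.1],
    [2.1, nan, 6.1, 20.1]
]
Height = [10.0, 20.0, 30.0, 40]
raytimes = [0, 0.1, 0.2]

# Working lists for each 15-minute window, and the per-height results
w1, w2, w3, w4 = [], [], [], []
w1sd, w2sd, w3sd, w4sd = [], [], [], []

for j, h in enumerate(Height):
    for i, t in enumerate(raytimes):
        # Sort the value at height j into its time window
        if t < 0.25:
            w1.append(float(vr[i][j]))
        elif 0.25 <= t < 0.5:
            w2.append(float(vr[i][j]))
        elif 0.5 <= t < 0.75:
            w3.append(float(vr[i][j]))
        else:
            w4.append(float(vr[i][j]))
    # One standard deviation per window at this height
    w1sd.append(round(nanstd(w1), 3))
    w2sd.append(round(nanstd(w2), 3))
    w3sd.append(round(nanstd(w3), 3))
    w4sd.append(round(nanstd(w4), 3))
    # Reset the working lists before the next height
    w1 = []
    w2 = []
    w3 = []
    w4 = []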
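
A minimal check on the expected values, assuming the rows of vr are stacked into a 2-D NumPy array: taking np.nanstd along axis=0 gives one standard deviation per height for the selected rows. The names in_w1 and w1sd_check are illustrative only. For the sample data this comes out at about 0.497, 0.5, 1.0 and 7.5, which suggests nanstd itself behaves as expected on these values.

```python
import numpy as np

vr = np.array([
    [2.0, 3.1, 4.1, np.nan],
    [3.1, 4.1, np.nan, 5.1],
    [2.1, np.nan, 6.1, 20.1],
])
raytimes = np.array([0, 0.1, 0.2])

# Boolean mask selecting the observations in the first 15-minute window
in_w1 = raytimes < 0.25

# axis=0 computes the standard deviation down each column (one per height),
# ignoring NaNs; ddof defaults to 0 (population standard deviation)
w1sd_check = np.nanstd(vr[in_w1], axis=0)
print(np.round(w1sd_check, 3))
```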

I would consider using pandas for this. It is a library built on numpy arrays for efficient processing of datasets, and it takes all the looping and indexing out of your hands.

In this case I would define a dataframe with N_raytimes rows and N_Height columns, which makes it easy to slice and aggregate the data any way you like.

This code gives the expected output.

import pandas as pd
import numpy as np

vr = [
    [2.0, 3.1, 4.1, np.nan],
    [3.1, 4.1, np.nan, 5.1],
    [2.1, np.nan, 6.1, 20.1]
]
Height = [10.0, 20.0, 30.0, 40]
raytimes = [0, 0.1, 0.2]

# Define a dataframe with the data
df = pd.DataFrame(vr, columns=Height, index=raytimes)
df.columns.name = "Height"
df.index.name = "raytimes"

# Split it out (this could be more elegant)
w1 = df[df.index < 0.25]
w2 = df[(df.index >= 0.25) & (df.index < 0.5)]
w3 = df[(df.index >= 0.5) & (df.index < 0.75)]
w4 = df[df.index >= 0.75]

# Compute standard deviations
w1sd = w1.std(axis=0, ddof=0).values
w2sd = w2.std(axis=0, ddof=0).values
w3sd = w3.std(axis=0, ddof=0).values
w4sd = w4.std(axis=0, ddof=0).values
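
One possible way to tidy up the split, sketched here under the assumption that the windows are the same four 15-minute bins as above: label each row with a window number and let groupby compute the per-height standard deviation for every window at once. The names window and sd_by_window are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

vr = [
    [2.0, 3.1, 4.1, np.nan],
    [3.1, 4.1, np.nan, 5.1],
    [2.1, np.nan, 6.1, 20.1],
]
Height = [10.0, 20.0, 30.0, 40]
raytimes = [0, 0.1, 0.2]

df = pd.DataFrame(vr, columns=Height, index=raytimes)

# Window number per row: 0 for t < 0.25, 1 for [0.25, 0.5), 2 for [0.5, 0.75),
# and 3 for everything from 0.75 upwards (clipped to mirror the w4 catch-all)
window = np.clip((df.index.to_numpy() // 0.25).astype(int), 0, 3)

# Population standard deviation (ddof=0) per window and height; NaNs are skipped.
# reindex adds all-NaN rows for windows that contain no observations.
sd_by_window = df.groupby(window).std(ddof=0).reindex(range(4))

w1sd_grouped = sd_by_window.loc[0].to_numpy()
print(sd_by_window.round(3))
```

Row 0 of sd_by_window corresponds to w1sd, row 1 to w2sd, and so on; windows with no observations come back as NaN.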
