How to compute 15-min standard deviations of a minutely dataset?

I have one array, raytimes, which holds times as fractions of an hour, e.g. [0, 0.1, 0.2... 0.9, 1.0]. I have another list of floats, vr, which is my velocity. Each data time corresponds to a list of velocities as a function of height.

I'm trying to compute a 15-min standard deviation of velocity from this dataset, with the std computed separately at each height level (so I should end up with an array of standard deviations, one for each height).

raytimes is time [0, 0.1, 0.2... 0.9, 1.0]. vr is a list of 108 entries, each holding 2500 float64 numbers. The 2500 numbers correspond to the velocity measured at each height (on a fixed height grid). I don't know how to separate the chunks of data so that I can compute the std on just the first, second, third, and fourth 15-min intervals. Then I need to compute the std at each specific height level.

for i in raytimes:
    if raytimes[i] < 0.25:
        w1 = w1.append(vr)
    if raytimes[i] > 0.25 & raytimes < 0.5:
        w2 = w2.append(vr)
    if raytimes[i] > 0.5 & raytimes < 0.75:
        w3 = w3.append(vr)
    if raytimes[i] < 1:
        w4 = w4.append(vr)
sigma_w1 = std(w1)
sigma_w2 = std(w2)
etc...

The problem is that my code above appends the entire vr matrix. How do I append only the lists of vr that correspond to times within each 15-min chunk? And then how do I compute the std while maintaining the height grid, so that the std is computed at each height? I should end up with the same array size of 2500.

Here's the start of an answer which I can refine based on your feedback. Note that this is not really a great way to transform the data; I'm just trying to demonstrate how to move your code toward something that gives the answer you want. Here I've assumed that you want an SD grouped by height and 15-minute period, so that's 10000 results (2500 heights times 4 periods). If you actually wanted the SD over height or some other grouping, let me know in the comments. I've also assumed from what you said above that vr is a list of lists, specifically a list of length 108 whose items are lists of length 2500. If that's not right, leave a comment.

EDIT - I realised there's a fundamental mistake in how you are using the for loop, which I had inadvertently copied. You are using i as the index, but i is the actual value of the item. If you want the position of the item, you need to use enumerate. See my example below: I've made i the index and t the value from raytimes.
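To illustrate the point about enumerate, here is a minimal sketch. The short raytimes list is made up for the illustration and is not your data.

```python
raytimes = [0.0, 0.1, 0.2]  # illustrative values only

# Iterating over the list directly yields the values, not the positions
values = [t for t in raytimes]

# Using a value as an index fails, because list indices must be integers
try:
    raytimes[0.1]
except TypeError as err:
    error_message = str(err)

# enumerate yields (position, value) pairs, so both are available
pairs = [(i, t) for i, t in enumerate(raytimes)]
```

Inside the loop, use i to pick the matching row of vr and t for the time comparison.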

EDIT 2 - the approach remains the same, but I've now actually run this code, so I've corrected the various mistakes that you and I had made in the previous iteration. Can you try this with your data and confirm that the output is correct? Then we can look at how you need the output presented.

EDIT 3 - added four results lists to hold the output, as requested.

from statistics import pstdev
#remove these lines, these are just test data
raytimes=[0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1]
heights=[1,2]
vr=[[4,3],[5,3],[3,5],[4,1],[8,2],[2,3],[1,4],[9,5],[7,3],[6,7],[3,8]]
#initialise empty arrays
w1 = []
w2 = []
w3 = []
w4 = []
r1 = []
r2 = []
r3 = []
r4 = []


for j, h in enumerate(heights):
    for i, t in enumerate(raytimes):
        # Half-open bins [0, 0.25), [0.25, 0.5), [0.5, 0.75), then the rest,
        # so a boundary time such as 0.5 goes to the period it starts
        if t < 0.25:
            w1.append(vr[i][j])
        elif t < 0.5:
            w2.append(vr[i][j])
        elif t < 0.75:
            w3.append(vr[i][j])
        else:
            w4.append(vr[i][j])
    print(w1,w2,w3,w4)
    print("First Period - Height: ", str(h), " SD: ", str(pstdev(w1)))
    r1.append(pstdev(w1))
    print("Second Period - Height: ", str(h), " SD: ", str(pstdev(w2)))
    r2.append(pstdev(w2))
    print("Third Period - Height: ", str(h), " SD: ", str(pstdev(w3)))
    r3.append(pstdev(w3))
    print("Fourth Period - Height: ", str(h), " SD: ", str(pstdev(w4)))
    r4.append(pstdev(w4))
    w1 = []
    w2 = []
    w3 = []
    w4 = []
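As a possible tidy-up of the same idea, the sketch below groups whole rows of vr by period first and then takes the SD down each height column. The function name period_std and its edges parameter are made up for this illustration, and it assumes vr is a list of equal-length lists with one row per entry in raytimes.

```python
from statistics import pstdev

def period_std(times, rows, edges=(0.25, 0.5, 0.75)):
    """Return, for each time period, a list of population SDs (one per height).

    times: one time per row, as a fraction of an hour
    rows:  rows[i][j] is the velocity at time i and height j
    edges: sorted start times of every period after the first
    """
    groups = [[] for _ in range(len(edges) + 1)]
    for t, row in zip(times, rows):
        # The number of edges t has reached is the period index
        k = sum(t >= e for e in edges)
        groups[k].append(row)
    # zip(*group) transposes a period's rows into per-height columns
    return [[pstdev(col) for col in zip(*group)] for group in groups]

# Same test data as in the example above
raytimes = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
vr = [[4, 3], [5, 3], [3, 5], [4, 1], [8, 2], [2, 3],
      [1, 4], [9, 5], [7, 3], [6, 7], [3, 8]]

sigma = period_std(raytimes, vr)  # sigma[k][j]: period k, height j
```

With your data, sigma would hold four lists of 2500 values each, which plays the same role as r1 to r4. If vr is a NumPy array, taking the standard deviation along axis 0 of each period's rows should give the same numbers.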

Ok, we can do that. So your expected output is a 2500-long list of 4-long lists, right? 10000 values in total? I think the issue you are having is that you're trying to assign values outside the range of the list; you can't grow a list that way.

Edit - oops, this wasn't supposed to be an answer. I mistook it for the comment box on mobile. Never mind.
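If that is indeed the problem, a minimal sketch of it looks like this. The names results and prealloc and the three-height grid are made up for the illustration.

```python
n_heights = 3  # hypothetical small grid; the question has 2500 heights

results = []
try:
    results[0] = [0.0, 0.0, 0.0, 0.0]  # index 0 does not exist yet
except IndexError as err:
    error_message = str(err)

# Option 1: grow the list with append
for _ in range(n_heights):
    results.append([0.0] * 4)

# Option 2: preallocate the full shape, then assign by index
prealloc = [[0.0] * 4 for _ in range(n_heights)]
prealloc[2][3] = 1.5
```

The list comprehension in option 2 builds a separate inner list for each height, so assigning to one row leaves the others unchanged.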
