How to chunk up a step function array based on where the values flatten out

I am running into a problem in Python that I am having trouble figuring out (which I will blame on severe jet lag for now).

I have an array, let's call it x. A plot of x, where the y-axis is a generic value and the x-axis is the array index, looks like this:

[Figure: plot of x; the y-axis is a generic value, the x-axis is the array index]

What I want to do is isolate the flat sections after the initial bump (the next picture marks the parts I am interested in):

[Figure: the chunks I am interested in, marked with red boxes]

I want to ignore the leading flat line and the bump, and make an array of the five red boxes in the second image, so that I have something like:

x_chunk = [[box 0], [box 1], [box 2], [box 3], [box 4]]

I want to ignore all of the sloped transition lines between the red chunks. I am having trouble figuring out the proper way to iterate and the condition to set so that I get what I need.

So, this is probably not the cleanest solution, but it works:

import numpy as np
import matplotlib.pyplot as plt

# Create data
r=np.random.random(50)
y1 = np.array([50,40,30,20,10])
y=np.repeat(y1,10)
y[9]=y[9]+10
y=y+r

# Plot data
x=np.arange(len(y))
plt.plot(x,y)
plt.show()

This will give you something like this:

[Figure: example data]

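# Find the maximum (the bump) and start from there
idxStart = np.argmax(y)
y2 = y[idxStart:]

# Grab jump indices: the positions right after a drop of more than 1
idxs = np.where(np.diff(y2) < -1)[0] + 1

# Put into boxes; len(y2) is appended so the last flat section is kept too
bounds = np.append(idxs, len(y2))
boxs = []
for i in range(len(bounds) - 1):
    boxs.append(y2[bounds[i]:bounds[i+1]])

print(boxs)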

Of course, you will need to find the right threshold to distinguish the jumps/drops in your data. In my case -1 was good enough, since np.random.random returns values between 0 and 1, so neighbouring points within a flat section never differ by more than 1. Hope your jet lag gets better soon.
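As a minimal sketch of how the threshold could be made explicit, the steps above can be wrapped in a function. The name `split_on_drops` and its `drop` parameter are hypothetical and not part of the answer. The sketch assumes, as the answer does, that the bump is the global maximum and that every step goes downward by more than `drop`.

```python
import numpy as np

def split_on_drops(y, drop=1.0):
    """Split y into flat chunks after its maximum (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    # Start at the bump, assumed here to be the global maximum
    y2 = y[np.argmax(y):]
    # Positions right after a fall larger than `drop`
    idxs = np.where(np.diff(y2) < -drop)[0] + 1
    # np.split cuts at every position; the first piece holds the bump, so skip it
    return np.split(y2, idxs)[1:]

# Deterministic toy data: five levels of 10 samples, with a bump at index 9
y = np.repeat([50.0, 40.0, 30.0, 20.0, 10.0], 10)
y[9] += 10
chunks = split_on_drops(y, drop=1.0)
print([len(c) for c in chunks])
```

On real data, one way to pick `drop` is to look at `np.diff(y)` and choose a value between the noise inside the flat parts and the size of the smallest step.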

Not tested, as I have no data, but something like this should work:

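import numpy as np

def findSteps(arr, thr=.02, window=10, disc=np.std):
    # d[k] is the spread (std by default) of the window arr[k:k+window]
    d = disc(np.lib.stride_tricks.as_strided(arr, strides=arr.strides*2, shape=(arr.size-window+1, window)), axis=1)
    # True where the window behind or ahead of a point is flat
    m = np.minimum(np.abs(d[:-window]), np.abs(d[window:])) < thr
    # np.nonzero returns a tuple, so take [0]; +1 turns each change in m into a split position
    i = np.nonzero(np.diff(m))[0] + 1
    return np.split(arr[window:-window], i)[(0 if m[0] else 1)::2]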

You may have to play around with the window and threshold values, and you may want to write a slope function for disc if np.std doesn't work, but the basic idea is to look forward and backward by window steps and check whether the standard deviation (or slope) of each stride is close to 0.
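One possible shape for such a slope function is sketched below. The name `window_slope` is hypothetical; it computes the ordinary least-squares slope of each window against the sample positions 0..n-1 and accepts the same `axis` argument that `np.std` would receive, so it could be passed as `disc`. It assumes windows of at least two samples.

```python
import numpy as np

def window_slope(windows, axis=1):
    """Least-squares slope of each window (hypothetical stand-in for np.std)."""
    windows = np.asarray(windows, dtype=float)
    n = windows.shape[axis]
    # Sample positions 0..n-1, centred so that they have zero mean
    t = np.arange(n) - (n - 1) / 2.0
    # Reshape t so that it broadcasts along `axis`
    shape = [1] * windows.ndim
    shape[axis] = n
    t = t.reshape(shape)
    # With zero-mean t, slope = sum(t * y) / sum(t**2)
    return (t * windows).sum(axis=axis) / (t ** 2).sum()

flat = np.full((1, 5), 3.0)
ramp = np.array([[0.0, 2.0, 4.0, 6.0, 8.0]])
print(window_slope(flat), window_slope(ramp))
```

A flat window gives a slope of 0 and the ramp above gives 2. A slope can be negative, which is why taking its absolute value before comparing it with the threshold matters.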

You'll end up with blocks of True values, whose starts and ends you find with np.nonzero(np.diff()).
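To illustrate that step on its own, here is a small sketch that locates the blocks of True in a hand-written mask. The mask values are made up, and padding with False on both sides is an addition of this sketch so that blocks touching either edge are also detected.

```python
import numpy as np

# Toy mask: True marks "flat" samples (made-up values for illustration)
m = np.array([False, True, True, True, False, False, True, True])

# Pad with False on both sides so blocks touching the edges are found too
padded = np.concatenate(([False], m, [False])).astype(int)
change = np.diff(padded)  # +1 where a block starts, -1 just past its end

starts = np.nonzero(change == 1)[0]
ends = np.nonzero(change == -1)[0]  # exclusive end indices
print(list(zip(starts.tolist(), ends.tolist())))
```

Each pair is a half-open range, so `m[start:end]` is all True.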

You then np.split the array into a list of arrays at those block boundaries and take only every other member of the list (since the other sub-arrays will be the transitions).
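The split-and-skip step can be sketched on a tiny signal as follows. The array and the mask are made up and the mask is written by hand rather than computed. Picking the starting offset from `m[0]` is an assumption of this sketch, to cover a signal that begins inside a transition.

```python
import numpy as np

# Toy signal: flat, ramp, flat (made-up values)
arr = np.array([5.0, 5.0, 5.0, 4.0, 3.0, 2.0, 2.0, 2.0])
# Flat / not-flat mask, written by hand here instead of computed
m = np.array([True, True, True, False, False, True, True, True])

# Positions where the mask changes; +1 gives the first index of each new piece
cuts = np.nonzero(np.diff(m))[0] + 1
pieces = np.split(arr, cuts)

# Pieces alternate flat / transition; start at 0 if the signal begins flat, else at 1
flat_pieces = pieces[(0 if m[0] else 1)::2]
print([p.tolist() for p in flat_pieces])
```

Here the ramp `[4.0, 3.0]` lands in the skipped piece and the two flat sections are returned.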
