Python: find sequences in an array that start when two values fall below a threshold and end when two values rise above it

Question

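I have an annual time series of a drought index (PDSI) whose values range from -4 to +4. I am trying to define a drought event that starts with two consecutive years of PDSI below 0 and ends with two consecutive years of PDSI greater than or equal to 0.

For example, in this series: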

ts = [-2, -2, -4,  0, -1,  0, -1,  1, -2,  2, -3, -2,  3,  1, -2, 
      -3, -4, -3,  3, -3, -3, -3, -1, -3,  3,  3, -4, -1, -1,  0]

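Note: I tried to post an image to help visualize the problem, but my reputation is not high enough.

By the drought definition above, this series should contain three droughts:

1) Starts in year 0 and ends in year 11 (years 12 and 13 are >= 0)

2) Starts in year 14 and ends in year 23 (years 24 and 25 are >= 0)

3) Starts in year 26 and runs to the end of the series, year 29. Even though this drought does not end with two consecutive years >= 0, it is still ongoing and should be counted.

The return value could be an array such as: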

droughts = [[0, 11], [14, 23], [26, 29]]

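This means excluding any potential subsets that have two consecutive PDSI values < 0. For example, within the first sequence [0, 11], the pairs [1, 2] and [10, 11] also satisfy the "two consecutive values below the threshold" rule. However, since they are part of the larger sequence, they should be ignored.

Edit:

Here is some code that identifies the first two droughts but hangs on the last one (I think it loops forever). I am new to Python, and apart from not working correctly, I suspect it is also quite inefficient.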

def find_droughts (array):
    answer = []
    i = 0
    while i < len(array):
        if (array[i] < 0 and array[i+1] < 0):
            if i+1 >= len(array):
                i = len(array)
                end = i
                a.append([start, end])
                break
            else:
                start = i
                print "start = %s" %start
            for j in range(i+2, len(array)-1):
                if (array[j] >= 0 and array[j+1] >= 0):
                    end = j-1
                    a.append([start, end])
                    print 'end=%s' %end
                    i = j+2;
                    break
                else:
                    i += 1
        else:
            i += 1
    return answer

find_droughts(ts)

The output is below. I had to interrupt the kernel because it got stuck in the loop.

start = 0
end=11
start = 14
end=23
start = 26
start = 27
start = 27
start = 27
start = 27
....

Answer 1

How about something like this:

ts = [-2, -2, -4,  0, -1,  0, -1,  1, -2,  2, -3, -2,  3,  1, -2,
      -3, -4, -3,  3, -3, -3, -3, -1, -3,  3,  3, -4, -1, -1,  0]

# find positions where two consecutive values are below 0
neg = [i for i in range(len(ts)-1) if ts[i] < 0 and ts[i+1] < 0]
print(neg)

# find positions where two consecutive values are >= 0, plus the end of the series
pos = [i for i in range(len(ts)-1) if ts[i] >= 0 and ts[i+1] >= 0] + [len(ts)]
print(pos)

# pair each recovery position with the earliest remaining drought start
droughts = []
for p in pos:
    try:
        droughts.append((neg[0], p))
        neg = [n for n in neg if n > p]
    except IndexError:
        # no more drought starts left, no drought in progress
        break

print(droughts)

Output:

[0, 1, 10, 14, 15, 16, 19, 20, 21, 22, 26, 27]
[12, 24, 30]
[(0, 12), (14, 24), (26, 30)]

There are some issues and edge cases left to sort out, but in general...
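Two of those edge cases can be sketched as follows. This is only an illustration under my own assumptions, not a definitive fix: the function name droughts_from_pairs and the threshold parameter are hypothetical. It reports the last drought year (p - 1) instead of the first recovery year, and it skips any recovery pair that occurs before a drought has started.

```python
def droughts_from_pairs(ts, threshold=0):
    """Return [start, end] pairs, where end is the last year of the drought."""
    n = len(ts)
    # indices where two consecutive values are below the threshold
    neg = [i for i in range(n - 1) if ts[i] < threshold and ts[i + 1] < threshold]
    # indices where two consecutive values are at or above the threshold
    pos = [i for i in range(n - 1) if ts[i] >= threshold and ts[i + 1] >= threshold]
    pos.append(n)  # sentinel: the end of the series closes an ongoing drought

    droughts = []
    for p in pos:
        if not neg:
            break  # no drought starts left
        if p < neg[0]:
            continue  # recovery pair seen while no drought is in progress
        droughts.append([neg[0], p - 1])
        # discard drought starts that belong to the drought just closed
        neg = [s for s in neg if s > p]
    return droughts


ts = [-2, -2, -4,  0, -1,  0, -1,  1, -2,  2, -3, -2,  3,  1, -2,
      -3, -4, -3,  3, -3, -3, -3, -1, -3,  3,  3, -4, -1, -1,  0]
print(droughts_from_pairs(ts))
```

For the series from the question this prints [[0, 11], [14, 23], [26, 29]], which matches the result the question asks for.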

Here is another approach, which needs only a single pass over ts:

ts = [-2, -2, -4,  0, -1,  0, -1,  1, -2,  2, -3, -2,  3,  1, -2,
      -3, -4, -3,  3, -3, -3, -3, -1, -3,  3,  3, -4, -1, -1,  0]

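in_drought = False
drought = []

for i in range(len(ts)-1):
    if in_drought and ts[i] >= 0 and ts[i+1] >= 0:
        # two consecutive values >= 0 end the current drought
        in_drought = False
        drought.append(i)
    elif not in_drought and ts[i] < 0 and ts[i+1] < 0:
        # two consecutive values below 0 start a new drought
        in_drought = True
        drought.append(i)
if in_drought:
    # a drought still in progress runs to the end of the series
    drought.append(len(ts)-1)

print([drought[i:i+2] for i in range(0, len(drought), 2)])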

Output:

[[0, 12], [14, 24], [26, 29]]
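
The single-pass idea can also be wrapped in a function that returns the inclusive end years the question asked for. The following is a rough sketch, assuming a threshold parameter that is not in the original post. It also avoids the hang in the question's code, which appears to come from the inner for loop over range(i+2, len(array)-1) being empty near the end of the series, so i never advances.

```python
def find_droughts(ts, threshold=0):
    """Return [start, end] index pairs, both inclusive, for each drought."""
    droughts = []
    start = None  # index where the current drought began, or None if none is active
    for i in range(len(ts) - 1):
        pair_below = ts[i] < threshold and ts[i + 1] < threshold
        pair_at_or_above = ts[i] >= threshold and ts[i + 1] >= threshold
        if start is None and pair_below:
            start = i
        elif start is not None and pair_at_or_above:
            # the drought's last year is the one before the recovery pair
            droughts.append([start, i - 1])
            start = None
    if start is not None:
        # a drought still in progress is counted up to the last year
        droughts.append([start, len(ts) - 1])
    return droughts


ts = [-2, -2, -4,  0, -1,  0, -1,  1, -2,  2, -3, -2,  3,  1, -2,
      -3, -4, -3,  3, -3, -3, -3, -1, -3,  3,  3, -4, -1, -1,  0]
print(find_droughts(ts))
```

With the series from the question this prints [[0, 11], [14, 23], [26, 29]]. The loop always advances i by one step, so it terminates for any input length, including an empty list.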

Solution 1
0 Accepted 2015-09-09 19:24:40
