简体   繁体   中英

calculate histogram peaks in python

In Python, how do I calcuate the peaks of a histogram?

I tried this:

import numpy as np
from scipy.signal import argrelextrema

data = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 1, 2, 3, 4,

        5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9,

        12,

        15, 16, 17, 18, 19, 15, 16, 17, 18, 

        19, 20, 21, 22, 23, 24,]

h = np.histogram(data, bins=[0, 5, 10, 15, 20, 25])
hData = h[0]
peaks = argrelextrema(hData, np.greater)

But the result was:

(array([3]),)

I'd expect it to find the peaks in bin 0 and bin 3.

Note that the peaks span more than 1 bin. I don't want it to consider the peaks that span more than 1 column as additional peak.

I'm open to another way to get the peaks.

Note:

>>> h[0]
array([19, 15,  1, 10,  5])
>>> 

In computational topology, the formalism of persistent homology provides a definition of "peak" that seems to address your need. In the 1-dimensional case the peaks are illustrated by the blue bars in the following figure:

最持久的峰值

A description of the algorithm is given in this Stack Overflow answer of a peak detection question .

The nice thing is that this method not only identifies the peaks but it quantifies the "significance" in a natural way.

A simple and efficient implementation (as fast as sorting numbers) and the source material to the above answer given in this blog article: https://www.sthu.org/blog/13-perstopology-peakdetection/index.html

I wrote an easy function:

def find_peaks(a):
  x = np.array(a)
  max = np.max(x)
  lenght = len(a)
  ret = []
  for i in range(lenght):
      ispeak = True
      if i-1 > 0:
          ispeak &= (x[i] > 1.8 * x[i-1])
      if i+1 < lenght:
          ispeak &= (x[i] > 1.8 * x[i+1])

      ispeak &= (x[i] > 0.05 * max)
      if ispeak:
          ret.append(i)
  return ret

I defined a peak as a value bigger than 180% that of the neighbors and bigger than 5% of the max value. Of course you can adapt the values as you prefer in order to find the best set up for your problem.

Try the findpeaks library.

pip install findpeaks

# Your input data:
data = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 12, 15, 16, 17, 18, 19, 15, 16, 17, 18,  19, 20, 21, 22, 23, 24,]

# import library
from findpeaks import findpeaks

# Find some peaks using the smoothing parameter.
fp = findpeaks(lookahead=1, interpolate=10)
# fit
results = fp.fit(data)
# Make plot
fp.plot()

输入数据

# Results with respect to original input data.
results['df']

# Results based on interpolated smoothed data.
results['df_interp']

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM