
Python Data Frame: how to find the local maxima in a 2D array

I have a data frame with two columns, x and y. I want to find the local maxima in the x-y plot, as shown in Figure 1 of the attached plot. My approach: I converted each column of the data frame into a separate array. Step 1: my code identifies the index positions of the local maxima in y. Step 2: the values of x at those index positions are looked up. As a result, I could find only two local maxima, but there are three. My method fails to identify the third. My question: is there a way to identify the local maxima directly from the 2D array?

My present code:

import numpy as np
from scipy.signal import argrelextrema

x = my_dataframe.iloc[:, 0].values  # convert the data frame column into an array
y = my_dataframe.iloc[:, 2].values  # convert the data frame column into an array

# Step 1: index positions of the local maxima in y
local_y_index = argrelextrema(y, np.greater)
print("Index position of local maximum in y = ", local_y_index[0])

# Step 2: values of x at those index positions
local_x = x[local_y_index[0]]
print("value of x corresponding to local maximum in y = ", local_x)

The output is:

Index position of local maximum in y =  [105 197]
value of x corresponding to local maximum in y =  [149.21 281.06]

My question: As shown in Figure 1, the approach above identified only two local peaks, but there are three. Is there a better approach to identify the local maxima directly from the 2D array of x and y?

[Figure 1: plot of y against x; image not included]

x = [1.0330e-01, 1.0380e-01, 1.0430e-01, 1.0680e-01, 1.1932e-01, 1.8192e-01,
 3.6365e-01, 5.4539e-01, 7.9191e-01, 1.0384e+00, 1.3626e+00, 1.6869e+00,
 1.7438e+00, 2.0286e+00, 2.4825e+00, 2.9363e+00, 3.4787e+00, 4.0212e+00,
 4.7129e+00, 5.2137e+00, 6.0460e+00, 6.9486e+00, 7.8511e+00, 8.6835e+00,
 1.0092e+01, 1.0418e+01, 1.2153e+01, 1.3888e+01, 1.5623e+01, 1.7358e+01,
 1.9093e+01, 2.0828e+01, 2.2563e+01, 2.4298e+01, 2.6033e+01, 2.7768e+01,
 2.9503e+01, 3.1237e+01, 3.2972e+01, 3.4707e+01, 3.6442e+01, 3.8177e+01,
 3.9912e+01, 4.1647e+01, 4.3382e+01, 4.5117e+01, 4.6852e+01, 4.8587e+01,
 5.0322e+01, 5.2056e+01, 5.3791e+01, 5.5526e+01, 5.7261e+01, 5.8996e+01,
 6.0731e+01, 6.2466e+01, 6.4201e+01, 6.5936e+01, 6.7671e+01, 6.9406e+01,
 7.1141e+01, 7.2875e+01, 7.4610e+01, 7.6345e+01, 7.8080e+01, 7.9815e+01,
 8.1550e+01, 8.3285e+01, 8.5020e+01, 8.6755e+01, 8.8490e+01, 9.0225e+01,
 9.1960e+01, 9.3694e+01, 9.5429e+01, 9.7164e+01, 9.8899e+01, 1.0063e+02,
 1.0237e+02, 1.0410e+02, 1.0584e+02, 1.0757e+02, 1.0931e+02, 1.1104e+02,
 1.1278e+02, 1.1451e+02, 1.1625e+02, 1.1798e+02, 1.1972e+02, 1.2145e+02,
 1.2319e+02, 1.2492e+02, 1.2666e+02, 1.2839e+02, 1.3013e+02, 1.3186e+02,
 1.3360e+02, 1.3533e+02, 1.3707e+02, 1.3880e+02, 1.4054e+02, 1.4227e+02,
 1.4401e+02, 1.4574e+02, 1.4748e+02, 1.4921e+02, 1.5095e+02, 1.5268e+02,
 1.5442e+02, 1.5615e+02, 1.5684e+02, 1.5753e+02, 1.5789e+02, 1.5861e+02,
 1.5934e+02, 1.5962e+02, 1.6056e+02, 1.6136e+02, 1.6256e+02, 1.6309e+02,
 1.6482e+02, 1.6656e+02, 1.6829e+02, 1.7003e+02, 1.7176e+02, 1.7350e+02,
 1.7523e+02, 1.7697e+02, 1.7870e+02, 1.8044e+02, 1.8217e+02, 1.8391e+02,
 1.8564e+02, 1.8738e+02, 1.8911e+02, 1.9085e+02, 1.9258e+02, 1.9432e+02,
 1.9605e+02, 1.9779e+02, 1.9952e+02, 2.0126e+02, 2.0299e+02, 2.0473e+02,
 2.0646e+02, 2.0820e+02, 2.0993e+02, 2.1167e+02, 2.1340e+02, 2.1514e+02,
 2.1687e+02, 2.1861e+02, 2.1927e+02, 2.1993e+02, 2.2034e+02, 2.2103e+02,
 2.2172e+02, 2.2208e+02, 2.2296e+02, 2.2381e+02, 2.2493e+02, 2.2555e+02,
 2.2700e+02, 2.2728e+02, 2.2871e+02, 2.2902e+02, 2.3057e+02, 2.3075e+02,
 2.3164e+02, 2.3249e+02, 2.3422e+02, 2.3596e+02, 2.3769e+02, 2.3943e+02,
 2.4116e+02, 2.4290e+02, 2.4463e+02, 2.4637e+02, 2.4810e+02, 2.4984e+02,
 2.5157e+02, 2.5331e+02, 2.5504e+02, 2.5678e+02, 2.5851e+02, 2.6025e+02,
 2.6198e+02, 2.6371e+02, 2.6545e+02, 2.6718e+02, 2.6892e+02, 2.7065e+02,
 2.7239e+02, 2.7412e+02, 2.7586e+02, 2.7759e+02, 2.7933e+02, 2.8106e+02,
 2.8280e+02, 2.8453e+02, 2.8627e+02, 2.8800e+02, 2.8974e+02, 2.9147e+02,
 2.9321e+02, 2.9494e+02, 2.9668e+02, 2.9841e+02, 3.0015e+02, 3.0188e+02,
 3.0362e+02, 3.0535e+02, 3.0709e+02, 3.0882e+02, 3.1056e+02, 3.1229e+02,
 3.1403e+02, 3.1576e+02, 3.1749e+02, 3.1923e+02, 3.2096e+02, 3.2270e+02,
 3.2443e+02, 3.2617e+02, 3.2790e+02, 3.2964e+02, 3.3137e+02, 3.3311e+02,
 3.3484e+02, 3.3658e+02, 3.4686e+02, 3.4686e+02, 3.4686e+02, 3.4686e+02,
 3.4686e+02, 3.4686e+02, 3.4686e+02, 3.4686e+02, 3.4687e+02]

y = [4.2014e-01, 4.2237e-01, 4.2460e-01, 4.3574e-01, 4.9146e-01, 7.7004e-01,
     1.5788e+00, 2.3874e+00, 3.4842e+00, 4.5808e+00, 6.0228e+00, 7.4647e+00,
     7.7180e+00, 8.9843e+00, 1.1002e+01, 1.3020e+01, 1.5431e+01, 1.7842e+01,
     2.0916e+01, 2.3141e+01, 2.6839e+01, 3.0848e+01, 3.4856e+01, 3.8552e+01,
     4.4807e+01, 4.6254e+01, 5.3953e+01, 6.1650e+01, 6.9344e+01, 7.7035e+01,
     8.4723e+01, 9.2409e+01, 1.0009e+02, 1.0777e+02, 1.1545e+02, 1.2312e+02,
     1.3079e+02, 1.3846e+02, 1.4613e+02, 1.5379e+02, 1.6145e+02, 1.6911e+02,
     1.7677e+02, 1.8442e+02, 1.9207e+02, 1.9971e+02, 2.0735e+02, 2.1499e+02,
     2.2263e+02, 2.3027e+02, 2.3790e+02, 2.4552e+02, 2.5315e+02, 2.6077e+02,
     2.6839e+02, 2.7600e+02, 2.8361e+02, 2.9122e+02, 2.9882e+02, 3.0642e+02,
     3.1401e+02, 3.2160e+02, 3.2918e+02, 3.3676e+02, 3.4433e+02, 3.5190e+02,
     3.5946e+02, 3.6701e+02, 3.7455e+02, 3.8209e+02, 3.8961e+02, 3.9712e+02,
     4.0462e+02, 4.1211e+02, 4.1958e+02, 4.2703e+02, 4.3447e+02, 4.4188e+02,
     4.4926e+02, 4.5661e+02, 4.6393e+02, 4.7122e+02, 4.7846e+02, 4.8565e+02,
     4.9278e+02, 4.9985e+02, 5.0685e+02, 5.1376e+02, 5.2057e+02, 5.2728e+02,
     5.3386e+02, 5.4029e+02, 5.4656e+02, 5.5265e+02, 5.5852e+02, 5.6415e+02,
     5.6950e+02, 5.7453e+02, 5.7920e+02, 5.8347e+02, 5.8727e+02, 5.9056e+02,
     5.9325e+02, 5.9527e+02, 5.9654e+02, 5.9697e+02, 5.9646e+02, 5.9490e+02,
     5.9217e+02, 5.9175e+02, 5.9419e+02, 5.9665e+02, 5.9790e+02, 6.0049e+02,
     6.0309e+02, 6.0410e+02, 6.0748e+02, 6.1034e+02, 6.1467e+02, 6.1658e+02,
     6.2282e+02, 6.2905e+02, 6.3528e+02, 6.4151e+02, 6.4772e+02, 6.5393e+02,
     6.6013e+02, 6.6632e+02, 6.7251e+02, 6.7868e+02, 6.8484e+02, 6.9099e+02,
     6.9712e+02, 7.0323e+02, 7.0931e+02, 7.1536e+02, 7.2137e+02, 7.2732e+02,
     7.3320e+02, 7.3899e+02, 7.4464e+02, 7.5013e+02, 7.5540e+02, 7.6039e+02,
     7.6502e+02, 7.6922e+02, 7.7287e+02, 7.7589e+02, 7.7817e+02, 7.7962e+02,
     7.8014e+02, 7.8039e+02, 7.8250e+02, 7.8464e+02, 7.8598e+02, 7.8823e+02,
     7.9050e+02, 7.9166e+02, 7.9458e+02, 7.9739e+02, 8.0109e+02, 8.0313e+02,
     8.0793e+02, 8.0888e+02, 8.1359e+02, 8.1462e+02, 8.1978e+02, 8.2036e+02,
     8.2330e+02, 8.2610e+02, 8.3183e+02, 8.3755e+02, 8.4326e+02, 8.4897e+02,
     8.5466e+02, 8.6035e+02, 8.6602e+02, 8.7168e+02, 8.7732e+02, 8.8295e+02,
     8.8855e+02, 8.9412e+02, 8.9965e+02, 9.0513e+02, 9.1055e+02, 9.1588e+02,
     9.2110e+02, 9.2618e+02, 9.3108e+02, 9.3576e+02, 9.4015e+02, 9.4420e+02,
     9.4784e+02, 9.5100e+02, 9.5362e+02, 9.5563e+02, 9.5698e+02, 9.5761e+02,
     9.5746e+02, 9.5650e+02, 9.5468e+02, 9.5195e+02, 9.4828e+02, 9.4363e+02,
     9.3796e+02, 9.3122e+02, 9.2337e+02, 9.1437e+02, 9.0418e+02, 8.9275e+02,
     8.8004e+02, 8.6600e+02, 8.5059e+02, 8.3376e+02, 8.1546e+02, 7.9566e+02,
     7.7430e+02, 7.5134e+02, 7.2674e+02, 7.0046e+02, 6.7244e+02, 6.4266e+02,
     6.1108e+02, 5.7765e+02, 5.4234e+02, 5.0512e+02, 4.6596e+02, 4.2483e+02,
     3.8170e+02, 3.3654e+02, 6.8800e-05, 5.1500e-05, 4.8000e-05, 4.7300e-05,
     4.7200e-05, 4.7200e-05, 4.7200e-05, 4.7200e-05, 1.5520e-04]

At any extremum of a smooth function, the derivative is zero. As we do not have an analytic expression for the data, the next best thing is to approximate the derivative. This is essentially the same as taking the 1-step difference and looking for values that are 'small'.

The following works well for me,

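import numpy as np
import pandas as pd


def find_extrema(frame, tolerance=0.5):
    # 1-step difference y[i] - y[i-1], a rough stand-in for the derivative
    diff = frame.diff()

    # keep only the differences that are close to zero; the rest become NaN
    extrema = diff[np.abs(diff) < tolerance]

    # drop the rows that were masked out
    return extrema[~np.isnan(extrema.y)]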


df = pd.DataFrame(dict(y=y), index=x)

candidates = find_extrema(df)

print(candidates)

And I find,

                      y
0.10380    2.230000e-03
0.10430    2.230000e-03
0.10680    1.114000e-02
0.11932    5.572000e-02
0.18192    2.785800e-01
1.74380    2.533000e-01
149.21000  4.300000e-01
156.15000 -4.200000e-01
218.61000  2.500000e-01
282.80000 -1.500000e-01
346.86000 -1.730000e-05
346.86000 -3.500000e-06
346.86000 -7.000000e-07
346.86000 -1.000000e-07
346.86000  0.000000e+00
346.86000  0.000000e+00
346.86000  0.000000e+00
346.87000  1.080000e-04

This will require some cleaning still (mostly on the edges), but the general idea should hopefully be clear to you.
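One possible way to do that cleaning, as a rough sketch rather than a definitive recipe: instead of a tolerance, keep only the rows where the 1-step difference switches from positive to negative, which drops the flat regions at the edges. The helper name `local_maxima_from_diff` and the small `demo` frame are hypothetical and only serve as illustration.

```python
import pandas as pd


def local_maxima_from_diff(frame, column="y"):
    """Return the rows where the 1-step difference goes from > 0 to < 0."""
    diff = frame[column].diff()   # y[i] - y[i-1]
    next_diff = diff.shift(-1)    # y[i+1] - y[i]
    # comparisons with NaN are False, so the first and last rows are never kept
    is_max = (diff > 0) & (next_diff < 0)
    return frame[is_max]


# Hypothetical sample data with two strict local maxima
demo = pd.DataFrame({"y": [0.0, 1.0, 3.0, 2.0, 2.5, 4.0, 1.0]},
                    index=[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
peaks = local_maxima_from_diff(demo)
print(peaks)
```

With the frame `df` built earlier, `local_maxima_from_diff(df)` would be called the same way. Plateaus (consecutive equal values) are not treated as maxima by this sketch.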

The following plot was made with,

import matplotlib.pyplot as plt

tolerance = 0.75

diff = df.diff()

ax = diff[np.abs(diff) < tolerance].y.plot(
     title="Derivative approximation for tolerance = {0}".format(tolerance))

ax.set_xlabel("x")
ax.set_ylabel("y[x] - y[x - 1]")

plt.show()

(notice the larger tolerance, so we can actually observe some lines rather than just points)

[Figure: extrema; image not included]

You can also use the np.gradient function and look where the gradient changes sign:

import numpy as np
import matplotlib.pyplot as plt

z = np.gradient(y, x)
i = 0
while i < len(x) - 2:
    # the gradient changes sign across i+1 (an optimum) and the slope
    # before it is positive, so the optimum is a maximum
    if z[i] * z[i + 2] <= 0 and z[i] > 0:
        print(i + 1, x[i + 1], y[i + 1])
        i = i + 1
    i += 1

plt.ylim(-1, 1)
plt.plot(x, z)

Looking at the plot, it seems the point at around 210 is not a maximum (the gradient does not reach zero). You can check this by replacing the if statement with the following: if (y[i+1]>y[i] and y[i+1]>y[i+2]):
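To illustrate why a shoulder in the curve is not reported as a maximum, here is a minimal sketch of that neighbour comparison on made-up data. The function name `strict_local_maxima` and the arrays `x_demo` and `y_demo` are assumptions for the example, not part of the original data.

```python
import numpy as np


def strict_local_maxima(values):
    """Indices i where values[i] is strictly greater than both neighbours."""
    v = np.asarray(values, dtype=float)
    interior = (v[1:-1] > v[:-2]) & (v[1:-1] > v[2:])
    return np.flatnonzero(interior) + 1  # +1 maps back to positions in v


# Hypothetical data: the slope flattens around index 3 but never turns negative
y_demo = [0.0, 1.0, 2.0, 2.1, 2.3, 3.0, 4.0, 3.0, 2.0]
x_demo = np.arange(len(y_demo), dtype=float)

slope = np.gradient(y_demo, x_demo)
peaks = strict_local_maxima(y_demo)

print(peaks)                 # only the strict maximum is reported
print(slope[2:5].min() > 0)  # the gradient stays positive across the shoulder
```

In this toy series the flattened region near index 3 looks like a bump in a plot, but every value there is still larger than the one before it, so neither the gradient test nor the neighbour comparison flags it.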

Here comes my naive approach:

Step 1: build a list of slopes, which holds +1 if two consecutive y-values are increasing, -1 if they are decreasing, and 0 if they are equal:

import numpy as np
slope = [np.sign(y[i]-y[i-1]) for i in range(1, len(y))]

Now if you print slope, every entry is 0, 1 or -1, which is the sign of the slope between each pair of consecutive y points.

Step 2: To find the minima and maxima, I wrote this code, which checks whether the slope changes sign. If it changes from 1 to -1, the index is saved as a maximum; otherwise it is saved as a minimum.

x_prev = slope[0]
optima_dic={'minima':[], 'maxima':[]}
for i in range(1, len(slope)):
    if slope[i]*x_prev==-1: #slope changed
        if x_prev==1: # slope changed from 1 to -1
            optima_dic['maxima'].append(i)
        else: # slope changed from -1 to 1
            optima_dic['minima'].append(i)
        x_prev=-x_prev

and if you print the results:

print(optima_dic)

Output:

{'minima': [109, 237], 'maxima': [105, 197]}

Quick and dirty :)
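The same sign-change idea can also be written without an explicit loop. The following is only a sketch under the assumption that the data contains no plateaus (runs of equal consecutive values), which the loop above tolerates but this version does not. The name `sign_change_optima` and the `demo` list are hypothetical.

```python
import numpy as np


def sign_change_optima(values):
    """Locate strict minima and maxima from sign changes of the 1-step slope."""
    v = np.asarray(values, dtype=float)
    slope_sign = np.sign(np.diff(v))   # +1 rising, -1 falling, 0 flat
    change = np.diff(slope_sign)       # -2: rising to falling, +2: falling to rising
    maxima = np.flatnonzero(change == -2) + 1  # +1 maps back to positions in v
    minima = np.flatnonzero(change == 2) + 1
    return {"minima": minima.tolist(), "maxima": maxima.tolist()}


# Hypothetical sample data
demo = [0, 2, 1, 3, 5, 4, 6]
result = sign_change_optima(demo)
print(result)
```

The returned indices refer to positions in the input series, so `x[i]` and `y[i]` can be read off directly for each reported index `i`.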
