Step function analysis with python

Question

I have been bouncing some ideas around about this problem, but thought I would consult with the online community to see if someone has a better option.

So I have step-like function graphs like this:

And with this I want to compute the y-displacement between steps.

As one can see the steps are not perfectly horizontal, but rather take a small range of y-values before stepping up.

So the question is:

(1) What is the 'proper' way (if there is one) to take the average y-value of each 'level'? I am not sure which points I should use as the left-most and right-most points of each level. With those endpoints I could average the values between them to get the 'average' of each level. Note that the levels do not all span the same range in x. The ultimate goal is the y-displacement between levels; once I have the average value of each level, taking the difference is trivial.

I was thinking of taking the derivative of the curve and using the points where it equals zero as the left-most and right-most points of each level, but I'm not sure this will work, because each level also contains interior points where dy/dx = 0. I could use some insight.

Thank you :) Oh - and this must be done in Python -- and it is not just these graphs, but a lot of them with a similar style, so the code must be generic enough to handle other step-like graphs.

Data file for graph 1: http://textuploader.com/5nwsh

Data file for graph 2: http://textuploader.com/5nwsv

Data file for graph 3: http://textuploader.com/5nwsj

Scatter Plot Python code:

import numpy as np
import matplotlib.pyplot as plt

# Two-column text file: x in column 0, y in column 1
data = np.loadtxt('data-file')
x = data[:, 0]
y = data[:, 1]

plt.plot(x, y, 'r')
plt.xlabel('x')
plt.ylabel('y')
plt.show()

Answer 1

Let's assume your data are AFM measurements of a crystal surface (all units in m) and you want to get the step height of the crystal. The following would get you towards that.

from __future__ import division
from ipywidgets import interact
import numpy as np
import matplotlib.pyplot as p
%matplotlib inline

def rotatedata(x, y, a):
    # Rotate the points (x, y) by angle a (radians) about the origin.
    # Both outputs are computed from the original x and y.
    cosa = np.cos(a)
    sina = np.sin(a)
    return x*cosa - y*sina, x*sina + y*cosa

data = np.loadtxt('plot3.txt')
data = data.T
x, y = data[0], data[1]

def workit(a2):
    fig = p.figure(num=None, figsize=(18, 16), dpi=80, facecolor='w', edgecolor='k')

    # plot 1: raw data
    p.subplot(511)
    p.plot(x, y)

    # what is the overall slope?
    m, b = np.polyfit(x, y, 1)

    # plot 2: rotate data to have a flat surface,
    # interesting for surface roughness
    x1, y1 = rotatedata(x, y, -np.arctan(m))
    p.subplot(512)
    p.plot(x1, y1)

    # plot 3: rotate data by the slider angle to 'sharpen' the histogram
    x2, y2 = rotatedata(x, y, a2)
    p.subplot(513)
    p.plot(x2, y2)

    # plot 4: histogram of the rotated heights
    p.subplot(514)
    p.hist(y2, bins=130)

    # plot 5: first difference of the rotated heights
    y3 = np.diff(y2)
    p.subplot(515)
    p.plot(y3)

    p.show()

# a (min, max, step) tuple makes interact build a float slider
interact(workit, a2=(-0.002, 0.002, 0.00001))

The first plot is your raw data. In the 2nd plot I removed the slope of the data to show data that you would use if you cared about calculating the surface roughness.

The third plot shows the same data rotated so that all terraces (the flat regions between steps) are horizontal.

How this is done (interactive slider) is shown in the 4th plot, which is a histogram of the rotated data. You simply move the slider (rotate the data) until the histogram has maximum sharpness (all maxima are of minimum width). I did this by hand with the slider rather than with a function, but a simple autofocus routine that maximizes the sum of the absolute differences between neighboring histogram values would be OK to use.
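A minimal sketch of such an autofocus routine, run on a synthetic staircase instead of the measured data. The function names (rotate, sharpness, best_angle), the bin count, the noise level and the scanned angle range are illustrative assumptions, not part of the original answer; real data will likely need different values.

```python
import numpy as np

def rotate(x, y, angle):
    """Rotate the points (x, y) by `angle` radians about the origin."""
    c, s = np.cos(angle), np.sin(angle)
    return x * c - y * s, x * s + y * c

def sharpness(y, bins=100):
    """Sum of absolute differences between neighbouring histogram counts."""
    counts, _ = np.histogram(y, bins=bins)
    return np.abs(np.diff(counts)).sum()

def best_angle(x, y, angles, bins=100):
    """Return the angle in `angles` whose rotated heights give the sharpest histogram."""
    scores = np.array([sharpness(rotate(x, y, a)[1], bins) for a in angles])
    return angles[np.argmax(scores)], scores

# Synthetic test data (assumption): five flat terraces of height 1 with a
# little noise, then tilted by a known angle.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 100.0, 2000, endpoint=False)
y_flat = np.floor(x / 20.0) + rng.normal(0.0, 0.005, x.size)
true_tilt = 0.01
x_t, y_t = rotate(x, y_flat, true_tilt)

# Brute-force scan; the best angle should roughly undo the tilt.
angles = np.linspace(-0.03, 0.03, 601)
found, scores = best_angle(x_t, y_t, angles)
print("tilt applied:", true_tilt, "angle found:", found)
```

The score depends on how the histogram bins happen to line up with the levels, so the maximum is only approximate; a finer scan around the best coarse angle, or averaging over a few bin counts, may help on noisy data.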

In the last plot I show the 1st derivative (redundant now, but average slope of the horizontal regions should center around zero as a double-check).

The thickness of the crystal layers (the step height you were asking for) is now given by the distances between the maxima in the histogram.

I'll leave the actual determination of each step as an exercise (e.g. threshold the histogram, then calculate the center of gravity for each peak, then take the differences between neighboring peak centers of gravity).
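One way that exercise could be sketched, assuming already leveled heights. The helper name step_levels, the bin count and the threshold are hypothetical choices, and the data here are synthetic. Groups of neighbouring non-empty bins are found with scipy.ndimage.label, and each group's center of gravity is a count-weighted average of its bin centers.

```python
import numpy as np
from scipy import ndimage

def step_levels(y, bins=100, min_count=5):
    """Estimate plateau heights from a histogram of leveled y-values."""
    counts, edges = np.histogram(y, bins=bins)
    # Threshold: ignore sparsely populated bins between the peaks.
    counts = np.where(counts < min_count, 0, counts)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Label each run of consecutive non-empty bins as one peak.
    labels, n_peaks = ndimage.label(counts > 0)
    # Center of gravity of each peak: count-weighted mean of its bin centers.
    return np.array([
        np.average(centers[labels == k], weights=counts[labels == k])
        for k in range(1, n_peaks + 1)
    ])

# Synthetic leveled data (assumption): four plateaus one unit apart.
rng = np.random.default_rng(1)
y = np.concatenate([lvl + rng.normal(0.0, 0.02, 1000) for lvl in (0.0, 1.0, 2.0, 3.0)])

levels = step_levels(y)
steps = np.diff(levels)
print("levels:", levels)
print("steps: ", steps)
```

If the bins are too narrow, a single plateau can break into several groups, and if they are too wide, neighbouring plateaus can merge, so the bin count and threshold need to be checked against the histogram plot.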

Answer 2

Here is the second step, with automatic leveling based on the histogram. I define a focus measure (computed in rotateAndCheck) that tells me how sharp the histogram is, then I use it to find the angle that gives the sharpest histogram. I then pick that angle, threshold the histogram, and find the center of gravity of each histogram peak. The differences between those locations are the steps. If the numbers really are in meters, the steps are about 4 Angstrom.

It turns out that the same idea can be used in 2D to level AFM or white-light interferometer data and to determine step heights very precisely, not just of crystals but also of nanoscale coatings or etch depths.
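As a rough illustration of the 2D idea, here is a sketch that assumes a small tilt, so that leveling can be approximated by subtracting a plane a*x + b*y instead of rotating. The function names, the synthetic height map and the slope grid are all assumptions made for this example; the focus measure is the same root-sum-of-squared-differences used in the script for this answer.

```python
import numpy as np

def sharpness(values, bins=100):
    """Root of the summed squared differences between neighbouring histogram counts."""
    counts, _ = np.histogram(values, bins=bins)
    return np.sqrt(np.sum(np.diff(counts) ** 2))

def level_plane(z, x, y, slopes):
    """Grid-search the plane slopes (a, b) that give the sharpest height histogram."""
    best_score, best_a, best_b = -np.inf, 0.0, 0.0
    for a in slopes:
        for b in slopes:
            score = sharpness((z - a * x - b * y).ravel())
            if score > best_score:
                best_score, best_a, best_b = score, a, b
    return best_a, best_b

# Synthetic height map (assumption): four terraces along x, tilted by a known plane.
rng = np.random.default_rng(2)
axis = np.linspace(0.0, 40.0, 100, endpoint=False)
x, y = np.meshgrid(axis, axis)
terraces = np.floor(x / 10.0) + rng.normal(0.0, 0.003, x.shape)
z = terraces + 0.01 * x - 0.006 * y

slopes = np.linspace(-0.02, 0.02, 21)
a_best, b_best = level_plane(z, x, y, slopes)
print("slopes found:", a_best, b_best)
```

A brute-force grid like this scales poorly with resolution; a coarse scan followed by a finer one around the best pair would be a natural refinement. The script that follows is the 1D version described in this answer.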

from __future__ import division, print_function
import numpy as np
import matplotlib.pyplot as p
from scipy.ndimage import center_of_mass
%matplotlib inline

def rotatedata(x, y, a):
    # Rotate the points (x, y) by angle a (radians) about the origin.
    # Both outputs are computed from the original x and y.
    cosa = np.cos(a)
    sina = np.sin(a)
    return x*cosa - y*sina, x*sina + y*cosa


data = np.loadtxt('plot3.txt')
data = data.T
x, y = data[0], data[1]

def rotateAndCheck(a2):
    # Sharpness ("focus") of the histogram of the rotated heights.
    x2, y2 = rotatedata(x, y, a2)
    vals, edges = np.histogram(y2, bins=230)
    focus = np.sqrt(np.sum((np.diff(vals))**2))
    return focus

# scan a range of angles and record the focus value for each
amin, amax, astep = -0.01, 0.01, 0.0001
angles = np.arange(amin, amax, astep)
focus = [rotateAndCheck(a) for a in angles]


fig = p.figure(num=None, figsize=(18, 16), dpi=80, facecolor='w', edgecolor='k')


p.subplot(311)
p.plot(focus, '.-')
angle = angles[np.argmax(focus)]   # angle with the sharpest histogram


p.subplot(312)
x2, y2 = rotatedata(x, y, angle)
vals, edges, _ = p.hist(y2, bins=230)


# now threshold: drop sparsely populated bins
p.subplot(313)
vals[vals < 3] = 0
deltaedge = edges[1] - edges[0]
p.bar(edges[:-1], vals, width=deltaedge, align='edge')
p.show()

# now go through the histogram from left to right, identify each group
# of non-empty bins and compute the center of gravity for each group
# this could get trickier if the bin size is not well chosen.

levels = []
istart = 0   # covers a group that starts in the very first bin

for i in range(1, len(vals) - 1):
    if vals[i-1] == 0 and vals[i] > 0:
        istart = i
    if vals[i] > 0 and vals[i+1] == 0:
        istop = i
        # fractional bin index of the center of gravity within the group
        c = center_of_mass(vals[istart:istop+1])[0]
        # the +0.5 converts from the left bin edge to the bin center
        level = edges[istart] + (c + 0.5)*deltaedge
        levels.append(level)

print('levels: ', levels)
print()
print('steps: ', np.diff(levels))

Answer 1 (accepted, score 2): 2016-03-31 04:12:20

Answer 2 (score 1): 2016-03-31 23:09:40
