I have two one-dimensional vectors. One contains data measured through a measurement system; the other contains calibration data with exactly the same shape and timing (each is basically a single pulse, and the two pulses are synced in the time domain).
I want to match the calibration curve to the originally measured data through a simple transformation: original_data = (calibration_data - offset) * gain.
I need the best approach to find the offset and gain parameters such that the two traces look as similar as possible. My idea was to minimise the least-squares sum sum_i( ((calibration_i - offset) * gain - measured_i) ** 2 ) over the two data sets, by tweaking the gain and offset of the transformation function.
I've implemented a brute-force algorithm of this kind:
import numpy as np

offset = 0
gain = 1.0
firstIteration = True
lastlstsq = 0
iterations = 0
for ioffset in np.arange(-32768, 32768, 50):
    for igain in np.arange(1, 5, 0.1):
        # prepare the trace by applying the transformation:
        int1 = [(c - ioffset) * igain for c in self.fetcher.yvalues['int1']]
        # this is a pretty heavy computation here:
        lstsq = sum((a - b) ** 2 for a, b in zip(self.fetcher.yvalues['int0'], int1))
        if firstIteration:
            # just store the first result
            lastlstsq = lstsq
            offset = ioffset
            gain = igain
            firstIteration = False
        elif lstsq < lastlstsq:
            # got a better match:
            lastlstsq = lstsq
            offset = ioffset
            gain = igain
        print("Iteration", iterations, "squares =", lstsq, "offset =", offset, "gain =", gain)
        iterations += 1
It finds the best match, but it is way too slow and not very precise: I'd like to search for igain in steps of 0.01 and ioffset in steps of 0.5, and at that resolution this algorithm is completely useless.
Is there any way to solve this kind of optimisation in a pythonic way? (Or is there a better approach to finding the gain and offset values that give the best match?)
Unfortunately I'm limited to numpy (no scipy), but any kind of hint is appreciated.
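As an aside, even the brute-force search above can be sped up considerably by vectorising the inner loop with numpy broadcasting. A sketch with hypothetical stand-in data (`calibration`/`measured` replace `self.fetcher.yvalues['int1']`/`['int0']`, and the true offset/gain are made-up values placed on the search grid):

```python
import numpy as np

# Hypothetical stand-ins for self.fetcher.yvalues['int1'] / ['int0']:
# a calibration trace and a measured trace with a known offset and gain.
rng = np.random.default_rng(0)
calibration = rng.normal(size=500)
measured = (calibration - 132.0) * 2.3

offsets = np.arange(-32768, 32768, 50)   # candidate offsets (step 50)
gains = np.arange(1, 5, 0.1)             # candidate gains (step 0.1)

best_err, best_offset, best_gain = np.inf, None, None
for off in offsets:
    # Vectorise over all gains and samples at once: shape (n_gains, n_samples).
    resid = (calibration - off) * gains[:, None] - measured
    sq = (resid ** 2).sum(axis=1)
    j = np.argmin(sq)
    if sq[j] < best_err:
        best_err, best_offset, best_gain = sq[j], off, gains[j]
```

This replaces the per-sample Python loop with array operations but is still a grid search, so its precision is still limited by the step sizes.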
If the two signals are supposed to be the same shape, just y-shifted and y-scaled, you should find that
gain = std_dev(measured) / std_dev(calibration)
offset = average(calibration - (measured / gain))
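In numpy these two formulas could be written as follows (a sketch on synthetic data; the pulse shape and the true gain/offset are made-up values used only to check that the formulas recover them):

```python
import numpy as np

# Synthetic single pulse as calibration data (made-up values), and a
# measured trace built from it with a known gain and offset.
true_gain, true_offset = 2.5, 120.0
calibration = np.exp(-np.linspace(-3, 3, 200) ** 2)
measured = (calibration - true_offset) * true_gain

# The closed-form estimates from the formulas above:
gain = np.std(measured) / np.std(calibration)
offset = np.mean(calibration - measured / gain)
```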
If you are happy with a solution of the form
measuredData = calibrationData * gain + offset
then finding it is simply a linear regression problem. This is probably best solved using the normal equation, which gives you the fit that minimises the sum of squared errors, which I think is what you are after.
Concretely, in Python I guess the solution could be found using the numpy function pinv:
from numpy.linalg import pinv
from numpy import column_stack, ones, dot
# design matrix with a column of ones for the offset term
A = column_stack((calibrationData, ones(len(calibrationData))))
# normal equation: theta = (A^T A)^-1 A^T y
gain, offset = dot(pinv(dot(A.T, A)), dot(A.T, measuredData))
Hope this helps.
With the help of user3235916 I managed to write down the following piece of code:
import numpy as np
measuredData = np.array(yvalues['int1'])
calibrationData = np.array(yvalues['int0'])
A = np.vstack( [measuredData, np.ones(len(measuredData))]).T
gain, offset = np.linalg.lstsq(A, calibrationData, rcond=None)[0]
Then I could use the following transformation to map measuredData onto calibrationData:
measuredData * gain + offset
Fits perfectly (at least visually).
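A quick numerical sanity check of this fit, using synthetic data with hypothetical values in place of `yvalues['int0']`/`['int1']`:

```python
import numpy as np

# Build a synthetic calibration pulse and a measured trace derived from it
# with a known transformation, then recover the parameters with lstsq.
x = np.linspace(-3, 3, 300)
calibrationData = np.exp(-x ** 2)              # a single pulse
measuredData = (calibrationData - 0.2) * 1.7   # known offset and gain

A = np.vstack([measuredData, np.ones(len(measuredData))]).T
gain, offset = np.linalg.lstsq(A, calibrationData, rcond=None)[0]

# Applying the fitted transformation maps measuredData back onto the pulse.
recovered = measuredData * gain + offset
```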