Resampling, interpolating matrix

I'm trying to interpolate some data for the purpose of plotting. For instance, given N data points, I'd like to be able to generate a "smooth" plot, made up of 10*N or so interpolated data points.

My approach is to generate an N-by-10*N matrix and compute the inner product of the original vector and the matrix I generated, yielding a 1-by-10*N vector. I've already worked out the math I'd like to use for the interpolation, but my code is pretty slow. I'm pretty new to Python, so I'm hopeful that some of the experts here can give me some ideas of ways I can try to speed up my code.

I think part of the problem is that generating the matrix requires 10*N^2 calls to the following function:

def sinc(x):
    import math
    try:
        return math.sin(math.pi * x) / (math.pi * x)
    except ZeroDivisionError:
        return 1.0

(This comes from sampling theory. Essentially, I'm attempting to recreate a signal from its samples, and upsample it to a higher frequency.)

The matrix is generated by the following:

def resampleMatrix(Tso, Tsf, o, f):
    from numpy import array as npar
    retval = []

    for i in range(f):
        retval.append([sinc((Tsf*i - Tso*j)/Tso) for j in range(o)])

    return npar(retval)

I'm considering breaking up the task into smaller pieces because I don't like the idea of an N^2 matrix sitting in memory. I could probably make 'resampleMatrix' into a generator function and do the inner product row-by-row, but I don't think that will speed up my code much until I start paging stuff in and out of memory.

Thanks in advance for your suggestions!

This is upsampling. See "Help with resampling/upsampling" for some example solutions.

A fast way to do this (for offline data, like your plotting application) is to use FFTs. This is what SciPy's native resample() function does. It assumes a periodic signal, though, so it's not exactly the same. See this reference:

Here's the second issue regarding time-domain real signal interpolation, and it's a big deal indeed. This exact interpolation algorithm provides correct results only if the original x(n) sequence is periodic within its full time interval.

Your function assumes the signal's samples are all 0 outside of the defined range, so the two methods will diverge away from the center point. If you pad the signal with lots of zeros first, it will produce a very close result. There are several more zeros past the edge of the plot not shown here:

[image]
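The zero-padding comparison described above can be sketched roughly as follows. This is only an illustration, not the code behind the plot: the helper names `sinc_matrix_upsample` and `fft_upsample_padded` and the pad length of 512 samples are assumptions made up for the example. It pads the signal with zeros, upsamples it with scipy.signal.resample, then cuts the padding away again so the result can be compared with the sinc-matrix approach from the question.

```python
import numpy as np
from scipy.signal import resample

def sinc_matrix_upsample(x, factor):
    # Reference: the question's sinc-matrix method (zeros assumed outside the data).
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(n * factor) / float(factor)        # output times, in input-sample units
    return np.sinc(t[:, np.newaxis] - np.arange(n)[np.newaxis, :]).dot(x)

def fft_upsample_padded(x, factor, pad=512):
    # Hypothetical helper: zero-pad both ends, FFT-resample, drop the padding.
    x = np.asarray(x, dtype=float)
    padded = np.concatenate((np.zeros(pad), x, np.zeros(pad)))
    up = resample(padded, len(padded) * factor)      # assumes a periodic signal
    return up[pad * factor:(pad + len(x)) * factor]

x = np.arange(16.0)                                  # a ramp: clearly not periodic
reference = sinc_matrix_upsample(x, 4)
err_padded = np.max(np.abs(fft_upsample_padded(x, 4) - reference))
err_plain = np.max(np.abs(resample(x, len(x) * 4) - reference))
print("max error with padding:    %g" % err_padded)
print("max error without padding: %g" % err_plain)
```

The more zeros are added, the closer the FFT result should get to the sinc-matrix result, at the cost of a longer FFT.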

Cubic interpolation won't be correct for resampling purposes. This example is an extreme case (near the sampling frequency), but as you can see, cubic interpolation isn't even close. For lower frequencies it should be pretty accurate.

If you want to interpolate data in a quite general and fast way, splines or polynomials are very useful. Scipy has the scipy.interpolate module, which is very useful. You can find many examples in the official pages.

Your question isn't entirely clear; you're trying to optimize the code you posted, right?

Re-writing sinc like this should speed it up considerably. This implementation avoids checking that the math module is imported on every call, doesn't do attribute access three times, and replaces exception handling with a conditional expression:

from math import sin, pi
def sinc(x):
    return (sin(pi * x) / (pi * x)) if x != 0 else 1.0

You could also try avoiding creating the matrix twice (and holding it twice in parallel in memory) by creating a numpy.array directly (not from a list of lists):

import numpy

def resampleMatrix(Tso, Tsf, o, f):
    retval = numpy.zeros((f, o))  # allocate the f-by-o result once
    for i in xrange(f):
        for j in xrange(o):
            retval[i, j] = sinc((Tsf*i - Tso*j)/Tso)
    return retval

(replace xrange with range on Python 3.0 and above)

Finally, you can create rows with numpy.arange as well as calling numpy.sinc on each row or even on the entire matrix:

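import numpy

def resampleMatrix(Tso, Tsf, o, f):
    retval = numpy.zeros((f, o))
    for i in xrange(f):
        # Row i holds Tsf*i/Tso - j for j = 0 .. o-1; subtracting an integer
        # arange guarantees exactly o entries despite float rounding
        retval[i] = Tsf*i / Tso - numpy.arange(o)
    return numpy.sinc(retval)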

This should be significantly faster than your original implementation. Try different combinations of these ideas and test their performance, see which works out the best!

Here's a minimal example of 1d interpolation with scipy -- not as much fun as reinventing, but.
The plot looks like sinc, which is no coincidence: try google spline resample "approximate sinc".
(Presumably less local / more taps ⇒ better approximation, but I have no idea how local UnivariateSplines are.)

""" interpolate with scipy.interpolate.UnivariateSpline """
from __future__ import division
import numpy as np
from scipy.interpolate import UnivariateSpline
import pylab as pl

N = 10 
H = 8
x = np.arange(N+1)
xup = np.arange( 0, N, 1/H )
y = np.zeros(N+1);  y[N//2] = 100

interpolator = UnivariateSpline( x, y, k=3, s=0 )  # s=0 interpolates
yup = interpolator( xup )
np.set_printoptions( 1, threshold=100, suppress=True )  # .1f
print("yup: %s" % yup)  # works on both Python 2 and Python 3

pl.plot( x, y, "green",  xup, yup, "blue" )
pl.show()

Added Feb 2010: see also basic-spline-interpolation-in-a-few-lines-of-numpy

I'm not quite sure what you're trying to do, but there are some speedups you can do to create the matrix. Braincore's suggestion to use numpy.sinc is a first step, but the second is to realize that numpy functions want to work on numpy arrays, where they can do loops at C speed, and can do it faster than on individual elements.

import numpy

def resampleMatrix(Tso, Tsf, o, f):
    # Column of output indices (f x 1) minus row of input indices (1 x o)
    retval = numpy.sinc((Tsf*numpy.arange(f)[:,numpy.newaxis]
                         -Tso*numpy.arange(o)[numpy.newaxis,:])/Tso)
    return retval

The trick is that by indexing the aranges with numpy.newaxis, numpy converts the array with shape (f,) to one with shape f x 1, and the array with shape (o,) to shape 1 x o. At the subtraction step, numpy will "broadcast" each input to act as an f x o shaped array and then do the subtraction. ("Broadcast" is numpy's term, reflecting the fact that no additional copy is made to stretch the f x 1 to f x o.)

Now the numpy.sinc can iterate over all the elements in compiled code, much quicker than any for-loop you could write.

(There's an additional speed-up available if you do the division before the subtraction, especially since in the second term the division cancels the multiplication.)
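A minimal sketch of the "divide first" idea, assuming the same Tso, Tsf, o and f arguments as the question; the function name `resample_matrix_divided` is made up for the example. Dividing before subtracting turns Tso*j/Tso into plain j, so only the f output indices need scaling:

```python
import numpy as np

def resample_matrix_divided(Tso, Tsf, o, f):
    # (Tsf*i - Tso*j)/Tso  ==  (Tsf/Tso)*i - j
    rows = (Tsf / float(Tso)) * np.arange(f)[:, np.newaxis]   # shape f x 1
    cols = np.arange(o)[np.newaxis, :]                        # shape 1 x o
    return np.sinc(rows - cols)                               # broadcast to f x o

# Same values as the undivided formula, up to floating-point rounding.
Tso, Tsf, o, f = 0.5, 0.05, 5, 50
direct = np.sinc((Tsf * np.arange(f)[:, np.newaxis]
                  - Tso * np.arange(o)[np.newaxis, :]) / Tso)
print(np.allclose(resample_matrix_divided(Tso, Tsf, o, f), direct))
```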

The only drawback is that you now pay for an extra Nx10*N array to hold the difference. This might be a dealbreaker if N is large and memory is an issue.
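If memory does become the problem, one possible compromise is to build the matrix a block of rows at a time and apply each block to the samples straight away. The sketch below is only an illustration; `resample_chunked` and its `chunk` parameter are hypothetical names, and the best chunk size depends on the machine.

```python
import numpy as np

def resample_chunked(samples, Tso, Tsf, f, chunk=1024):
    # Never holds more than `chunk` rows of the sinc matrix at once.
    samples = np.asarray(samples, dtype=float)
    cols = np.arange(len(samples))[np.newaxis, :]
    ratio = Tsf / float(Tso)
    out = np.empty(f)
    for start in range(0, f, chunk):
        stop = min(start + chunk, f)
        rows = ratio * np.arange(start, stop)[:, np.newaxis]
        out[start:stop] = np.sinc(rows - cols).dot(samples)
    return out

samples = np.array([0.0, 1.0, 0.5, -0.25, 2.0])
upsampled = resample_chunked(samples, Tso=1.0, Tsf=0.1, f=50, chunk=16)
print(upsampled[::10])   # values at the original sample positions
```

Each block is still computed with vectorized numpy calls, so most of the speed is kept while the peak memory is bounded by chunk times the number of input samples.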

Otherwise, you should be able to write this using numpy.convolve. From what little I just learned about sinc-interpolation, I'd say you want something like numpy.convolve(orig,numpy.sinc(numpy.arange(j)),mode="same"). But I'm probably wrong about the specifics.
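One way the convolution idea might be worked out, as a sketch under the assumption that the upsampling factor is an integer (the call quoted above would need more than a plain arange kernel). The name `upsample_by_convolution` is made up for the example: zeros are inserted between the original samples, and the result is convolved with a sinc kernel sampled at the new, finer spacing.

```python
import numpy as np

def upsample_by_convolution(orig, factor):
    orig = np.asarray(orig, dtype=float)
    # Zero-stuffing: original samples land on every `factor`-th slot.
    stuffed = np.zeros(len(orig) * factor)
    stuffed[::factor] = orig
    # Kernel sinc(t / factor) for t = -half .. half, long enough to reach
    # from any output position to any input sample.
    half = len(stuffed) - 1
    kernel = np.sinc(np.arange(-half, half + 1) / float(factor))
    full = np.convolve(stuffed, kernel, mode="full")
    # Output sample m sits at index m + half of the full convolution.
    return full[half:half + len(stuffed)]

orig = np.array([0.0, 1.0, 0.5, -0.25, 2.0])
smooth = upsample_by_convolution(orig, 10)
print(smooth[::10])   # values at the original sample positions
```

This avoids storing the matrix, but numpy.convolve still does a quadratic amount of work with a kernel this long, so it is not automatically faster than the broadcast version.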

If your only interest is to 'generate a "smooth" plot' I would just go with a simple polynomial spline curve fit:

For any two adjacent data points the coefficients of a third degree polynomial function can be computed from the coordinates of those data points and the two additional points to their left and right (disregarding boundary points). This will generate points on a nice smooth curve with a continuous first derivative. There's a straightforward formula for converting 4 coordinates to 4 polynomial coefficients but I don't want to deprive you of the fun of looking it up ;o).
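For readers who would rather skip the lookup: one well-known scheme that matches this description is the uniform Catmull-Rom spline. Whether that is the exact formula meant here is an assumption. The sketch below also assumes equally spaced data and handles the two boundary segments by repeating the end points; `catmull_rom_upsample` is a name made up for the example.

```python
import numpy as np

def catmull_rom_upsample(y, factor):
    # Needs at least two data points; returns (len(y) - 1) * factor + 1 values.
    y = np.asarray(y, dtype=float)
    padded = np.concatenate(([y[0]], y, [y[-1]]))   # repeat the end points
    # One row per segment between y[k] and y[k+1].
    p0 = padded[:-3, np.newaxis]
    p1 = padded[1:-2, np.newaxis]
    p2 = padded[2:-1, np.newaxis]
    p3 = padded[3:, np.newaxis]
    t = (np.arange(factor) / float(factor))[np.newaxis, :]
    seg = 0.5 * (2.0 * p1
                 + (p2 - p0) * t
                 + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t ** 2
                 + (3.0 * p1 - p0 - 3.0 * p2 + p3) * t ** 3)
    return np.append(seg.ravel(), y[-1])

y = np.array([0.0, 1.0, 4.0, 9.0, 16.0])
curve = catmull_rom_upsample(y, 10)
print(curve[::10])   # the curve passes through the original points
```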

Small improvement. Use the built-in numpy.sinc(x) function which runs in compiled C code.
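A quick check that the swap is safe: numpy.sinc is the normalized sinc, sin(pi*x)/(pi*x) with the value 1 at x = 0, which is the same definition as the hand-written function in the question. The helper `sinc_scalar` below is just a stand-in for that function.

```python
import math
import numpy as np

def sinc_scalar(x):
    # Stand-in for the question's pure-Python sinc.
    return math.sin(math.pi * x) / (math.pi * x) if x != 0 else 1.0

xs = np.linspace(-3.0, 3.0, 13)
vectorized = np.sinc(xs)                           # one call for the whole array
looped = np.array([sinc_scalar(x) for x in xs])    # one Python call per element
print(np.allclose(vectorized, looped))
```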

Possible larger improvement: Can you do the interpolation on the fly (as the plotting occurs)? Or are you tied to a plotting library that only accepts a matrix?

I recommend that you check your algorithm, as it is a non-trivial problem. Specifically, I suggest you gain access to the article "Function Plotting Using Conic Splines" (IEEE Computer Graphics and Applications) by Hu and Pavlidis (1991). Their algorithm implementation allows for adaptive sampling of the function, such that the rendering time is smaller than with regularly spaced approaches.

The abstract follows:

A method is presented whereby, given a mathematical description of a function, a conic spline approximating the plot of the function is produced. Conic arcs were selected as the primitive curves because there are simple incremental plotting algorithms for conics already included in some device drivers, and there are simple algorithms for local approximations by conics. A split-and-merge algorithm for choosing the knots adaptively, according to shape analysis of the original function based on its first-order derivatives, is introduced.
