NumPy: numerical integration with limits of integration

Question

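I have measured a peak that I want to integrate over a certain range.

The data I want to integrate is a numpy array of wavenumber and intensity values: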

peakQ1_2500_smoothened =
array([[ 1.95594400e+04, -3.70074342e-17,  3.26000000e+00],
       [ 1.95594500e+04,  1.66666667e-03,  4.81500000e+00],
       [ 1.95594600e+04,  2.83333333e-02,  4.80833333e+00],
       [ 1.95594700e+04,  1.33333333e-02,  4.82166667e+00],
       [ 1.95594800e+04,  5.00000000e-03,  4.92416667e+00],
       [ 1.95594900e+04,  5.55555556e-04,  4.99305556e+00],
       [ 1.95595100e+04, -7.77777778e-03,  5.03972222e+00],
       [ 1.95595200e+04, -5.55555556e-03,  4.96888889e+00],
       [ 1.95595300e+04, -1.77777778e-02,  4.91333333e+00],
       [ 1.95595400e+04,  1.38888889e-02,  4.82500000e+00],
       [ 1.95595500e+04,  7.05555556e-02,  4.85722222e+00],
       [ 1.95595600e+04,  1.43888889e-01,  4.86638889e+00],
       [ 1.95595700e+04,  1.98888889e-01,  4.85138889e+00],
       [ 1.95595800e+04,  2.84444444e-01,  4.90694444e+00],
       [ 1.95595900e+04,  4.64444444e-01,  4.93611111e+00],
       [ 1.95596000e+04,  6.61111111e-01,  4.98166667e+00],
       [ 1.95596100e+04,  9.61666667e-01,  4.96722222e+00],
       [ 1.95596200e+04,  1.23222222e+00,  4.94388889e+00],
       [ 1.95596400e+04,  1.43555556e+00,  5.02166667e+00],
       [ 1.95596500e+04,  1.53222222e+00,  5.00500000e+00],
       [ 1.95596600e+04,  1.59833333e+00,  5.03666667e+00],
       [ 1.95596700e+04,  1.66388889e+00,  4.94555556e+00],
       [ 1.95596800e+04,  1.60111111e+00,  4.92777778e+00],
       [ 1.95596900e+04,  1.42333333e+00,  4.94666667e+00],
       [ 1.95597000e+04,  1.14111111e+00,  5.00777778e+00],
       [ 1.95597100e+04,  9.52222222e-01,  5.08555556e+00],
       [ 1.95597200e+04,  7.25555556e-01,  5.09222222e+00],
       [ 1.95597300e+04,  5.80555556e-01,  5.08055556e+00],
       [ 1.95597400e+04,  3.92777778e-01,  5.09611111e+00],
       [ 1.95597500e+04,  2.43222222e-01,  5.01655556e+00],
       [ 1.95597600e+04,  1.36555556e-01,  4.99822222e+00],
       [ 1.95597700e+04,  6.32222222e-02,  4.87044444e+00],
       [ 1.95597800e+04,  3.88888889e-02,  4.91944444e+00],
       [ 1.95597900e+04,  3.22222222e-02,  4.93611111e+00],
       [ 1.95598000e+04,  2.44444444e-02,  5.10277778e+00],
       [ 1.95598100e+04,  5.11111111e-02,  5.11277778e+00],
       [ 1.95598200e+04,  4.44444444e-02,  5.21944444e+00],
       [ 1.95598300e+04,  4.33333333e-02,  5.05333333e+00],
       [ 1.95598400e+04,  3.58333333e-02,  5.08750000e+00],
       [ 1.95598500e+04,  7.50000000e-03,  5.12750000e+00],
       [ 1.95598600e+04,  4.16666667e-03,  5.22916667e+00],
       [ 1.95598800e+04, -1.33333333e-02,  3.51000000e+00]])

I found that I can integrate over the whole array:

def integratePeak(yvals, xvals):
    I = np.trapz(yvals, x = xvals)
    return I

But how do I integrate between x limits, for example from 19559.52 to 19559.78?

def integratePeak(yvals, xvals, xlower, xupper):
    '''integrate y over x from xlower to xupper'''
    return I

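I can of course supply the x and y values by referring to the array elements explicitly, as peakQ1_2500_smoothened[7:33,0] and peakQ1_2500_smoothened[7:33,1]. But I would rather define the integration limits as wavenumbers than as array indices, because different measured peaks have different array lengths.

These are the functions that reduce each wavenumber to a single data point and then take the running average: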

def averagePerWavenumber(data):
    wavenum, intensity, power = data[:,0], data[:,1], data[:,2]
    wavenum_unique, intensity_mean = npi.group_by(wavenum).mean(intensity)
    wavenum_unique, power_mean = npi.group_by(wavenum).mean(power)
    output = np.zeros(shape=(len(wavenum_unique), 3))
    output[:,0] = wavenum_unique
    output[:,1] = intensity_mean
    output[:,2] = power_mean
    return output

def smoothening(data, bins):
    output = np.zeros(shape=(len(data[:,0]), 3))
    output[:,0] = data[:,0]
    output[:,1] = np.convolve(data[:,1], np.ones(bins), mode='same') / bins
    output[:,2] = np.convolve(data[:,2], np.ones(bins), mode='same') / bins
    return output

Answer 1

def integratePeak(yvals, xvals, xlower, xupper):
    '''integrate y over x from xlower to xupper.

    Use trapz to integrate over points closest to xlower, xupper.
    
    the +1 to idx_max is for numpy half-open indexing.
    '''
    idx_min = np.argmin(np.abs(xvals - xlower))
    idx_max = np.argmin(np.abs(xvals - xupper)) + 1
    result = np.trapz(yvals[idx_min:idx_max], x=xvals[idx_min:idx_max])
    return result

As an aside, you might benefit from using Pandas for tabular data. It interoperates well with numpy arrays and, most importantly, lets you label your data:

import pandas as pd
df = pd.DataFrame(peakQ1_2500_smoothened, columns=["wave_num", "intensity", "col3"])

integratePeak(yvals=df.intensity, xvals=df.wave_num, xlower=19559.52, xupper=19559.78)

# 0.18853555549577536
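
One thing to keep in mind is that this approach snaps each limit to the nearest sample, so limits that fall between samples are effectively moved. A minimal sketch on made-up data (the name integrate_nearest and the straight-line test data are only for illustration; the getattr-style fallback is there because np.trapz was renamed np.trapezoid in NumPy 2.0):

```python
import numpy as np

# np.trapz was renamed np.trapezoid in NumPy 2.0; use whichever exists.
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

def integrate_nearest(yvals, xvals, xlower, xupper):
    """Integrate y over x between the samples nearest to the two limits."""
    xvals = np.asarray(xvals)
    yvals = np.asarray(yvals)
    idx_min = np.argmin(np.abs(xvals - xlower))
    idx_max = np.argmin(np.abs(xvals - xupper)) + 1  # half-open slice
    return trapezoid(yvals[idx_min:idx_max], x=xvals[idx_min:idx_max])

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x                                    # straight line, made-up data

on_grid = integrate_nearest(y, x, 1.0, 3.0)    # limits sit exactly on samples
off_grid = integrate_nearest(y, x, 1.2, 3.4)   # limits snap to 1.0 and 3.0
```

Both calls integrate from 1.0 to 3.0, so they return the same area even though the second call asked for a wider range. Whether that is acceptable depends on how fine the sampling is compared to the peak width.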

Answer 2

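Let's start by looking at what np.trapz actually does. The area of the i-th trapezoid is the average height times the width: 0.5 * (y[i + 1] + y[i]) * (x[i + 1] - x[i]). If you have a fixed dx instead of an x array, the last term is just a scalar. So let's rewrite your first function: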

def integrate_peak0(y, x):
    """ x can be array of same size as y or a scalar """
    # np.size also works when x is a plain Python float, which has no .size
    dx = x if np.size(x) <= 1 else np.diff(x)
    return np.sum(0.5 * (y[1:] + y[:-1]) * dx)
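
To make the formula concrete, here is a small check on made-up, unevenly spaced data. The function name trapezoid_sum and the sample values are only for illustration:

```python
import numpy as np

def trapezoid_sum(y, x):
    """Sum of trapezoid areas; x is an array like y or a scalar spacing."""
    dx = x if np.size(x) <= 1 else np.diff(x)
    return np.sum(0.5 * (y[1:] + y[:-1]) * dx)

x = np.array([0.0, 0.5, 1.5, 2.0])             # uneven spacing
y = x ** 2                                     # [0, 0.25, 2.25, 4]

areas = 0.5 * (y[1:] + y[:-1]) * np.diff(x)    # one area per trapezoid
total = trapezoid_sum(y, x)                    # sum of the three areas
uniform = trapezoid_sum(np.array([1.0, 2.0, 3.0]), 0.5)  # scalar dx
```

The three trapezoids have areas 0.0625, 1.25 and 1.5625, which add up to 2.875. With a scalar spacing of 0.5 the second call gives 2.0.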

The hard part is applying the limits of integration. Since x is sorted, you can use np.searchsorted to convert the limits into indices into the data:

limits = np.array([xlower, xupper])
indices = np.searchsorted(x, limits)
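
As a quick illustration of what np.searchsorted returns, here is a sketch on a short made-up array. With the default side="left", each index i satisfies x[i - 1] < limit <= x[i]:

```python
import numpy as np

x = np.array([10.0, 10.1, 10.2, 10.4, 10.5])   # sorted, with a gap
limits = np.array([10.15, 10.4])

# 10.15 falls between x[1] and x[2]; 10.4 matches x[3] exactly.
indices = np.searchsorted(x, limits)                        # side="left"
indices_right = np.searchsorted(x, limits, side="right")    # for comparison
```

Here indices is [2, 3]. With side="right" an exact match is placed after the matching element, so the second index becomes 4.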

If the limits always fell on exact values of x, you could use indices directly:

def integrate_peak1(y, x, xlower, xupper):
    indices = np.searchsorted(x, [xlower, xupper])
    s = slice(indices[0], indices[1] + 1)
    return np.trapz(y[s], x[s])

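Since that will almost never be the case, you can try the next simplest approach: rounding to the nearest value. You can use fancy indexing to build a 2D array of the candidate values for each bound, and then apply np.argmin to it: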

candidates = x[np.stack((indices - 1, indices), axis=0)]
offset = np.abs(candidates - limits).argmin(axis=0) - 1
indices += offset

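candidates is a 2x2 array: the columns correspond to the two bounds, and the rows hold the smaller and the larger candidate for each bound. offset is the amount by which you need to adjust each index to get the nearest neighbor. Here is a version of the integrator that picks the nearest bin for each integration limit: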

def integrate_peak2(y, x, xlower, xupper):
    limits = np.array([xlower, xupper])
    indices = np.searchsorted(x, limits)
    candidates = x[np.stack((indices - 1, indices), axis=0)]
    indices += np.abs(candidates - limits).argmin(axis=0) - 1

    s = slice(indices[0], indices[1] + 1)
    return np.trapz(y[s], x[s])
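
The index arithmetic above can be traced step by step on a small made-up array. The values here are chosen only so that one limit rounds down and the other rounds up:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
limits = np.array([1.2, 2.9])

indices = np.searchsorted(x, limits)           # first sample >= each limit
# Row 0: the sample just below each limit; row 1: the sample at or above it.
candidates = x[np.stack((indices - 1, indices), axis=0)]
# argmin is 0 when the lower candidate is closer, 1 when the upper one is,
# so subtracting 1 turns that into an index shift of -1 or 0.
offset = np.abs(candidates - limits).argmin(axis=0) - 1
nearest = indices + offset
```

For these values candidates is [[1, 2], [2, 3]], offset is [-1, 0], and nearest is [1, 3], so the limits 1.2 and 2.9 are rounded to x = 1.0 and x = 3.0.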

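The final version interpolates the values of y based on x. It can be implemented in one of two ways: you can compute the target y values and pass them to np.trapz together with the appropriate x, or you can do the work yourself with the function defined in integrate_peak0.

Given an element x[i] < xn <= x[i + 1], you can estimate yn = y[i] + (y[i + 1] - y[i]) * (xn - x[i]) / (x[i + 1] - x[i]). Here, x[i] and x[i + 1] are the candidates values shown above, y[i] and y[i + 1] are the corresponding elements of y, and xn is limits. So you can compute the interpolated values in a couple of different ways.

One way is to adjust the inputs to trapz: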
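
As a sanity check of the interpolation formula, here it is applied to a single made-up point and compared against np.interp, which performs the same linear interpolation:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 10.0, 14.0, 15.0])
xn = 1.25                                      # falls between x[1] and x[2]

i = np.searchsorted(x, xn) - 1                 # x[i] < xn <= x[i + 1]
yn = y[i] + (y[i + 1] - y[i]) * (xn - x[i]) / (x[i + 1] - x[i])
yn_check = np.interp(xn, x, y)                 # library equivalent
```

Both give 11.0: a quarter of the way from y[1] = 10 to y[2] = 14.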

def integrate_peak3a(y, x, xlower, xupper):
    limits = np.array([xlower, xupper])
    indices = np.searchsorted(x, limits)
    indices = np.stack((indices - 1, indices), axis=0)
    xi = x[indices]
    yi = y[indices]
    yn = yi[0] + np.diff(yi, axis=0) * (limits - xi[0]) / np.diff(xi, axis=0)

    indices = indices[[1, 0], [0, 1]]
    s = slice(indices[0], indices[1] + 1)
    return np.trapz(np.r_[yn[0, 0], y[s], yn[0, 1]], np.r_[xlower, x[s], xupper])

Another way is to compute the sum of the edge segments by hand:

def integrate_peak3b(y, x, xlower, xupper):
    limits = np.array([xlower, xupper])
    indices = np.searchsorted(x, limits)
    indices = np.stack((indices - 1, indices), axis=0)
    xi = x[indices]
    yi = y[indices]
    yn = yi[0] + np.diff(yi, axis=0) * (limits - xi[0]) / np.diff(xi, axis=0)

    indices = indices[[1, 0], [0, 1]]
    s = slice(indices[0], indices[1] + 1)
    return np.trapz(y[s], x[s]) - 0.5 * np.diff((yn + y[indices]) * (x[indices] - limits))

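You can of course also pass the inputs that integrate_peak3a gives to np.trapz through the "manual" computation in integrate_peak0 instead.

In all of these cases, checking that the integration limits lie within the acceptable range and are in the correct order is left as an exercise for the reader.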
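
If the index bookkeeping feels heavy, a shorter variant of the same idea is possible with np.interp and a boolean mask. This is a sketch of an alternative, not one of the functions above; the name integrate_interp is made up, and it assumes x is sorted and both limits lie inside the range of x:

```python
import numpy as np

# np.trapz was renamed np.trapezoid in NumPy 2.0; use whichever exists.
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

def integrate_interp(y, x, xlower, xupper):
    """Trapezoid integral of y(x) between limits that may fall between samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inside = (x > xlower) & (x < xupper)             # samples strictly inside
    y_lo, y_hi = np.interp([xlower, xupper], x, y)   # y at the two limits
    xs = np.r_[xlower, x[inside], xupper]
    ys = np.r_[y_lo, y[inside], y_hi]
    return trapezoid(ys, x=xs)

x = np.linspace(0.0, 4.0, 5)
y = 2.0 * x                                    # straight line, made-up data
area = integrate_interp(y, x, 1.2, 3.4)        # exact value: 3.4**2 - 1.2**2
```

For a straight line the trapezoid rule is exact, so the result should agree with the analytic value 10.12 up to rounding error.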
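
One possible shape for such a check is sketched below. The helper check_limits is hypothetical, and the exact rules (for example whether limits equal to the end points should be allowed) are an assumption you may want to adjust:

```python
import numpy as np

def check_limits(x, xlower, xupper):
    """Raise ValueError unless x is increasing and x[0] <= xlower < xupper <= x[-1]."""
    x = np.asarray(x)
    if np.any(np.diff(x) <= 0):
        raise ValueError("x must be strictly increasing")
    if not xlower < xupper:
        raise ValueError("xlower must be smaller than xupper")
    if xlower < x[0] or xupper > x[-1]:
        raise ValueError("limits must lie within the range of x")

x = np.array([0.0, 1.0, 2.0])
check_limits(x, 0.5, 1.5)                      # valid limits: no exception

try:
    check_limits(x, 1.5, 0.5)                  # limits in the wrong order
except ValueError as err:
    message = str(err)
```

Calling it at the top of any of the integrators turns silent index wrap-around or an empty slice into an explicit error.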
