How to convert a list of arrays into a single multidimensional numpy array?

Question

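I am trying to extract features from .wav files using the MFCCs computed from those files.

I am having trouble converting my list of MFCCs to a numpy array. As far as I understand, the error occurs because the MFCCs in the list do not all have the same dimensions, but I am not sure of the best way to resolve this.

When running the code below: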

import numpy as np
from tqdm import tqdm

# new_dataset, class_distribution, prob_distribution, classes and config are not shown here.
X = []
y = []
_min, _max = float('inf'), -float('inf')
for _ in tqdm(range(len(new_dataset))):
    rand_class = np.random.choice(class_distribution.index, p=prob_distribution)
    file = np.random.choice(new_dataset[new_dataset.word == rand_class].index)
    label = new_dataset.at[file, 'word']
    X_sample = new_dataset.at[file, 'coeff']
    _min = min(np.amin(X_sample), _min)
    _max = max(np.amax(X_sample), _max)  # amax, not amin, to track the running maximum
    X.append(X_sample if config.mode == 'conv' else X_sample.T)
    y.append(classes.index(label))
X, y = np.array(X), np.array(y)     #crashes here

I get the following error message:

Traceback (most recent call last):

  File "<ipython-input-150-8689abab6bcf>", line 14, in <module>
    X, y = np.array(X), np.array(y)

ValueError: could not broadcast input array from shape (13,97) into shape (13)

Adding print(X_sample.shape) inside the loop produces:

(13, 74)
(13, 83)
(13, 99)
(13, 99)
(13, 99)
(13, 55)
(13, 92)
(13, 99)
(13, 99)
(13, 78)
...

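From inspection, it appears that the MFCCs do not all have the same shape, because the recordings are not all the same length.

Is my assumption that this is the problem correct, and if so, how do I fix it? If this is not the problem, I would equally like to know the solution.

Thanks in advance!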

Answer 1

This reproduces your error:

In [186]: np.array([np.zeros((4,5)),np.ones((4,6))])                            
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-186-e369332b8a05> in <module>
----> 1 np.array([np.zeros((4,5)),np.ones((4,6))])

ValueError: could not broadcast input array from shape (4,5) into shape (4)

If the arrays all have the same shape:

In [187]: np.array([np.zeros((4,6)),np.ones((4,6))]).shape                      
Out[187]: (2, 4, 6)

If one or more of the arrays differ in the first dimension, we get an object dtype array, which is essentially an array wrapper around a list:

In [188]: np.array([np.zeros((4,6)),np.ones((3,6))]).shape                      
Out[188]: (2,)

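Don't try to combine arrays whose shapes (may) differ unless you understand what you need and what you intend to do with the result. The first case can be turned into an object dtype array, but the construction is a bit roundabout. I won't go into that unless you really need such an array.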
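One way to act on this advice is to inspect the shapes before stacking, so that a mismatch is reported explicitly rather than surfacing as a broadcast error. The following is only a minimal sketch; the helper name `stack_if_uniform` is hypothetical and not part of NumPy.

```python
import numpy as np

def stack_if_uniform(arrays):
    """Stack arrays into one ndarray only if they all share a single shape.

    Hypothetical helper for illustration; not a NumPy function.
    """
    shapes = {a.shape for a in arrays}
    if len(shapes) != 1:
        raise ValueError(
            f"cannot stack arrays with differing shapes: {sorted(shapes)}"
        )
    # np.stack adds a new leading axis: (n_arrays, *shape)
    return np.stack(arrays)

same = [np.zeros((4, 6)), np.ones((4, 6))]
print(stack_if_uniform(same).shape)  # (2, 4, 6)

mixed = [np.zeros((4, 5)), np.ones((4, 6))]
try:
    stack_if_uniform(mixed)
except ValueError as err:
    print(err)  # the message lists the distinct shapes that were found
```

For the MFCC list in the question, a check like this would confirm that the time dimension varies before any decision is made about how to handle it.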

Answer 2

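You will need to truncate or pad the time dimension so that all the arrays have the same size. If the lengths vary a lot, you can use fixed-length analysis windows (say, MFCCs over 1 second or 10 seconds) and have multiple such windows per input audio clip. The principle is shown here: How to use a context window to segment a whole log Mel-spectrogram (ensuring the same number of segments for all the audios)?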
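Both ideas can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the helper names `pad_or_truncate` and `split_windows` are hypothetical, padding with zeros is one choice among several, the target length of 99 frames is simply the largest value printed in the question, and the window and hop sizes (in frames) are arbitrary.

```python
import numpy as np

def pad_or_truncate(mfcc, n_frames):
    """Return an (n_coeffs, n_frames) copy of mfcc, zero-padded or cut on the time axis."""
    n_coeffs, length = mfcc.shape
    out = np.zeros((n_coeffs, n_frames), dtype=mfcc.dtype)
    keep = min(length, n_frames)
    out[:, :keep] = mfcc[:, :keep]
    return out

def split_windows(mfcc, window, hop):
    """Cut mfcc into fixed-length windows; result has shape (n_windows, n_coeffs, window).

    Clips shorter than one window are zero-padded to a single window.
    Trailing frames that do not fill a whole window are dropped.
    """
    if mfcc.shape[1] < window:
        mfcc = pad_or_truncate(mfcc, window)
    starts = range(0, mfcc.shape[1] - window + 1, hop)
    return np.stack([mfcc[:, s:s + window] for s in starts])

# Stand-in data with the same kind of shapes as in the question.
rng = np.random.default_rng(0)
mfccs = [rng.normal(size=(13, n)) for n in (74, 83, 99)]

# Option 1: one fixed-size array per clip, stacked into a single 3-D array.
X = np.stack([pad_or_truncate(m, 99) for m in mfccs])
print(X.shape)  # (3, 13, 99)

# Option 2: several fixed-length windows per clip.
windows = split_windows(mfccs[2], window=40, hop=20)
print(windows.shape)  # (3, 13, 40)
```

With windowing, each clip contributes a different number of training samples, so the labels have to be repeated once per window.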

Answer 3

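The first case, with the (4,5) and (4,6) arrays that raise the ValueError, does work if we create an object dtype array first and then assign the list to it: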

In [189]: arr = np.zeros(2,object)                                              
In [190]: arr[:] = [np.zeros((4,5)),np.ones((4,6))]                             
In [191]: arr                                                                   
Out[191]: 
array([array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]]),
       array([[1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.]])], dtype=object)
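The same construction can be wrapped in a small helper. This is a sketch only, and the name `to_object_array` is hypothetical. It fills the object array one element at a time, which avoids relying on how `np.array` treats ragged input, something that has changed across NumPy versions.

```python
import numpy as np

def to_object_array(arrays):
    """Wrap a list of arrays in a 1-D object-dtype array, one element per input array.

    Hypothetical helper for illustration; not a NumPy function.
    """
    out = np.empty(len(arrays), dtype=object)
    for i, a in enumerate(arrays):
        out[i] = a  # integer indexing stores the array itself as a single element
    return out

ragged = to_object_array([np.zeros((4, 5)), np.ones((4, 6))])
print(ragged.shape, ragged.dtype)   # (2,) object
print([a.shape for a in ragged])    # [(4, 5), (4, 6)]
```

The result behaves much like a list: numeric operations do not see a (2, 4, n) array, so it is rarely what a model expecting a single multidimensional input needs.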

3 answers

Answer 1
1 2019-11-29 05:03:56

Answer 2
1 2019-12-06 22:35:26

Answer 3
0 2019-11-29 05:17:01
