How do I convert a list of arrays into a single multidimensional numpy array?

Question

I am trying to extract features from .wav files by using MFCCs computed from them.

I'm having trouble converting my list of MFCCs to a numpy array. From my understanding, the error is due to the MFCCs within the list not all having the same dimensions, but I'm not sure of the best way to resolve this.

When running the code below:

X = []
y = []
_min, _max = float('inf'), -float('inf')
for _ in tqdm(range(len(new_dataset))):
    rand_class = np.random.choice(class_distribution.index, p=prob_distribution)
    file = np.random.choice(new_dataset[new_dataset.word == rand_class].index)
    label = new_dataset.at[file, 'word']
    X_sample = new_dataset.at[file, 'coeff']
    _min = min(np.amin(X_sample), _min)
    _max = max(np.amax(X_sample), _max)
    X.append(X_sample if config.mode == 'conv' else X_sample.T)
    y.append(classes.index(label))
X, y = np.array(X), np.array(y)     # crashes here

I get the following error message:

Traceback (most recent call last):

  File "<ipython-input-150-8689abab6bcf>", line 14, in <module>
    X, y = np.array(X), np.array(y)

ValueError: could not broadcast input array from shape (13,97) into shape (13)

Adding print(X_sample.shape) in the loop produces:

(13, 74)
(13, 83)
(13, 99)
(13, 99)
(13, 99)
(13, 55)
(13, 92)
(13, 99)
(13, 99)
(13, 78)
...

From checking, it seems the MFCCs don't all have the same shape, because the recordings are not all the same length.

I'd like to know whether I'm correct in assuming that this is the issue and, if so, how to fix it. If this isn't the issue, I'd equally like to know the solution.

Thanks in advance!

Answer 1

This reproduces your error:

In [186]: np.array([np.zeros((4,5)),np.ones((4,6))])                            
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-186-e369332b8a05> in <module>
----> 1 np.array([np.zeros((4,5)),np.ones((4,6))])

ValueError: could not broadcast input array from shape (4,5) into shape (4)

If the arrays all have the same shape:

In [187]: np.array([np.zeros((4,6)),np.ones((4,6))]).shape                      
Out[187]: (2, 4, 6)

If one or more differs in the first dimension, we get an object dtype array, essentially an array wrapper around the list:

In [188]: np.array([np.zeros((4,6)),np.ones((3,6))]).shape                      
Out[188]: (2,)

Don't try to combine arrays that (may) differ in shape unless you understand what you need, and what you intend to do with the result. It is possible to make an object dtype array with the first case, but the construction process is a bit roundabout. I won't go into that unless you really need such an array.
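One way to act on this advice is to inspect the shapes before combining anything. The sketch below is only an illustration: `stack_if_uniform` is a hypothetical helper name, not part of NumPy, and it assumes the list holds NumPy arrays. Note also that recent NumPy releases raise an error for mismatched shapes instead of silently building an object array, so an explicit check like this gives a clearer message either way.

```python
import numpy as np

def stack_if_uniform(arrays):
    """Stack into one (n, ...) array only when every shape matches.

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    shapes = {a.shape for a in arrays}
    if len(shapes) != 1:
        raise ValueError(f"cannot stack, found differing shapes: {sorted(shapes)}")
    return np.stack(arrays)

same = [np.zeros((4, 6)), np.ones((4, 6))]
print(stack_if_uniform(same).shape)    # (2, 4, 6)

ragged = [np.zeros((4, 5)), np.ones((4, 6))]
try:
    stack_if_uniform(ragged)
except ValueError as err:
    print(err)                         # reports the two shapes that were found
```

Running the same set comprehension on the MFCC list from the question would show at a glance how many distinct lengths are present.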

Answer 2

You will need to truncate or pad the time dimension so that all the arrays have the same size. If the lengths vary a lot, you can use fixed-length analysis windows (say, 1 or 10 seconds of MFCCs) and have several of these per input audio clip. This principle is shown in "How to use a context window to segment a whole log Mel-spectrogram (ensuring the same number of segments for all the audios)?"
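A minimal sketch of both ideas, assuming each MFCC is a NumPy array of shape (n_coeffs, n_frames) as in the question. The helper names `pad_or_truncate` and `split_into_windows`, the target length of 99 frames, and the window and hop sizes are illustrative assumptions, not values taken from the question; zero-padding is also just one possible padding choice.

```python
import numpy as np

def pad_or_truncate(mfcc, n_frames):
    """Return an (n_coeffs, n_frames) array: cut extra frames, zero-pad missing ones."""
    mfcc = mfcc[:, :n_frames]               # truncate if too long
    missing = n_frames - mfcc.shape[1]      # >= 0 after truncation
    return np.pad(mfcc, ((0, 0), (0, missing)), mode="constant")

def split_into_windows(mfcc, window, hop):
    """Cut an (n_coeffs, n_frames) array into fixed-length windows along time.

    Returns shape (n_windows, n_coeffs, window); trailing frames that do not
    fill a whole window are dropped.
    """
    starts = range(0, mfcc.shape[1] - window + 1, hop)
    windows = [mfcc[:, s:s + window] for s in starts]
    if not windows:                          # clip shorter than one window
        return np.empty((0, mfcc.shape[0], window), dtype=mfcc.dtype)
    return np.stack(windows)

# Stand-in data with the kind of shapes printed in the question.
rng = np.random.default_rng(0)
X = [rng.normal(size=(13, n)) for n in (74, 83, 99, 55)]

X_fixed = np.stack([pad_or_truncate(m, 99) for m in X])
print(X_fixed.shape)                                        # (4, 13, 99)
print(split_into_windows(X[2], window=40, hop=20).shape)    # (3, 13, 40)
```

Padding keeps one example per clip, while windowing produces several examples per clip, so with windowing the labels in `y` would need to be repeated once per window.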

Answer 3

This reproduces your error:

In [186]: np.array([np.zeros((4,5)),np.ones((4,6))])                            
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-186-e369332b8a05> in <module>
----> 1 np.array([np.zeros((4,5)),np.ones((4,6))])

ValueError: could not broadcast input array from shape (4,5) into shape (4)

If the arrays all have the same shape:

In [187]: np.array([np.zeros((4,6)),np.ones((4,6))]).shape                      
Out[187]: (2, 4, 6)

If one or more differs in the first dimension, we get an object dtype array, essentially an array wrapper around the list:

In [188]: np.array([np.zeros((4,6)),np.ones((3,6))]).shape                      
Out[188]: (2,)

The first case works if we create the object array first and then assign the list to it:

In [189]: arr = np.zeros(2,object)                                              
In [190]: arr[:] = [np.zeros((4,5)),np.ones((4,6))]                             
In [191]: arr                                                                   
Out[191]: 
array([array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]]),
       array([[1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.]])], dtype=object)
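The same construction can be wrapped in a small function. This is a sketch only, and `to_object_array` is a hypothetical name. It fills the array one element at a time instead of using `arr[:] = ...`, on the assumption that slice assignment can run into broadcasting problems when all the arrays happen to share a shape.

```python
import numpy as np

def to_object_array(arrays):
    """Build a 1-D object array whose elements are the given arrays."""
    out = np.empty(len(arrays), dtype=object)
    for i, a in enumerate(arrays):
        out[i] = a          # store each array as a single object element
    return out

arr = to_object_array([np.zeros((4, 5)), np.ones((4, 6))])
print(arr.shape, arr.dtype)        # (2,) object
print([a.shape for a in arr])      # [(4, 5), (4, 6)]
```

Such an array behaves much like the original list, so most numeric operations and most machine learning libraries will still need padded or windowed input of uniform shape.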

3 answers

Answer 1 1 2019-11-29 05:03:56

Answer 2 1 2019-12-06 22:35:26

Answer 3 0 2019-11-29 05:17:01
