Audio Data Augmentation in Python
I am using the function below to augment audio data generated from WAV audio files.
import librosa

# load_wav and sample_rate are defined elsewhere in the asker's code (not shown).
def generate_augmented_data(file_path):
    augmented_data = []
    samples = load_wav(file_path, get_duration=False)
    # 3 time-stretch rates x 3 pitch shifts = 9 augmented copies per file
    for time_value in [0.7, 1, 1.3]:
        for pitch_value in [-1, 0, 1]:
            time_stretch_data = librosa.effects.time_stretch(samples, rate=time_value)
            final_data = librosa.effects.pitch_shift(time_stretch_data, sr=sample_rate, n_steps=pitch_value)
            augmented_data.append(final_data)
    return augmented_data
I also need to augment the class labels, and I am having difficulty with it. I tried the code below, but it does not give the expected result:
## generating augmented data.
def generate_augmented_data_label(file_path, label):
    augmented_data = []
    augmented_label = []
    samples = load_wav(file_path, get_duration=False)
    for time_value in [0.7, 1, 1.3]:
        for pitch_value in [-1, 0, 1]:
            time_stretch_data = librosa.effects.time_stretch(samples, rate=time_value)
            final_data = librosa.effects.pitch_shift(time_stretch_data, sr=sample_rate, n_steps=pitch_value)
            augmented_data.append(final_data)
            augmented_label.append(label)
    return augmented_data, augmented_label
Before augmentation, the data and labels each have 1600 entries, as shown below. The augmentation is applied as follows:
X_train.reset_index(inplace=True, drop=True)
y_train.reset_index(inplace=True, drop=True)
X_train_augmented_data = []
y_train_augmented_data = []
for i in range(len(X_train)):
    #print(i)
    t1 = X_train.iloc[i]
    t2 = y_train[i]
    tmp1, tmp2 = generate_augmented_data_label(t1, t2)
    #print(tmp1,tmp2)
    X_train_augmented_data.append(tmp1)
    y_train_augmented_data.append(tmp2)
len(X_train)
1600
len(y_train)
1600
print(len(X_train_augmented_data))
print(len(y_train_augmented_data))
After data augmentation and an additional masking step, the shapes are as follows:
augmented_train_data_mask = []
for i in range(0, len(augmented_train_data_pad)):
    augmented_train_data_mask.append(list(map(bool, augmented_train_data_pad[i])))
augmented_train_data_mask = np.array(augmented_train_data_mask)
print(augmented_train_data_pad.shape)
print(augmented_train_data_mask.shape)
(14400, 17640)
(14400, 17640)
However, the label length is still 1600. Later, when I pass these into an LSTM model, I get a shape mismatch error:
ValueError: Data cardinality is ambiguous:
x sizes: 14400, 14400
y sizes: 1600
Make sure all arrays contain the same number of samples.
I am looking for help resolving this issue.
You may refer to this link:
https://www.geeksforgeeks.org/python-add-similar-value-multiple-times-in-list/
type(y_train) = pandas Series
from itertools import repeat
new_label = []
for index, value in y_train.items():
    # 2 is only an example count; use the number of augmented copies per file
    # (9 in the question: 3 time-stretch rates x 3 pitch shifts, 1600 * 9 = 14400).
    new_label.extend(repeat(value, 2))
len(new_label)
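The same idea can be applied directly inside the augmentation loop. In the question, `append` adds one nested list per file, so the label list keeps a length of 1600, while `extend` would add the 9 labels individually. Below is a minimal sketch of that difference. It assumes a hypothetical stand-in, `fake_augment`, in place of `generate_augmented_data_label`, so that it runs without librosa or audio files; the file names and labels are made up for illustration.

```python
def fake_augment(file_path, label, n_copies=9):
    """Hypothetical stand-in for generate_augmented_data_label.

    Returns n_copies placeholder samples and n_copies copies of the label,
    mirroring the 3 x 3 time-stretch / pitch-shift grid in the question.
    """
    data = [f"{file_path}_aug{k}" for k in range(n_copies)]
    labels = [label] * n_copies
    return data, labels


# Made-up inputs standing in for X_train and y_train.
files = ["a.wav", "b.wav", "c.wav"]
labels = [0, 1, 0]

X_aug, y_aug = [], []
for path, label in zip(files, labels):
    data, lab = fake_augment(path, label)
    X_aug.extend(data)  # extend adds the 9 items individually; append would nest them
    y_aug.extend(lab)

# Both lists now have one entry per augmented sample.
print(len(X_aug), len(y_aug))
```

With `extend`, the data and labels stay aligned sample by sample, so padding and masking can then be applied to the flat data list without changing the label count.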
You can use the NumPy repeat function to replicate the elements of your NumPy array.
Example:
In: arr = np.arange(3)
Out: array([0, 1, 2])
In: arr.repeat(3)
Out: array([0, 0, 0, 1, 1, 1, 2, 2, 2])
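Applied to the question, the repeat count would be 9, since each file yields 3 time-stretch rates times 3 pitch shifts. The sketch below assumes a small made-up label array in place of the 1600 real labels. If the labels are a pandas Series, `y_train.to_numpy()` could be passed instead. It also assumes the augmented samples are stored file by file, in the order the loop produces them.

```python
import numpy as np

# Made-up labels standing in for the 1600 original ones.
y_train = np.array([0, 1, 2])

# 3 time-stretch rates x 3 pitch shifts = 9 augmented copies per file.
n_copies = len([0.7, 1, 1.3]) * len([-1, 0, 1])

# np.repeat keeps the copies of each label adjacent (0,0,...,1,1,...),
# which matches data stored file by file. np.tile would instead cycle
# through the labels (0,1,2,0,1,2,...) and misalign them.
y_train_augmented = np.repeat(y_train, n_copies)

print(y_train_augmented.shape)
```

With 1600 labels this gives 14400 entries, matching the first dimension of the padded data in the question.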
I hope this meets your requirement.