
LSTM multiple features, multiple classes, multiple outputs

I'm trying to use an LSTM classifier to generate music based on some MIDI files that I have.

The LSTM uses two features: each note's pitch and its duration.

For illustration, let's say we have:

  • Pitches: ["A", "B", "C"]

  • Durations: ["0.5", "1", "1.5"]

As you can imagine, a generated note has to have both pitch and duration.

I tried to do it with a MultiLabelBinarizer:

from sklearn.preprocessing import MultiLabelBinarizer

# one label pair for every (pitch, duration) combination
labels = [[x, y] for x in all_pitches for y in all_durations]

mlb = MultiLabelBinarizer()
mlb_value = mlb.fit_transform(labels)

This divides the classes as intended, but the problem comes at prediction time:

prediction = model.predict_proba(prediction_input)

# take the two classes with the highest predicted probability
indexes = np.argsort(prediction, axis=None)[::-1]
index1 = indexes[0]
index2 = indexes[1]

result1 = mlb.classes_[index1]
result2 = mlb.classes_[index2]

I need each generated note to have both a pitch and a duration, so this approach doesn't seem to work for me (I keep getting the same two pitches over and over).
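To see why this happens, here is a minimal sketch (using the example pitches and durations above, with a made-up probability vector) showing that MultiLabelBinarizer flattens pitches and durations into one shared class list, so the top two entries of a single probability vector can both be pitches:

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

all_pitches = ["A", "B", "C"]
all_durations = ["0.5", "1", "1.5"]

# one label pair for every (pitch, duration) combination
labels = [[x, y] for x in all_pitches for y in all_durations]

mlb = MultiLabelBinarizer()
mlb.fit_transform(labels)

# pitches and durations end up in a single, flat class list
print(mlb.classes_)  # ['0.5' '1' '1.5' 'A' 'B' 'C']

# a hypothetical probability vector over those six classes
prediction = np.array([0.05, 0.1, 0.05, 0.4, 0.3, 0.1])
indexes = np.argsort(prediction, axis=None)[::-1]

# nothing forces one pitch + one duration: here both top classes are pitches
print(mlb.classes_[indexes[0]], mlb.classes_[indexes[1]])  # A B
```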

Another thing I considered was a MultiOutputClassifier, but I can't seem to understand the difference between the two approaches, or how to actually use a MultiOutputClassifier correctly.

Thanks for the patience, and sorry for the probably stupid question.

You can feed your LSTM output into several different layers (or neural functions in general), each leading to a different output, and then train your model on all of these outputs concurrently:

from keras.models import Model
from keras.layers import Input, Dense, LSTM

# layer definitions
lstm_function = LSTM(lstm_units)                           # e.g. lstm_units = 256
pitch_function = Dense(num_pitches, activation='softmax')  # one head per feature
duration_function = Dense(num_durations, activation='softmax')
input_features = Input(shape=input_shape)                  # e.g. (timesteps, 2)

# layer applications
lstm_output = lstm_function(input_features)
pitches = pitch_function(lstm_output)
durations = duration_function(lstm_output)

# model with two outputs; each output gets its own loss
# (both heads use softmax, so both get categorical cross-entropy)
model = Model(inputs=[input_features], outputs=[pitches, durations])
model.compile(loss=['categorical_crossentropy', 'categorical_crossentropy'],
              optimizer='RMSprop')

This generalizes to arbitrary information flows, with as many layers/outputs as you need. Remember that for each output you need to define a corresponding loss (or None to leave that output untrained).
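With the two-headed model, prediction returns one probability vector per output, so every generated note gets both a pitch and a duration. A minimal numpy sketch of the decoding step, where pitch_probs and duration_probs stand in for the two softmax vectors that model.predict would return:

```python
import numpy as np

all_pitches = ["A", "B", "C"]
all_durations = ["0.5", "1", "1.5"]

# stand-ins for the two softmax vectors returned by
# model.predict(prediction_input) -> [pitch_probs, duration_probs]
pitch_probs = np.array([0.7, 0.2, 0.1])
duration_probs = np.array([0.1, 0.6, 0.3])

# each head is decoded independently, so every note is
# guaranteed to get both a pitch and a duration
pitch = all_pitches[int(np.argmax(pitch_probs))]
duration = all_durations[int(np.argmax(duration_probs))]
print(pitch, duration)  # A 1

# or sample from each distribution instead of taking the
# argmax, which tends to produce less repetitive music
rng = np.random.default_rng()
sampled_pitch = all_pitches[rng.choice(len(all_pitches), p=pitch_probs)]
```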
