
TensorFlow LSTM length output

So basically I have an LSTM model which takes in a bunch of numbers (these numbers are actually music notes that I converted into numbers; my goal is to create computer-generated music, if you were wondering). The issue I am running into is that I do not know how to make a prediction. What I want the computer to output is a list (or string, or whatever it can) of numbers that follows whatever rules it came up with during the training process. In previous projects, I only knew how to output one prediction number by giving the computer some data to predict on, but here I want a completely new list without giving the computer a starting value. Preferably the computer can generate more than one number at a time.

Here is the code that I currently have. It does not work right now:

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

n_steps = 1
X, y = split_sequence(data, n_steps)
X = X.reshape((X.shape[0], X.shape[1], 1))
X = tf.cast(X, dtype='float32')

model = Sequential()
model.add(LSTM(256, activation='relu', return_sequences=True))
#model.add(Dropout(0.2)) # I am not sure what this is, but it doesn't break my code
model.add(LSTM(128, activation='relu', return_sequences=True))
#model.add(Dropout(0.2))
model.add(LSTM(128))
#model.add(Dropout(0.2))
model.add(Dense(1, activation='linear'))
model.compile(optimizer='adam', loss='mse', metrics=['mae'])

model.fit(X, y, epochs=10, batch_size=2, verbose=2)


prediction = model.predict(X) # I want to output a list of numbers
print(prediction)

Now, my prediction is outputting a really long list of lists containing the same value, which I think is the one prediction repeated. It looks like this:

[[62.449333]
 [62.449333]
 [62.449333]
 ...
 [62.449333]
 [62.449333]
 [62.449333]]

I want a list that is not a prediction on existing data, but more like a GAN output: a brand-new list of numbers. Also, I am not sure why this prediction is outputting such a long list of lists.

data looks something like this (shortened for brevity):

[64, 76, 64, 75, 64, 76, 64, 75, 64, 76, 64, 71, 64, 74, 64, 72, 69, 64, 45, 64, 52]

The x train looks like this when n_steps = 1:

[[64], [76], [64], [75], [64], [76], [64], [75], [64], [76], [64], [71], [64], [74]]

and y looks like this, with each value being the expected output for the corresponding x-train entry:

[76, 64, 75, 64, 76, 64, 75, 64, 76, 64, 71, 64, 74, 64]
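For reference, the split_sequence helper called in the code above is not shown in the question. Assuming it follows the common sliding-window pattern (the name and exact behaviour are a guess based on the X/y samples shown), a sketch would be:

```python
import numpy as np

def split_sequence(sequence, n_steps):
    """Slide a window of n_steps over the sequence; each window becomes an
    input sample and the element right after it the target.
    (Hypothetical reconstruction -- the original helper is not shown.)"""
    X, y = [], []
    for i in range(len(sequence) - n_steps):
        X.append(sequence[i:i + n_steps])
        y.append(sequence[i + n_steps])
    return np.array(X), np.array(y)

data = [64, 76, 64, 75, 64, 76]
X, y = split_sequence(data, 1)
# X -> [[64], [76], [64], [75], [64]]
# y -> [76, 64, 75, 64, 76]
```

With n_steps = 1 this reproduces the X and y samples listed in the question.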

Any help will be greatly appreciated!!

My understanding is that 1) you want to output a list of different predictions (i.e. you are concerned that every prediction takes the same value), and 2) you want a list of floats rather than a list of lists containing floats.

Working backwards:

2. When fed several x samples, model.predict() outputs an array of lists (one row per sample). This is easily solved with ndarray.flatten(), thus in your case: prediction = model.predict(X).flatten()
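To illustrate with a small hand-made array (a minimal sketch using NumPy directly, standing in for the model's output):

```python
import numpy as np

# model.predict() returns one row per input sample, e.g. shape (n, 1)
predictions = np.array([[62.449333], [62.449333], [62.449333]])

flat = predictions.flatten()  # collapse to a 1-D array of plain floats
# flat has shape (3,) instead of (3, 1)
```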

1. I would first check that you are reshaping your input data correctly. LSTM models take 3-D input and thus expect a 3-D array of the form (samples, timesteps, features):

Samples is the total number of sequences in your dataset (i.e. observations).

Timesteps is the length of each sequence.

Features is the number of variables observed at each timestep.

You have therefore defined only one feature per timestep. As written, you are feeding X.shape[0] samples, each with X.shape[1] timesteps and only 1 feature/variable; is this your intention? It would be helpful to have more information on the data, i.e. the number of features and the desired window size for the sequences.
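The reshape from the question, spelled out on a toy array (assuming one feature per timestep, as in the question):

```python
import numpy as np

X = np.array([[64], [76], [64], [75]])        # shape (4, 1): 4 samples, 1 timestep each
X3d = X.reshape((X.shape[0], X.shape[1], 1))  # shape (4, 1, 1): add a feature axis
# 4 samples x 1 timestep x 1 feature, which is what an LSTM layer expects
```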

I think the structure of your model is fine, but the data needs some work. Your LSTM is set up to output only one value, which you can see from your last LSTM layer not having return_sequences=True. The fact that your y labels have multiple values must be confusing the model.

I think you should keep this behaviour, but edit your input/output data as follows:

If one sequence in your data is:

[64, 76, 64, 75, 64, 76, 64, 75, 64, 76, 64, 71, 64, 74, 64, 72, 69, 64, 45, 64, 52]

Then your training examples and labels should be:

x[0] = [64]
y[0] = [76]

x[1] = [64, 76]
y[1] = [64]

x[2] = [64, 76, 64]
y[2] = [75]

Every step of the sequence can be a separate training example, but each y label should only be one output.
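Building those growing-prefix pairs can be sketched like this (make_prefix_pairs is a hypothetical helper name, not from the original post):

```python
def make_prefix_pairs(sequence):
    """Each prefix of the sequence becomes an input; the next note is its label."""
    xs, ys = [], []
    for i in range(1, len(sequence)):
        xs.append(sequence[:i])
        ys.append(sequence[i])
    return xs, ys

seq = [64, 76, 64, 75]
xs, ys = make_prefix_pairs(seq)
# xs -> [[64], [64, 76], [64, 76, 64]]
# ys -> [76, 64, 75]
```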

Your linear output could work, but I think this may work better as a categorical problem with a softmax output, where the final Dense layer has one unit per possible note your model can output. You would also have to pad these sequences with 0 values so that all your x inputs are the same length, so the x values would actually be:

x[0] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 64]
x[1] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 64, 76]

etc. with the length of the array being your max sequence length.
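Left-padding to a fixed length is exactly what Keras' pad_sequences does with padding='pre'; the effect can be sketched in plain Python (pad_pre is an illustrative stand-in, not the Keras function):

```python
def pad_pre(seq, maxlen, value=0):
    """Left-pad (or left-truncate) a sequence to exactly maxlen entries,
    mimicking pad_sequences(..., padding='pre')."""
    if len(seq) >= maxlen:
        return seq[-maxlen:]       # keep only the most recent maxlen notes
    return [value] * (maxlen - len(seq)) + seq

# pad_pre([64], 5)     -> [0, 0, 0, 0, 64]
# pad_pre([64, 76], 5) -> [0, 0, 0, 64, 76]
```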

When it comes to predicting, use a loop: give the model a one-value input sequence, then append the predicted note to the input sequence and feed it back to the model again:

import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

seed_note = [64]  # initial note to give the model
next_notes = 10   # how many notes to predict

for _ in range(next_notes):
    # pad_sequences expects a list of sequences, hence the extra brackets
    token_list = pad_sequences([seed_note], maxlen=max_sequence_len, padding='pre')
    # take the class with the highest softmax score as the next note
    predicted = int(np.argmax(model.predict(token_list), axis=-1)[0])
    seed_note.append(predicted)

print(seed_note)
