
How to implement multiprocessing in this Python script?

I'm running this Python 3.5 script with Jupyter on my laptop, but the loop is very slow, so I started reading about how to speed up the code and found that I could use the multiprocessing library, but I don't know how to apply it to this script.

# Larger LSTM Network to Generate Text for Alice in Wonderland
import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils
# load ascii text and convert to lowercase
filename = "wonderland.txt"
raw_text = open(filename).read()
raw_text = raw_text.lower()
# create mapping of unique chars to integers
chars = sorted(list(set(raw_text)))
char_to_int = dict((c, i) for i, c in enumerate(chars))
# summarize the loaded data
n_chars = len(raw_text)
n_vocab = len(chars)
print ("Total Characters: ", n_chars)
print ("Total Vocab: ", n_vocab)
# prepare the dataset of input to output pairs encoded as integers
seq_length = 100
dataX = []
dataY = []
for i in range(0, n_chars - seq_length, 1):
    seq_in = raw_text[i:i + seq_length]
    seq_out = raw_text[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
n_patterns = len(dataX)
print ("Total Patterns: ", n_patterns)
# reshape X to be [samples, time steps, features]
X = numpy.reshape(dataX, (n_patterns, seq_length, 1))
# normalize
X = X / float(n_vocab)
# one hot encode the output variable
y = np_utils.to_categorical(dataY)
# define the LSTM model
model = Sequential()
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2]), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(256))
model.add(Dropout(0.2))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
# define the checkpoint
filepath="weights-improvement-{epoch:02d}-{loss:.4f}-bigger.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
# fit the model
model.fit(X, y, epochs=50, batch_size=64, callbacks=callbacks_list)

The script came from this tutorial.

The multiprocessing library helps in general for Python scripts, but in this case it is not very helpful, because most of the work happens inside Keras and its backend rather than in your Python loop. Ten minutes per epoch actually sounds reasonable for a neural network of this size (these things are costly to train!), especially if you're running it without a GPU.
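If you still want to parallelize the pure-Python part of the script, the loop that builds dataX and dataY is the only candidate, though it is rarely the bottleneck compared to training. Below is a minimal sketch using multiprocessing.Pool; it assumes a fork-based start method (the default on Linux/macOS for Python 3.5) so the worker processes can see the globals raw_text, seq_length and char_to_int from your script. The worker count and the helper name make_pattern are illustrative, not part of the original tutorial, and functions defined inside a Jupyter notebook may not pickle cleanly on platforms that use the spawn start method (e.g. Windows).

# Illustrative sketch: parallel version of the pattern-building loop.
from multiprocessing import Pool

def make_pattern(i):
    # Build one (input, output) pair from the shared globals
    # raw_text, seq_length and char_to_int defined earlier in the script.
    seq_in = raw_text[i:i + seq_length]
    seq_out = raw_text[i + seq_length]
    return [char_to_int[c] for c in seq_in], char_to_int[seq_out]

if __name__ == '__main__':
    # 4 worker processes is just a placeholder; tune it to your CPU.
    with Pool(processes=4) as pool:
        pairs = pool.map(make_pattern, range(n_chars - seq_length))
    dataX = [x for x, _ in pairs]
    dataY = [y for _, y in pairs]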

If you're using TensorFlow as the backend for Keras, all CPU cores should be used automatically when model.fit() executes. You can double-check this by watching your favorite CPU monitor (e.g. htop) while the code runs.
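If you want to control explicitly how many threads TensorFlow uses instead of relying on its defaults, you can configure the session that Keras uses before building the model. This is a minimal sketch assuming the TensorFlow 1.x backend that was current for Python 3.5 and Keras 2.x; the thread counts are placeholders to tune for your CPU:

# Illustrative: pin the number of TensorFlow threads for the Keras session.
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto(intra_op_parallelism_threads=4,   # threads used inside a single op
                        inter_op_parallelism_threads=2)   # ops that may run in parallel
K.set_session(tf.Session(config=config))
# ... then build and fit the model as before; fit() will use these thread settings.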
