Getting TypeError: Expected int32, got None of type 'NoneType' instead

I have implemented a sequence to sequence model with an attention layer. If I use 300000 data points I don't get any error, but if I use all of my data points I get the following error at model.fit:

TypeError: Expected int32, got None of type 'NoneType' instead.


What would be the reason for this?

The code before model.fit is:

import tensorflow as tf

class encoder_decoder(tf.keras.Model):
  def __init__(self,embedding_size,encoder_inputs_length,output_length,vocab_size,output_vocab_size,score_fun,units):
    super(encoder_decoder,self).__init__()
    self.vocab_size = vocab_size
    self.enc_units = units
    self.embedding_size = embedding_size
    self.encoder_inputs_length = encoder_inputs_length
    self.output_length = output_length
    self.lstm_output = 0
    self.state_h = 0
    self.state_c = 0
    self.output_vocab_size = output_vocab_size
    self.dec_units = units
    self.score_fun = score_fun
    self.att_units = units
    self.encoder=Encoder(self.vocab_size,self.embedding_size,self.enc_units,self.encoder_inputs_length)
    self.decoder = Decoder(self.output_vocab_size, self.embedding_size, self.output_length, self.dec_units ,self.score_fun ,self.att_units)
    # self.dense = Dense(self.output_vocab_size,activation = "softmax")
  
  def call(self,data):
    input,output = data[0],data[1]
    encoder_hidden = self.encoder.initialize_states(input.shape[0])
    encoder_output,encoder_hidden,encoder_cell = self.encoder(input,encoder_hidden)
    decoder_hidden = encoder_hidden
    decoder_cell = encoder_cell
    decoder_output = self.decoder(output,encoder_output,decoder_hidden,decoder_cell)
    return decoder_output

Inside the call function I'm initializing states for the encoder, where I get the number of rows from the input using the following line of code:

 encoder_hidden = self.encoder.initialize_states(input.shape[0])
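
The Encoder class itself is not shown in the question, but an initialize_states for an LSTM encoder typically builds zero state tensors from the batch size, along these lines (a hypothetical sketch, not the question's actual class):

class Encoder(tf.keras.Model):
  # __init__ with the Embedding and LSTM layers omitted for brevity
  def initialize_states(self, batch_size):
    # tf.zeros needs concrete integer dimensions; if batch_size is None
    # (a symbolic batch dimension), it raises
    # "TypeError: Expected int32, got None of type 'NoneType' instead."
    return [tf.zeros((batch_size, self.enc_units)),
            tf.zeros((batch_size, self.enc_units))]

If initialize_states looks like this, the None coming out of input.shape[0] would only surface once tf.zeros tries to build a tensor from it, which matches the error message.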

If I print input, I get the shape as (None, 55); that's the reason I'm getting this error. My total number of data points is 330614. When I use all my data I get this error, but when I use only 330000 data points I'm not getting this error. If I print the batch inside the call method, I get the shape as (64, 55).
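
For reference, input.shape is the static shape recorded when Keras traces call, so the batch dimension can be None even though every concrete batch has 64 rows. A common way around this is to read the dynamic shape with tf.shape, which is resolved per batch at run time (a sketch, assuming the rest of the model stays unchanged):

  def call(self, data):
    input, output = data[0], data[1]
    # tf.shape(input)[0] is a scalar tensor holding the runtime batch size,
    # so it is defined even when the static shape prints as (None, 55).
    batch_size = tf.shape(input)[0]
    encoder_hidden = self.encoder.initialize_states(batch_size)
    encoder_output, encoder_hidden, encoder_cell = self.encoder(input, encoder_hidden)
    decoder_hidden = encoder_hidden
    decoder_cell = encoder_cell
    return self.decoder(output, encoder_output, decoder_hidden, decoder_cell)

If the data is fed through tf.data, batching with drop_remainder=True is another option, since then every batch really has 64 rows and the static batch dimension stays fully defined.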

Please find below my code for creating the dataset for my sequence to sequence model.

The function to preprocess the data, the function to create the dataset, and a function to load the dataset:

import io
import re

from sklearn.model_selection import train_test_split

def preprocess_sentence(w):
  # w = unicode_to_ascii(w.lower().strip())
  w = re.sub(r"([?.!,¿])", r" \1 ", w)
  w = re.sub(r'[" "]+', " ", w)
  w = re.sub(r"[^a-zA-Z?.!,¿]+", " ", w)
  w = w.strip()
  w = '<start> ' + w + ' <end>'
  return w

def create_dataset(path, num_examples):
  lines = io.open(path, encoding='UTF-8').read().strip().split('\n')
  # lines1 = lines[330000:]
  # lines = lines[0:323386]+lines1

  word_pairs = [[preprocess_sentence(w) for w in l.split('\t')]  for l in lines[:num_examples]]
  word_pairs = [[i[0],i[1]] for i in word_pairs]
  return zip(*word_pairs)
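
As a quick illustration of what create_dataset returns, here is a minimal run on a two-line tab-separated file (sample.txt is a hypothetical file; the real dataset follows the same target<TAB>input layout that load_dataset assumes):

with io.open('sample.txt', 'w', encoding='UTF-8') as f:
  f.write('Go.\tVa !\nHi.\tSalut !')

targ, inp = create_dataset('sample.txt', None)
print(targ)  # ('<start> Go . <end>', '<start> Hi . <end>')
print(inp)   # ('<start> Va ! <end>', '<start> Salut ! <end>')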

def tokenize(lang):
  lang_tokenizer = tf.keras.preprocessing.text.Tokenizer(
      filters='')
  lang_tokenizer.fit_on_texts(lang)

  tensor = lang_tokenizer.texts_to_sequences(lang)

  tensor = tf.keras.preprocessing.sequence.pad_sequences(tensor,padding='post')
  return tensor, lang_tokenizer
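
tokenize pads every sequence to the length of the longest one in the corpus, which is where the fixed 55 and 53 in the tensor shapes below come from. A tiny check (made-up sentences):

tensor, tok = tokenize(['<start> hi there <end>', '<start> hi <end>'])
print(tensor.shape)  # (2, 4): both rows padded to the longest sequence
print(tensor[1])     # the shorter sentence ends in a padding zero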

def load_dataset(path, num_examples=None):
  # creating cleaned input, output pairs
  targ_lang, inp_lang = create_dataset(path, num_examples)

  input_tensor, inp_lang_tokenizer = tokenize(inp_lang)
  target_tensor, targ_lang_tokenizer = tokenize(targ_lang)

  return input_tensor, target_tensor, inp_lang_tokenizer, targ_lang_tokenizer,targ_lang,inp_lang

# Try experimenting with the size of that dataset
num_examples = None
input_tensor, target_tensor, inp_lang, targ_lang,targ_lang_text,inp_lang_text = load_dataset(path, num_examples)

# Calculate max_length of the target tensors
max_length_targ, max_length_inp = target_tensor.shape[1], input_tensor.shape[1]
max_length_targ,max_length_inp

input_tensor_train, input_tensor_val, target_tensor_train, target_tensor_val = train_test_split(input_tensor, target_tensor, test_size=0.2)

The shapes of the datasets are as follows:

shape of input train  (269291, 55)
shape of target train  (269291, 53)
shape of input test (67323, 55)
shape of target test (67323, 53)

You could share the code block that comes before model.fit.

The NoneType error indicates that a value passed to the model is None where an integer was expected. You can add print statements at earlier steps to understand where along the way the value became None.
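
For example, printing both the static and the dynamic shape inside call makes it obvious which one loses the batch size (a sketch using the names from the question's model):

  def call(self, data):
    input, output = data[0], data[1]
    # Static shape, fixed when Keras traces the function: may print (None, 55).
    print('static input shape:', input.shape)
    # Dynamic shape, evaluated for every batch: prints e.g. [64 55] or [43 55].
    tf.print('runtime input shape:', tf.shape(input))
    encoder_hidden = self.encoder.initialize_states(input.shape[0])
    encoder_output, encoder_hidden, encoder_cell = self.encoder(input, encoder_hidden)
    decoder_hidden = encoder_hidden
    decoder_cell = encoder_cell
    return self.decoder(output, encoder_output, decoder_hidden, decoder_cell)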

Compare that scenario to the case where you take all your data points, so that you can see where the array changes and how it is handled before being passed to model.fit.
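
One concrete difference between the two scenarios is divisibility by the batch size: an 80% split of 330000 examples is 264000 rows, which divides evenly into batches of 64, while the 269291 training rows from the full dataset leave a short final batch. Whether this is the trigger depends on how the data is batched, but it matches the observed behaviour:

print(264000 % 64)  # 0  -> every batch has exactly 64 rows
print(269291 % 64)  # 43 -> the last batch has only 43 rows

With a ragged final batch, Keras has to keep the batch dimension symbolic (None), and any code that feeds input.shape[0] into something like tf.zeros will then fail.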
