
TensorFlow Estimator accuracy and loss are zero

My model's accuracy and loss both evaluate to 0.
The global step should reach 1625, but it stops at 1.
Accuracy and loss should not both be 0, since those two values contradict each other.
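
To illustrate why the two zeros look contradictory, here is a minimal sketch in plain Python (not the code from this post; the probability vectors are made up for illustration). With sparse categorical cross-entropy, a loss near 0 means the true class received a probability near 1, which would normally make the prediction correct rather than wrong.

```python
import math


def sparse_categorical_crossentropy(probs, label):
    """Cross-entropy for one example: -log of the probability of the true class."""
    return -math.log(probs[label])


def is_correct(probs, label):
    """True when the highest-probability class equals the label."""
    return max(range(len(probs)), key=probs.__getitem__) == label


# Hypothetical 5-class predictions (illustrative values only).
confident_right = [0.001, 0.996, 0.001, 0.001, 0.001]
confident_wrong = [0.996, 0.001, 0.001, 0.001, 0.001]
label = 1

loss_right = sparse_categorical_crossentropy(confident_right, label)
loss_wrong = sparse_categorical_crossentropy(confident_wrong, label)

# A near-zero loss goes together with a correct prediction ...
print(loss_right, is_correct(confident_right, label))
# ... while an incorrect prediction carries a large loss.
print(loss_wrong, is_correct(confident_wrong, label))
```

Both metrics being exactly 0.0 therefore suggests that the evaluation may not have processed real data at all, rather than that the model scored this way.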

My input function, Keras estimator, and train_and_evaluate function are as follows:

import tensorflow as tf  # TF 1.x assumed, since tf.contrib is used below


def make_input_fn(addrs, labels, batch_size, mode):
    # Dataset of (file path, label) pairs.
    filename_dataset = tf.data.Dataset.from_tensor_slices((addrs, labels))

    # `process` is the asker's own loader (not shown in the post); it maps a
    # path and a label to a uint8 image and a label.
    dataset = filename_dataset.apply(
        tf.contrib.data.map_and_batch(
            lambda addrs, labels: tuple(
                tf.py_func(process, [addrs, labels], [tf.uint8, labels.dtype])),
            batch_size,
            num_parallel_batches=2,
            drop_remainder=False))

    if mode == tf.estimator.ModeKeys.TRAIN:
        num_epochs = None  # repeat indefinitely
        dataset = dataset.apply(
            tf.contrib.data.shuffle_and_repeat(buffer_size=10000))
    else:
        num_epochs = 1
        dataset = dataset.repeat(num_epochs)

    dataset = dataset.prefetch(buffer_size=batch_size)
    images, labels = dataset.make_one_shot_iterator().get_next()
    images.set_shape([None, 512, 512, 3])
    labels.set_shape([None, 1])
    return images, labels

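# Imports assumed; the original post does not show them.
from tensorflow.keras.applications.xception import Xception
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model


def keras_estimator(model_dir, config):
    # Xception base pretrained on ImageNet, without its classification head.
    base_model = Xception(weights='imagenet', include_top=False,
                          input_shape=(512, 512, 3), classes=5)
    x = base_model.output
    x = GlobalAveragePooling2D()(x)

    x = Dense(1024, activation='relu')(x)
    x = Dropout(0.2)(x)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.2)(x)

    # 5-way softmax output.
    predictions = Dense(5, activation='softmax')(x)

    model = Model(inputs=base_model.input, outputs=predictions)

    # Freeze the pretrained base so only the new head is trained.
    for layer in base_model.layers:
        layer.trainable = False
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
                  metrics=['acc'])

    estimator = tf.keras.estimator.model_to_estimator(keras_model=model,
                                                      model_dir=model_dir,
                                                      config=config)
    return estimator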

import pandas as pd
from random import shuffle


def train_and_evaluate(model_dir):
    t_batch_size = 512
    e_batch_size = 64
    num_epochs = 25

    # Build the image paths and labels from the CSV.
    df = pd.read_csv('/content/trainLabels.csv')
    addrs = ['/content/train/train/' + str(df.iloc[i]['image']) + '.jpeg'
             for i in range(len(df))]
    labels = df['level'].values.tolist()

    # Shuffle paths and labels together, then split 90% / 10%.
    c = list(zip(addrs, labels))
    shuffle(c)
    addrs1, labels1 = zip(*c)
    split = int(0.9 * len(addrs))
    train_addrs = list(addrs1[:split])
    train_labels = list(labels1[:split])
    val_addrs = list(addrs1[split:])
    val_labels = list(labels1[split:])

    run_config = tf.estimator.RunConfig(save_checkpoints_secs=300)

    estimator = keras_estimator(model_dir, run_config)

    # Full batches per epoch times the number of epochs.
    t_max_steps = (len(train_addrs) // t_batch_size) * num_epochs

    train_spec = tf.estimator.TrainSpec(
        input_fn=lambda: make_input_fn(train_addrs, train_labels, t_batch_size,
                                       mode=tf.estimator.ModeKeys.TRAIN),
        max_steps=t_max_steps)

    eval_spec = tf.estimator.EvalSpec(
        input_fn=lambda: make_input_fn(val_addrs, val_labels, e_batch_size,
                                       mode=tf.estimator.ModeKeys.EVAL),
        steps=None,
        start_delay_secs=10,
        throttle_secs=300)

    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
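
The expected step count follows from the `t_max_steps` expression in the code. Here is a small sketch of that arithmetic, assuming a hypothetical training-set size (the post does not state the real one); `expected_max_steps` is an illustrative helper name, not part of TensorFlow.

```python
def expected_max_steps(num_examples, batch_size, num_epochs):
    """Mirror of the t_max_steps expression: full batches per epoch times epochs."""
    return (num_examples // batch_size) * num_epochs


# Hypothetical size: any count from 33280 to 33791 gives 65 full batches of 512.
steps = expected_max_steps(num_examples=33280, batch_size=512, num_epochs=25)
print(steps)  # 1625
```

Comparing this number with the `global_step` reported in the log is a quick way to spot a run that stopped early.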

Here are the logs:

INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 300.
WARNING:tensorflow:From :9: map_and_batch (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.experimental.map_and_batch(...).
WARNING:tensorflow:From :12: shuffle_and_repeat (from tensorflow.contrib.data.python.ops.shuffle_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.experimental.shuffle_and_repeat(...).
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Warm-starting with WarmStartSettings: WarmStartSettings(ckpt_to_initialize_from='/content/training/keras/keras_model.ckpt', vars_to_warm_start='.*', var_name_to_vocab_info={}, var_name_to_prev_var_name={})
INFO:tensorflow:Warm-starting from: ('/content/training/keras/keras_model.ckpt',)
INFO:tensorflow:Warm-starting variable: dense/kernel; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense/bias; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_1/kernel; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_1/bias; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_2/kernel; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_2/bias; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/iterations; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/lr; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/beta_1; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/beta_2; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/decay; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_1; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_2; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_3; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_4; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_5; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_6; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_7; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_8; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_9; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_10; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_11; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_12; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_13; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_14; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_15; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_16; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_17; prev_var_name: Unchanged
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /content/training/model.ckpt.
INFO:tensorflow:Saving checkpoints for 1 into /content/training/model.ckpt.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-11-05-13:21:17
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /content/training/model.ckpt-1
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2018-11-05-13:22:08
INFO:tensorflow:Saving dict for global step 1: acc = 0.0, global_step = 1, loss = 0.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1: /content/training/model.ckpt-1
INFO:tensorflow:Loss for final step: None.

I had this issue before. It happened because I had specified the wrong directory for the datasets, so TensorFlow ultimately had no input data. I hope this helps.
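
One way to catch this before training is to verify that the image paths actually exist. The following is a minimal sketch using only the standard library; `find_missing_files` is a hypothetical helper, and the paths in the demonstration are placeholders rather than the real files from the question.

```python
import os
import tempfile


def find_missing_files(paths):
    """Return the paths that do not point to an existing regular file."""
    return [p for p in paths if not os.path.isfile(p)]


# Demonstration with a temporary directory: one real file, one bogus path.
with tempfile.TemporaryDirectory() as tmp_dir:
    real_path = os.path.join(tmp_dir, "example.jpeg")
    with open(real_path, "wb") as f:
        f.write(b"placeholder")
    bogus_path = os.path.join(tmp_dir, "does_not_exist.jpeg")

    missing = find_missing_files([real_path, bogus_path])
    missing_names = [os.path.basename(p) for p in missing]

print(missing_names)  # ['does_not_exist.jpeg']
```

In the question's code, a check like this could be run on `addrs` right after the list is built, stopping with an error if anything is missing.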
