
How can I use saver.save and tf.summary.FileWriter to write checkpoint files and event logs directly to HDFS?
I run this code:

import tensorflow as tf

hdfs_path = "hdfs://*"  # placeholder HDFS URI

W = tf.Variable([[1,2,3],[3,4,5]], dtype=tf.float32, name='weights')
b = tf.Variable([[1,2,3]], dtype=tf.float32, name='biases')
init = tf.global_variables_initializer()
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init)
    save_path = saver.save(sess, hdfs_path+"save_net.ckpt")
    print("Save to path: ", hdfs_path)

When I replace hdfs_path with a local path, it runs fine. But with an HDFS path I get:

File "test_hdfs.py", line 73, in <module>
    save_path = saver.save(sess, hdfs_path+"save_net.ckpt")
  File "/data/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1354, in save
    "Parent directory of {} doesn't exist, can't save.".format(save_path))

The same thing happens with tf.summary.FileWriter: the program hangs when I use hdfs_path, but runs fine with local_path.

My full code is:

import tensorflow as tf

hdfs_path = "hdfs://*"  # placeholder HDFS URI
local_path = "./"

with tf.Session(graph=tf.get_default_graph()) as sess:
    W = tf.Variable([[1,2,3],[3,4,5]], dtype=tf.float32, name='weights')
    b = tf.Variable([[1,2,3]], dtype=tf.float32, name='biases')
    init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
    saver = tf.train.Saver()
    sess.run(init)
    # graph=sess.graph replaces the deprecated graph_def= keyword
    summary_writer = tf.summary.FileWriter(hdfs_path, graph=sess.graph)
    saver.save(sess, save_path=hdfs_path+"save_net.ckpt")

When launching your TensorFlow program, the following environment variables must be set:

JAVA_HOME: The location of your Java installation.

HADOOP_HDFS_HOME: The location of your HDFS installation. You can also set this environment variable by running:

source ${HADOOP_HOME}/libexec/hadoop-config.sh

LD_LIBRARY_PATH: To include the path to libjvm.so, and optionally the path to libhdfs.so if your Hadoop distribution does not install libhdfs.so in $HADOOP_HDFS_HOME/lib/native. On Linux:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${JAVA_HOME}/jre/lib/amd64/server:$HADOOP_HDFS_HOME/lib/native
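
To catch a missing variable early, you can check the environment from Python before any hdfs:// path is touched. This is a minimal sketch; the variable names come from the steps above:

import os

# Verify the variables described above before TensorFlow touches hdfs:// paths.
# JAVA_HOME and HADOOP_HDFS_HOME must point at real installations, and
# LD_LIBRARY_PATH must let the loader find libjvm.so (and possibly libhdfs.so).
for var in ("JAVA_HOME", "HADOOP_HDFS_HOME", "LD_LIBRARY_PATH"):
    if not os.environ.get(var):
        raise RuntimeError("%s is not set; saving to hdfs:// will fail" % var)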

The Hadoop jars must be added prior to running your TensorFlow program. The CLASSPATH set by ${HADOOP_HOME}/libexec/hadoop-config.sh is insufficient. Globs must be expanded as described in the libhdfs documentation:

Then use

find /hadoop_home/ -name "*.jar" | awk '{ printf("export CLASSPATH=%s:$CLASSPATH\n", $0); }'

to print an export line for every Hadoop jar (/hadoop_home/ is your Hadoop installation directory). Run all of the printed export commands, then launch your script with python your_script.py.
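
Once everything is exported, you can confirm that TensorFlow actually reaches HDFS before running the full training script. The sketch below assumes the TensorFlow 1.x tf.gfile API, and the hdfs://* URI is the placeholder from the question; tf.gfile goes through the same filesystem layer as saver.save and tf.summary.FileWriter, so a failure here reproduces the error cheaply:

import tensorflow as tf

hdfs_path = "hdfs://*"  # placeholder; use your real namenode URI

# tf.gfile uses the same filesystem plugins as saver.save() and
# tf.summary.FileWriter, so this exercises the HDFS setup directly.
if not tf.gfile.Exists(hdfs_path):
    tf.gfile.MakeDirs(hdfs_path)  # create the parent directory the saver complains about
print(tf.gfile.ListDirectory(hdfs_path))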
