Tensorflow crashes when using sess.run()

I'm using TensorFlow 0.8.0 with Python 2.7. My IDE is PyCharm and my OS is Linux Ubuntu 14.04.

I'm noticing that the following code causes my computer to freeze and/or crash:

# you will need these files!
# https://www.kaggle.com/c/digit-recognizer/download/train.csv
# https://www.kaggle.com/c/digit-recognizer/download/test.csv

import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib.cm as cm

# read in the image data from the csv file
# the format is:    label  pixel0  pixel1 ... pixel783  (there are 42,000 rows like this)
data = pd.read_csv('../train.csv')
labels = data.iloc[:,:1].values.ravel()  # shape = (42000,)
labels_count = np.unique(labels).shape[0]  # = 10
images = data.iloc[:,1:].values   # shape = (42000, 784)
images = images.astype(np.float64)
image_size = images.shape[1]
image_width = image_height = np.sqrt(image_size).astype(np.int32)  # since these images are square... height = width


# turn all the gray-pixel image-values into percentages of 255
# a 1.0 means a pixel is 100% black, and 0.0 would be a pixel that is 0% black (or white)
images = np.multiply(images, 1.0/255)


# create oneHot vectors from the label #s
oneHots = tf.one_hot(labels, labels_count, 1, 0)  #shape = (42000, 10)


#split up the training data even more (into validation and train subsets)
VALIDATION_SIZE = 3167

validationImages = images[:VALIDATION_SIZE]
validationLabels = labels[:VALIDATION_SIZE]

trainImages = images[VALIDATION_SIZE:]
trainLabels = labels[VALIDATION_SIZE:]






# -------------  Building the NN -----------------

# set up our weights (or kernels?) and biases for each pixel
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(.1, shape=shape, dtype=tf.float32)
    return tf.Variable(initial)


# convolution
def conv2d(x, W):
    return tf.nn.conv2d(x, W, [1,1,1,1], 'SAME')

# pooling
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


# placeholder variables
# images
x = tf.placeholder('float', shape=[None, image_size])
# labels
y_ = tf.placeholder('float', shape=[None, labels_count])



# first convolutional layer
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

# turn shape(40000,784)  into   (40000,28,28,1)
image = tf.reshape(trainImages, [-1, image_width, image_height, 1])
image = tf.cast(image, tf.float32)
# print (image.get_shape()) # =>(40000,28,28,1)




h_conv1 = tf.nn.relu(conv2d(image, W_conv1) + b_conv1)
# print (h_conv1.get_shape()) # => (40000, 28, 28, 32)
h_pool1 = max_pool_2x2(h_conv1)
# print (h_pool1.get_shape()) # => (40000, 14, 14, 32)





# second convolutional layer
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
#print (h_conv2.get_shape()) # => (40000, 14,14, 64)
h_pool2 = max_pool_2x2(h_conv2)
#print (h_pool2.get_shape()) # => (40000, 7, 7, 64)




# densely connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

# (40000, 7, 7, 64) => (40000, 3136)
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])

h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
#print (h_fc1.get_shape()) # => (40000, 1024)





# dropout
keep_prob = tf.placeholder('float')
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
print h_fc1_drop.get_shape()


#readout layer for deep neural net
W_fc2 = weight_variable([1024,labels_count])
b_fc2 = bias_variable([labels_count])
print b_fc2.get_shape()
mull= tf.matmul(h_fc1_drop, W_fc2)
print mull.get_shape()
print
mull2 = mull + b_fc2
print mull2.get_shape()

y = tf.nn.softmax(mull2)





sess = tf.Session()
sess.run(tf.initialize_all_variables())

print sess.run(mull[0,2])

The last line causes the crash:

print sess.run(mull[0,2])

This is basically one location in a very big 2D array. Something about the sess.run is causing it. I'm also getting a script-issue popup... some sort of Google script (I think maybe it's TensorFlow?). I can't copy the link because my computer is completely frozen.

I suspect the problem arises because mull[0, 2], despite its small apparent size, depends on a very large computation, including multiple convolutions, max-poolings, and a large matrix multiplication; therefore either your computer becomes fully loaded for a long period of time, or it runs out of memory. (You should be able to tell which by running top and checking what resources are used by the python process in which you are running TensorFlow.)
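As a rough sense of scale, using the shapes from your own print statements: materializing h_conv1 alone for 40000 images means roughly 40000 * 28 * 28 * 32 ≈ 10^9 float32 values, or about 4 GB, before counting any of the other intermediate tensors.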

The amount of computation is so large because your TensorFlow graph is defined in terms of the entire training dataset, trainImages, which contains 40000 images:

image = tf.reshape(trainImages, [-1, image_width, image_height, 1])
image = tf.cast(image, tf.float32)

Instead, it would be more efficient to define your network in terms of a tf.placeholder() to which you can feed individual training examples, or mini-batches of examples. See the documentation on feeding for more information. In particular, since you are only interested in the 0th row of mull, you only need to feed the 0th example from trainImages and perform the computation on it to produce the necessary values. (In your current program, the results for all other examples are also being computed, and then discarded in the final slice operator.)
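For example, here is a minimal sketch of that change. It reuses the x and keep_prob placeholders your code already defines; the layers in between are built exactly as before, just starting from the placeholder instead of trainImages (which also makes the tf.cast unnecessary, since x is already float32):

# build the graph from the x placeholder rather than the whole training set
image = tf.reshape(x, [-1, image_width, image_height, 1])

# ... conv1/pool1/conv2/pool2/fc1/dropout defined exactly as before,
# ending with mull = tf.matmul(h_fc1_drop, W_fc2) ...

sess = tf.Session()
sess.run(tf.initialize_all_variables())

# feed only the 0th training example; keep_prob=1.0 disables dropout here
print sess.run(mull[0, 2], feed_dict={x: trainImages[0:1], keep_prob: 1.0})

This way the convolutions run over a single 1 x 784 example instead of all 40000 at once.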

Setting the session as default, and initializing your variables before running the session, may solve your problem.

import tensorflow as tf

sess = tf.Session()
g = tf.ones([25088])

with sess.as_default():
    tf.initialize_all_variables().run()
    results = sess.run(g)

    print results
