TensorFlow: Reading and using data from CSV file
I tried the code provided by TensorFlow here, but I am unable to adapt it so that I can grab the data and place it in the train_X and train_Y variables.
I am currently using hard-coded data for train_X and train_Y.
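My CSV file contains 2 columns, Height and State of Charge (SoC), where Height is a float value and SoC is an integer that starts at 0 and goes up in increments of 10 to a maximum of 100.
I want to grab the data from the columns and use it in a linear regression model, where Height is the Y value and SoC is the X value.
Here is my code: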
import tensorflow as tf

filename_queue = tf.train.string_input_producer("battdata.csv")
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1], [1]]
col1, col2 = tf.decode_csv(
    value, record_defaults=record_defaults)
features = tf.stack([col1, col2])
with tf.Session() as sess:
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(1200):
        # Retrieve a single instance:
        example, label = sess.run([features, col2])
    coord.request_stop()
    coord.join(threads)
I want to change this model so that it uses the CSV data:
import numpy
import tensorflow as tf
import matplotlib.pyplot as plt

rng = numpy.random

# Parameters
learning_rate = 0.01
training_epochs = 1000
display_step = 50
# Training Data
train_X = numpy.asarray([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
                         7.042,10.791,5.313,7.997,5.654,9.27,3.1])
train_Y = numpy.asarray([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
                         2.827,3.465,1.65,2.904,2.42,2.94,1.3])
n_samples = train_X.shape[0]
# tf Graph Input
X = tf.placeholder("float")  # Charge (SoC)
Y = tf.placeholder("float")  # Height
# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
# Construct a linear model
pred = tf.add(tf.multiply(X, W), b)  # XW + b <- y = mx + b where W is gradient, b is intercept
# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
# Initializing the variables
init = tf.global_variables_initializer()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c),
                  "W=", sess.run(W), "b=", sess.run(b))
    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')
    # Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
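Edit:
I also tried the solution provided by Nicolas, but I encountered an error:
ValueError: Shape () must have rank at least 1
I solved this by adding square brackets around the filename, like so: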
filename_queue = tf.train.string_input_producer(['battdata.csv'])
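A plausible reading of the error, sketched with NumPy rather than TensorFlow: a bare string is a scalar with shape (), i.e. rank 0, whereas a one-element list has shape (1,), i.e. rank 1, and string_input_producer expects a 1-D collection of filenames. The snippet below only illustrates the two shapes; it does not call TensorFlow, and the variable names are made up for the illustration.

```python
import numpy as np

bare = np.asarray("battdata.csv")      # a single string: scalar, rank 0
listed = np.asarray(["battdata.csv"])  # a list holding one string: rank 1

print(bare.shape, bare.ndim)      # () 0
print(listed.shape, listed.ndim)  # (1,) 1
```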
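All you need to do is replace the placeholder tensors with the ops returned by the decode_csv method. That way, whenever you run the optimiser, the TensorFlow graph will request a new row from the file through the various tensor dependencies:
optimiser => cost => pred => X
cost => Y
This would give something like: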
# rng, learning_rate, training_epochs and n_samples are defined as in the question.
# string_input_producer expects a list (1-D tensor) of filenames, not a bare string.
filename_queue = tf.train.string_input_producer(["battdata.csv"])
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
# Default values, in case of empty columns. Also specifies the type of the
# decoded result. Both columns are decoded as floats so that pred - Y
# operates on matching dtypes.
record_defaults = [[1.], [1.]]
# Swap X and Y here if the first CSV column is Height (the Y value).
X, Y = tf.decode_csv(
    value, record_defaults=record_defaults)
# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
# Construct a linear model
pred = tf.add(tf.multiply(X, W), b)  # XW + b <- y = mx + b where W is gradient, b is intercept
# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
# Initializing the variables
init = tf.global_variables_initializer()
with tf.Session() as sess:
    # Initialize W and b before training.
    sess.run(init)
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    # Fit all training data; each run reads one new row from the file.
    for epoch in range(training_epochs):
        _, cost_value = sess.run([optimizer, cost])
    [...] # The rest of your code
    coord.request_stop()
    coord.join(threads)
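If the file is small enough to fit in memory, an alternative to the queue-based reader above is to read it once in plain Python and keep the placeholder-based model from the question unchanged. The following is a minimal sketch using only the standard library. The function name load_battery_csv, the sample file and its values are hypothetical, and the column order (Height first, SoC second, no header row) is an assumption about the file.

```python
import csv
import os
import tempfile


def load_battery_csv(path):
    """Read two-column rows into train_X (SoC) and train_Y (Height)."""
    train_X, train_Y = [], []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:        # skip blank lines
                continue
            height, soc = row  # assumed column order: Height, then SoC
            train_X.append(float(soc))
            train_Y.append(float(height))
    return train_X, train_Y


# Made-up sample rows, written only so that the sketch is runnable.
sample_path = os.path.join(tempfile.mkdtemp(), "battdata_sample.csv")
with open(sample_path, "w", newline="") as f:
    f.write("1.5,0\n2.25,10\n3.0,20\n")

train_X, train_Y = load_battery_csv(sample_path)
print(train_X)  # [0.0, 10.0, 20.0]
print(train_Y)  # [1.5, 2.25, 3.0]
```

The two lists can then be wrapped with numpy.asarray and used in place of the hard-coded train_X and train_Y arrays in the question's model.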
I had the same problem and solved it like this:
tf.train.string_input_producer(tf.train.match_filenames_once("medal.csv"))
I found this here: TensorFlow From CSV to API
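tf.train.match_filenames_once takes a glob pattern and produces the list of matching filenames, so its result is already one-dimensional, which is presumably why it avoids the rank error. The same idea can be sketched in plain Python with the standard glob module as a stand-in; the temporary directory and the empty file are created only for the illustration.

```python
import glob
import os
import tempfile

# Create an empty stand-in file so that the pattern has something to match.
workdir = tempfile.mkdtemp()
open(os.path.join(workdir, "medal.csv"), "w").close()

# A pattern always yields a list of paths, even when only one file matches.
matches = glob.glob(os.path.join(workdir, "medal*.csv"))
print(len(matches), os.path.basename(matches[0]))  # 1 medal.csv
```

In TensorFlow 1.x, match_filenames_once stores its result in a local variable, so the session would likely also need to run tf.local_variables_initializer() before the queue runners start.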