Keras/Tensorflow training on GCP with TPU

I am trying to train a model on GCP using Keras and TensorFlow 1.15. So far, my code is similar to what I would run on Colab, that is:

# TPUs
import tensorflow as tf
print(tf.__version__)
cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver("tpu-name")
tf.config.experimental_connect_to_cluster(cluster_resolver)
tf.tpu.experimental.initialize_tpu_system(cluster_resolver)
tpu_strategy = tf.distribute.experimental.TPUStrategy(cluster_resolver)
print("Number of accelerators: ", tpu_strategy.num_replicas_in_sync)


import numpy as np

np.random.seed(123)  # for reproducibility
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Convolution2D, MaxPooling2D, Input
from tensorflow.keras import utils
from tensorflow.keras.datasets import mnist, cifar10
from tensorflow.keras.models import Model

# 4. Load data into train and test sets
(X_train, y_train) = load_data(sets="gs://BUCKETS/dogscats/train/",target_size=img_size)
(X_test, y_test) =  load_data(sets="gs://BUCKETS/dogscats/valid/",target_size=img_size)
print(X_train.shape, X_test.shape)

# 5. Preprocess input data
#X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
#X_test = X_test.reshape(X_test.shape[0], 28, 28,1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

print(y_train.shape, y_test.shape)
# 6. Preprocess class labels One hot encoding
Y_train = utils.to_categorical(y_train, 2)
Y_test = utils.to_categorical(y_test, 2)
print(Y_train.shape, Y_test.shape)

with tpu_strategy.scope():
  model = make_model((img_size, img_size, 3))
  # 8. Compile model
  model.compile(loss='categorical_crossentropy',
                optimizer="sgd",
                metrics=['accuracy'])

model.summary()

batch_size = 1250 * tpu_strategy.num_replicas_in_sync
# 9. Fit model on training data
model.fit(X_train, Y_train, steps_per_epoch=len(X_train)//batch_size,  
            epochs=5, verbose=1)

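But my data is in a bucket and my code runs on a VM. I tried loading my data with "gs://BUCKETS", but it does not work. What should I do? Edit: I have added my code for loading the data; sorry, I forgot it.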

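# Imports assumed for this snippet (they were not shown in the original post)
from os import listdir
from numpy import asarray
from sklearn.utils import shuffle
from tensorflow.keras.preprocessing.image import load_img, img_to_array

def load_data(sets="dogcats/train/", k = 5000, target_size=250):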
  # define location of dataset
  folder = sets
  photos, labels = list(), list()
  # determine class
  output = 0.0
  for i, dog in enumerate(listdir(folder + "dogs/")):
    if i >= k:
      break
    # load image
    photo = load_img(folder + "dogs/" +dog, target_size=(target_size, target_size))
    # convert to numpy array
    photo = img_to_array(photo)
    # store
    photos.append(photo)
    labels.append(output)

  output = 1.0

  for i, cat in enumerate(listdir(folder + "cats/") ):
    if i >= k:
      break
    # load image
    photo = load_img(folder + "cats/"+cat, target_size=(target_size, target_size))
    # convert to numpy array
    photo = img_to_array(photo)
    # store
    photos.append(photo)
    labels.append(output)

  # convert to a numpy arrays
  photos = asarray(photos)
  labels = asarray(labels)
  print(photos.shape, labels.shape)
  photos, labels = shuffle(photos, labels, random_state=0)
  return photos, labels
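
A likely reason this function fails on a bucket path, assuming `listdir` here is `os.listdir`: it only sees the local file system, so "gs://..." is treated as a local directory name. One alternative, which is not part of the original post, is `tf.io.gfile.listdir`, which TensorFlow 1.15 provides and which understands gs:// paths. The sketch below is minimal; the helper name `list_class_files` and its injectable `listdir` parameter are hypothetical, added only so the logic can be tried without a bucket.

```python
def list_class_files(folder, class_dirs=("dogs", "cats"), k=5000, listdir=None):
    """Return (path, label) pairs for up to k files per class directory.

    `listdir` defaults to tf.io.gfile.listdir, which accepts local paths
    and gs:// paths; any function with the same contract can be passed in.
    """
    if listdir is None:
        # Imported lazily so the helper can be exercised without TensorFlow.
        import tensorflow as tf
        listdir = tf.io.gfile.listdir

    pairs = []
    for label, class_dir in enumerate(class_dirs):
        class_folder = folder.rstrip("/") + "/" + class_dir + "/"
        # Sorted so that the first k files are the same on every run.
        names = sorted(listdir(class_folder))[:k]
        for name in names:
            pairs.append((class_folder + name, float(label)))
    return pairs
```

Each returned path could then be opened with `tf.io.gfile.GFile(path, "rb")` and the bytes decoded with PIL, instead of calling `load_img` on the path directly.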

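EDIT2: Completing @daudnadeem's answer, in case anyone else is in the same situation.

My goal was to fetch images from the bucket. The code in the answer works well and returns a bytes object. To turn it into an image, you just need the PIL library: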

from PIL import Image
from io import BytesIO
import numpy as np

from google.cloud import storage
client = storage.Client()
bucket = client.get_bucket("BUCKETS")
blob = bucket.get_blob('dogscats/train/<you-will-need-to-point-to-a-file-and-not-a-directory>')
data = blob.download_as_string()

img = Image.open(BytesIO(data))
img = np.array(img)

(X_train, y_train) = load_data(sets="gs://BUCKETS/dogscats/train/",target_size=img_size)
(X_test, y_test) =  load_data(sets="gs://BUCKETS/dogscats/valid/",target_size=img_size)

This obviously won't work, because all you have done is pass in a string. What you need to do is download this data as a string and then use it.
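
To make that concrete: the storage client used further down does not take a gs:// URL as a single argument; it takes a bucket name and an object name (or prefix) separately. Here is a small sketch of splitting the URL. The function name `split_gcs_uri` is hypothetical and not part of any library.

```python
def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into ('bucket', 'path/to/object')."""
    scheme = "gs://"
    if not uri.startswith(scheme):
        raise ValueError("expected a gs:// URI, got: " + repr(uri))
    # Everything up to the first slash is the bucket, the rest is the object path.
    bucket_name, _, object_path = uri[len(scheme):].partition("/")
    if not bucket_name:
        raise ValueError("missing bucket name in: " + repr(uri))
    return bucket_name, object_path


bucket_name, prefix = split_gcs_uri("gs://BUCKETS/dogscats/train/")
print(bucket_name, prefix)  # BUCKETS dogscats/train/
```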

First install the package: pip install google-cloud-storage or pip3 install google-cloud-storage

pip -> Python

pip3 -> Python3

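You will need a service account to interact with GCP from your code; it is used for authentication.

Once you have the service account key as a JSON file, you need to do one of two things:

Set it as an environment variable: export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/[FILE_NAME].json"

Or, the approach I prefer: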

gcloud auth activate-service-account \
  <replace-with-email-from-json-file> \
  --key-file=<path/to/your/json/file> --project=<name-of-your-gcp-project>
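
Both options depend on the JSON key file, and the gcloud command needs the email stored inside it. As a quick sanity check, here is a minimal sketch that reads the `client_email` field from the key file, falling back to `GOOGLE_APPLICATION_CREDENTIALS` when no path is given. The function name `read_service_account_email` is hypothetical; the `type` and `client_email` fields are the ones found in standard service account key files.

```python
import json
import os


def read_service_account_email(key_path=None):
    """Return the client_email stored in a service account JSON key file.

    If key_path is omitted, fall back to GOOGLE_APPLICATION_CREDENTIALS.
    """
    key_path = key_path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    with open(key_path, "r", encoding="utf-8") as fh:
        info = json.load(fh)
    if info.get("type") != "service_account":
        raise ValueError("not a service account key file: " + key_path)
    return info["client_email"]
```

If this raises, the storage client would most likely fail to authenticate as well, so it is cheaper to find out here than in the middle of a training run.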

Now let's see how to download a file as a string using the google-cloud-storage library:

from google.cloud import storage
client = storage.Client()
bucket = client.get_bucket("BUCKETS")
blob = bucket.get_blob('dogscats/train/<you-will-need-to-point-to-a-file-and-not-a-directory>')  # object names are relative to the bucket: no leading slash
data = blob.download_as_string()

Now that you have the data as a string, you can simply pass data to load_data like this: (X_train, y_train) = load_data(sets=data,target_size=img_size)
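
One thing worth guarding against: `bucket.get_blob` returns `None` when no object has that name, so calling `download_as_string()` on the result would fail with an `AttributeError`. Below is a minimal sketch of a wrapper. The name `fetch_blob_bytes` is hypothetical, `bucket` is assumed to be the object returned by `client.get_bucket` above, and stripping the leading slash assumes the objects were not uploaded with a literal "/" at the start of their names.

```python
def fetch_blob_bytes(bucket, blob_name):
    """Download one object from a bucket and return its content as bytes."""
    # Object names are relative to the bucket, so drop any leading slash.
    blob = bucket.get_blob(blob_name.lstrip("/"))
    if blob is None:
        raise FileNotFoundError("no object named " + repr(blob_name) + " in bucket")
    return blob.download_as_string()
```

Usage would look like `data = fetch_blob_bytes(bucket, "dogscats/train/<file>")`, with the file name left for you to fill in.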

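This sounds complicated, but here is a quick pseudo-layout:

  1. Install google-cloud-storage
  2. Go to Google Cloud Platform Console -> IAM & Admin -> Service Accounts
  3. Create a service account with the relevant permissions (google-cloud-storage)
  4. Download the (JSON) key file and remember where you saved it.
  5. Activate the service account
  6. Download the file as a string and pass that string to your load_data(data)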
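
Step 6 handles one file at a time, while the question's load_data walks a whole folder. A rough sketch of how the two could be combined, assuming the bucket keeps the dogs/ and cats/ layout from the question: `client.list_blobs` lists objects under a prefix, and each one is downloaded and decoded with PIL. The names `iter_labeled_images` and `to_array` are hypothetical, and the resize step stands in for `load_img(..., target_size=...)`.

```python
from io import BytesIO


def iter_labeled_images(client, bucket_name, prefix, class_dirs=("dogs", "cats"), k=5000):
    """Yield (image_bytes, label) for up to k objects per class directory."""
    for label, class_dir in enumerate(class_dirs):
        count = 0
        for blob in client.list_blobs(bucket_name, prefix=prefix + class_dir + "/"):
            if blob.name.endswith("/"):
                continue  # skip "folder" placeholder objects
            if count >= k:
                break
            yield blob.download_as_string(), float(label)
            count += 1


def to_array(image_bytes, target_size):
    """Decode image bytes into a (target_size, target_size, 3) float32 array."""
    # Imported lazily so the listing helper works without these packages.
    import numpy as np
    from PIL import Image

    img = Image.open(BytesIO(image_bytes)).convert("RGB")
    img = img.resize((target_size, target_size))
    return np.asarray(img, dtype="float32")
```

With `client = storage.Client()`, the pairs from `iter_labeled_images(client, "BUCKETS", "dogscats/train/")` could be collected into the `photos` and `labels` arrays that load_data returns. Note that this keeps every image in memory, just as the original function does.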

Hope that helps!
