
How to use large volumes of data in Kubeflow?

I have 1 TB of images stored in GCS (the data is split into 3 classes). I want to train a custom TensorFlow model on this data in Kubeflow. Currently, I have pipeline components for training and persisting the model, but I don't know how to correctly feed this data into the classifier.

It seems to me that downloading this data from GCS (with gsutil cp or something similar) every time I run the pipeline (which may also fail) is not the proper way to do this.

How can I use large volumes of data in Kubeflow pipelines without downloading it on every run? How do I express access to this data using the Kubeflow DSL?

If your data is in GCS, TensorFlow can read data from (and write to) GCS directly. The tf.data API lets you set up a performant input pipeline that streams the images during training instead of copying them first.
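For example, here is a minimal tf.data sketch that streams JPEGs straight from a gs:// path; the bucket name, directory layout (one directory per class), and image size are placeholders, not details from the original question:

```python
import tensorflow as tf

# tf.data reads gs:// paths directly, so the 1 TB of images never has to be
# copied into the pipeline pod. The pattern below is a hypothetical layout:
# gs://<bucket>/images/<class_name>/<file>.jpg
GCS_PATTERN = "gs://my-bucket/images/*/*.jpg"  # placeholder path

def parse_image(path):
    # Use the parent directory name as the (string) class label;
    # map it to an integer index before feeding it to a real model.
    label = tf.strings.split(path, "/")[-2]
    image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, label

dataset = (
    tf.data.Dataset.list_files(GCS_PATTERN, shuffle=True)
    .map(parse_image, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)
```

The training component can then iterate over this dataset directly, and only the batches currently needed are pulled from GCS.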

Can you mount the volume on the host machine?

If yes, mount the volume on the host and then expose that directory to the containers as a hostPath volume. The images are then already present on the node, so whenever a new container starts it can simply mount the volume and begin processing, avoiding a data transfer on each container startup. A sketch of how this can be expressed in the Kubeflow DSL follows below.
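A minimal sketch, assuming the Kubeflow Pipelines v1 SDK and that the images have already been synced to a node-local directory; the paths, container image, and pipeline name are placeholders:

```python
import kfp.dsl as dsl
from kubernetes import client as k8s

@dsl.pipeline(name="train-on-mounted-images")
def train_pipeline():
    train_op = dsl.ContainerOp(
        name="train",
        image="gcr.io/my-project/trainer:latest",  # hypothetical trainer image
        arguments=["--data-dir", "/data"],
    )
    # Declare a hostPath volume pointing at the node-local image directory
    # (assumed to be /mnt/images) and mount it into the container at /data.
    train_op.add_volume(
        k8s.V1Volume(
            name="image-data",
            host_path=k8s.V1HostPathVolumeSource(path="/mnt/images"),
        )
    )
    train_op.container.add_volume_mount(
        k8s.V1VolumeMount(name="image-data", mount_path="/data")
    )
```

Note that hostPath ties the workload to nodes that actually have the data; a PersistentVolumeClaim backed by a shared filesystem is the more portable variant of the same idea.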
