Because very little traffic is expected, I need a Dataflow job with minimal resources: 1 vCPU, 1 GB of memory, and 30 GB of storage (Standard Persistent Disk).
How can one create such a Dataflow job? What I have so far is the following:
DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
options.setProject("project-id");
options.setRunner(DataflowRunner.class);
// Begin: disable autoscaling
options.setAutoscalingAlgorithm(DataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmType.NONE);
options.setNumWorkers(1);
// End: disable autoscaling
options.setStreaming(true);
options.setAppName("");
options.setMaxNumWorkers(1);
Where can one specify resources such as vCPU, memory, and storage (Standard Persistent Disk) in the Dataflow options?
I'm new to GCP, so any criticism is welcome.
From the Javadocs:
setDiskSizeGb
Remote worker disk size, in gigabytes, or 0 to use the default size.
setWorkerMachineType
Machine type to create Dataflow worker VMs as. See GCE machine types for a list of valid options. If unset, the Dataflow service will choose a reasonable default.
The allowed machine types are listed in the GCE machine types documentation. For your needs ("1 vCPU, 1 GB memory"), the closest match is n1-standard-1. Note that it comes with more memory than you asked for.
So, if you invoke the following methods on DataflowPipelineOptions:
options.setDiskSizeGb(30);
options.setWorkerMachineType("n1-standard-1");
then your Dataflow workers will run on VMs with 1 vCPU and 3.75 GB of memory, and each worker will use a 30 GB disk.
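If you would rather not hard-code these values, the same settings can usually be passed as command-line flags, because Beam maps each setter such as setDiskSizeGb to a flag such as --diskSizeGb. Below is a minimal sketch that only assembles those flags. The class name MinimalDataflowFlags and the method build are hypothetical names chosen for illustration, and the sketch assumes you would later hand the array to PipelineOptionsFactory.fromArgs(...) in a project that has the Beam Dataflow runner on its classpath, which is not done here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Beam): collects the flags discussed above.
public class MinimalDataflowFlags {

    // projectId is a placeholder supplied by the caller.
    static String[] build(String projectId) {
        List<String> flags = new ArrayList<>();
        flags.add("--project=" + projectId);
        flags.add("--runner=DataflowRunner");
        flags.add("--streaming=true");
        // Autoscaling off, pinned to a single worker.
        flags.add("--autoscalingAlgorithm=NONE");
        flags.add("--numWorkers=1");
        flags.add("--maxNumWorkers=1");
        // Worker resources: 30 GB disk, 1 vCPU / 3.75 GB machine type.
        flags.add("--diskSizeGb=30");
        flags.add("--workerMachineType=n1-standard-1");
        return flags.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Print one flag per line so they can be inspected or copied.
        for (String flag : build("project-id")) {
            System.out.println(flag);
        }
    }
}
```

In a real pipeline the array would typically go to PipelineOptionsFactory.fromArgs(flags).as(DataflowPipelineOptions.class), which lets you change the machine type or disk size per deployment without recompiling.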