Stream file from Google Cloud Storage
Here is the code I use to download a file from Google Cloud Storage:
@Override
public void write(OutputStream outputStream) throws IOException {
try {
LOG.info(path);
InputStream stream = new ByteArrayInputStream(GoogleJsonKey.JSON_KEY.getBytes(StandardCharsets.UTF_8));
StorageOptions options = StorageOptions.newBuilder()
.setProjectId(PROJECT_ID)
.setCredentials(GoogleCredentials.fromStream(stream)).build();
Storage storage = options.getService();
final CountingOutputStream countingOutputStream = new CountingOutputStream(outputStream);
byte[] read = storage.readAllBytes(BlobId.of(BUCKET, path));
countingOutputStream.write(read);
} catch (Exception e) {
e.printStackTrace();
} finally {
outputStream.close();
}
}
This works, but the problem is that it has to buffer all the bytes before it streams anything back to the client of this method. This causes long delays, especially when the file stored in GCS is big.
Is there a way to get the file from GCS and stream it directly to the OutputStream? The OutputStream here, by the way, is for a Servlet.
Just to clarify, do you need an OutputStream or an InputStream? One way to look at this is that the data is stored in a Google Cloud Storage object as a file, and you have an InputStream to read that file. If that works, read on.
There is no existing method in the Storage API which provides an InputStream or an OutputStream. But there are two methods in the Cloud Storage client library which expose a ReadChannel object, which extends ReadableByteChannel (from the Java NIO API).
ReadChannel reader(String bucket, String blob, BlobSourceOption... options);
ReadChannel reader(BlobId blob, BlobSourceOption... options);
A simple example using this (taken from StorageSnippets.java):
/**
* Example of reading a blob's content through a reader.
*/
// [TARGET reader(String, String, BlobSourceOption...)]
// [VARIABLE "my_unique_bucket"]
// [VARIABLE "my_blob_name"]
public void readerFromStrings(String bucketName, String blobName) throws IOException {
// [START readerFromStrings]
try (ReadChannel reader = storage.reader(bucketName, blobName)) {
ByteBuffer bytes = ByteBuffer.allocate(64 * 1024);
while (reader.read(bytes) > 0) {
bytes.flip();
// do something with bytes
bytes.clear();
}
}
// [END readerFromStrings]
}
You can also use the Channels.newInputStream() method to wrap an InputStream over the ReadableByteChannel.
public static InputStream newInputStream(ReadableByteChannel ch)
Even if you need an OutputStream, you should be able to copy data from the InputStream, or better, from the ReadChannel object, into the OutputStream.
Run this example as: PROGRAM_NAME <BUCKET_NAME> <BLOB_PATH>
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import com.google.cloud.ReadChannel;
import com.google.cloud.storage.Bucket;
import com.google.cloud.storage.BucketInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
/**
* An example which reads the contents of the specified object/blob from GCS
* and prints the contents to STDOUT.
*
* Run it as PROGRAM_NAME <BUCKET_NAME> <BLOB_PATH>
*/
public class ReadObjectSample {
private static final int BUFFER_SIZE = 64 * 1024;
public static void main(String[] args) throws IOException {
// Instantiates a Storage client
Storage storage = StorageOptions.getDefaultInstance().getService();
// The name for the GCS bucket
String bucketName = args[0];
// The path of the blob (i.e. GCS object) within the GCS bucket.
String blobPath = args[1];
printBlob(storage, bucketName, blobPath);
}
// Reads from the specified blob present in the GCS bucket and prints the contents to STDOUT.
private static void printBlob(Storage storage, String bucketName, String blobPath) throws IOException {
try (ReadChannel reader = storage.reader(bucketName, blobPath)) {
WritableByteChannel outChannel = Channels.newChannel(System.out);
ByteBuffer bytes = ByteBuffer.allocate(BUFFER_SIZE);
while (reader.read(bytes) > 0) {
bytes.flip();
outChannel.write(bytes);
bytes.clear();
}
}
}
}
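For the servlet case in the question, the same loop can be pointed at any OutputStream instead of System.out. The following is a minimal sketch rather than a definitive implementation: the class name ChannelCopy and the method copy are hypothetical, and the code relies only on the fact stated above that ReadChannel is a ReadableByteChannel. It uses nothing outside the JDK, so it can be tried without GCS credentials.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper (not part of the Cloud Storage client library).
final class ChannelCopy {
    private static final int BUFFER_SIZE = 64 * 1024;

    private ChannelCopy() {}

    /**
     * Copies everything from source to target in fixed-size chunks, so only
     * one buffer's worth of data is held in memory at a time.
     * Returns the number of bytes copied. Neither argument is closed here.
     */
    static long copy(ReadableByteChannel source, OutputStream target) throws IOException {
        WritableByteChannel out = Channels.newChannel(target);
        ByteBuffer buffer = ByteBuffer.allocate(BUFFER_SIZE);
        long total = 0;
        // read() returns -1 once the end of the channel is reached
        while (source.read(buffer) != -1) {
            buffer.flip();
            // a single write() is not guaranteed to drain the buffer
            while (buffer.hasRemaining()) {
                total += out.write(buffer);
            }
            buffer.clear();
        }
        return total;
    }
}
```

Inside the servlet's write method this would be called roughly as ChannelCopy.copy(reader, outputStream), with reader obtained from storage.reader(BUCKET, path) in a try-with-resources block.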
Currently the cleanest option I could find looks like this:
Blob blob = bucket.get("some-file");
ReadChannel reader = blob.reader();
InputStream inputStream = Channels.newInputStream(reader);
Channels is from java.nio.channels. You can then use Commons IO to copy the InputStream into an OutputStream:
IOUtils.copy(inputStream, outputStream);
Folks should be using Java 9 or above by now, and so can use InputStream.transferTo() to write to the output stream:
// the resource url is something like gs://youbucket/some/file/path.csv
public InputStream getUriAsInputStream( Storage storage, String resourceUri) {
String[] parts = resourceUri.split("/");
BlobId blobId = BlobId.of(parts[2], String.join("/", Arrays.copyOfRange(parts, 3, parts.length)));
Blob blob = storage.get(blobId);
if (blob == null || !blob.exists()) {
throw new IllegalArgumentException("Blob [" + resourceUri + "] does not exist");
}
ReadChannel reader = blob.reader();
InputStream inputStream = Channels.newInputStream(reader);
return inputStream;
}
// use it with something like:
@Override
public void write(OutputStream outputStream) throws IOException {
try {
LOG.info(path);
InputStream stream = new ByteArrayInputStream(GoogleJsonKey.JSON_KEY.getBytes(StandardCharsets.UTF_8));
StorageOptions options = StorageOptions.newBuilder()
.setProjectId(PROJECT_ID)
.setCredentials(GoogleCredentials.fromStream(stream)).build();
Storage storage = options.getService();
// try-with-resources closes the GCS stream even if the transfer fails
try (InputStream in = getUriAsInputStream(storage, "gs://your-bucket/path/to/file.csv")) {
in.transferTo(outputStream);
}
} catch (Exception e) {
e.printStackTrace();
} finally {
outputStream.close();
}
}
Code based on @Tuxdude's answer:
@Nullable
public byte[] getFileBytes(String gcsUri) throws IOException {
Blob blob = getBlob(gcsUri);
ReadChannel reader;
byte[] result = null;
if (blob != null) {
reader = blob.reader();
InputStream inputStream = Channels.newInputStream(reader);
result = IOUtils.toByteArray(inputStream);
}
return result;
}
or
// reads the blob in 64 KiB chunks and accumulates them in memory
// (needs java.io.ByteArrayOutputStream)
@Nullable
public byte[] getFileBytes(String gcsUri) throws IOException {
Blob blob = getBlob(gcsUri);
if (blob == null) {
return null;
}
ByteArrayOutputStream result = new ByteArrayOutputStream();
try (ReadChannel reader = blob.reader()) {
ByteBuffer bytes = ByteBuffer.allocate(64 * 1024);
while (reader.read(bytes) > 0) {
bytes.flip();
// copy only the bytes actually read, not the whole backing array
result.write(bytes.array(), 0, bytes.limit());
bytes.clear();
}
}
return result.toByteArray();
}
Helper code:
@Nullable
Blob getBlob(String gcsUri) {
//gcsUri is "gs://" + blob.getBucket() + "/" + blob.getName(),
//example "gs://myapp.appspot.com/ocr_request_images/000c121b-357d-4ac0-a3f2-24e0f6d5cea185dffb40eee-850fab211438.jpg"
String bucketName = parseGcsUriForBucketName(gcsUri);
String fileName = parseGcsUriForFilename(gcsUri);
if (bucketName != null && fileName != null) {
return storage.get(BlobId.of(bucketName, fileName));
} else {
return null;
}
}
@Nullable
String parseGcsUriForFilename(String gcsUri) {
String fileName = null;
String prefix = "gs://";
if (gcsUri.startsWith(prefix)) {
int startIndexForBucket = gcsUri.indexOf(prefix) + prefix.length() + 1;
int startIndex = gcsUri.indexOf("/", startIndexForBucket) + 1;
fileName = gcsUri.substring(startIndex);
}
return fileName;
}
@Nullable
String parseGcsUriForBucketName(String gcsUri) {
String bucketName = null;
String prefix = "gs://";
if (gcsUri.startsWith(prefix)) {
int startIndex = gcsUri.indexOf(prefix) + prefix.length();
int endIndex = gcsUri.indexOf("/", startIndex);
bucketName = gcsUri.substring(startIndex, endIndex);
}
return bucketName;
}
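The two parsing helpers above can also be folded into a single pass that fails loudly on malformed input instead of returning null. This is only a sketch under stated assumptions: the class GcsUri, its fields, and parse are hypothetical names, and the URI is assumed to have the form gs://bucket/object-name as in the example comment above.

```java
// Hypothetical value class (not part of the Cloud Storage client library).
final class GcsUri {
    private static final String PREFIX = "gs://";

    final String bucket;
    final String objectName;

    private GcsUri(String bucket, String objectName) {
        this.bucket = bucket;
        this.objectName = objectName;
    }

    /**
     * Splits "gs://bucket/path/to/object" into its bucket and object name.
     * Throws IllegalArgumentException if the prefix, the bucket, or the
     * object name is missing.
     */
    static GcsUri parse(String uri) {
        if (uri == null || !uri.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a gs:// URI: " + uri);
        }
        // first slash after the prefix separates bucket from object name
        int slash = uri.indexOf('/', PREFIX.length());
        if (slash <= PREFIX.length() || slash == uri.length() - 1) {
            throw new IllegalArgumentException("Missing bucket or object name: " + uri);
        }
        return new GcsUri(uri.substring(PREFIX.length(), slash), uri.substring(slash + 1));
    }
}
```

getBlob could then be reduced to something like storage.get(BlobId.of(parsed.bucket, parsed.objectName)), where parsed is the result of GcsUri.parse(gcsUri).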
Another (convenient) way to stream a file from Google Cloud Storage, with google-cloud-nio:
Path path = Paths.get(URI.create("gs://bucket/file.csv"));
InputStream in = Files.newInputStream(path);
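Once the blob is addressed as a Path, the JDK's own Files.copy(Path, OutputStream) can push it to the servlet's OutputStream without an explicit read loop. A minimal sketch, assuming the google-cloud-nio provider is on the classpath so that gs:// paths resolve; the class PathStreaming and the method stream are hypothetical names, and the helper itself works with any Path, including a local file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; it only uses JDK classes.
final class PathStreaming {
    private PathStreaming() {}

    /**
     * Streams the file at source into target and returns the number of
     * bytes written. The target stream is flushed but not closed.
     */
    static long stream(Path source, OutputStream target) throws IOException {
        long copied = Files.copy(source, target);
        target.flush();
        return copied;
    }
}
```

With the path from the snippet above, the call would look like PathStreaming.stream(path, outputStream).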