简体   繁体   中英

Don't know how to transcribe wav file from Google Cloud Storage for LongRunningRecognize conversion to text in C#?

I'm able to convert audio files to text as long as they are under a minute. I need to transcribe longer files. Apparently, you have to have the file in Cloud Storage but I can't figure out if there is one command that does that or if I have to do it separately. What I'm using now is:

var credential = GoogleCredential.FromFile(GoogleCredentials);
var channel = new Grpc.Core.Channel(SpeechClient.DefaultEndpoint.ToString(), credential.ToChannelCredentials());
var speech = SpeechClient.Create(channel);

var response = speech.LongRunningRecognize(
    new RecognitionConfig()
        {
            Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
            LanguageCode = "en",
        },
        RecognitionAudio.FromFile(waveFile));

response = response.PollUntilCompleted();

I know I need to specify a file in Cloud Storage like:

RecognitionAudio.FromStorageUri("gs://ldn-speech/" + waveFile);

But I don't know how to get the file into the gs bucket. Do I have to do that in a separate step or as part of one of the Speech API's? I'm looking for someone to show me an example.

EDIT: I found that I needed to upload the file separately and could use the credential file I had already been using in the speech recognition process: So, all I needed was:

var credential = GoogleCredential.FromFile(GoogleCredentials);
var storage = StorageClient.Create(credential);
using (var f = File.OpenRead(fullFileName))
{
fileName = Path.GetFileName(fullFileName);
storage.UploadObject(bucketName, fileName, null);
}

There is also another method of going about in your case.

As stated in your edit you indeed needed to upload the file separately to your Cloud Storage bucket. If you are planning on transcribing long audio files (longer than 1 minute) to text you may consider using Asynchronous Speech recognition: https://cloud.google.com/speech-to-text/docs/async-recognize#speech-async-recognize-gcs-csharp

The code sample uses Cloud Storage bucket to store raw audio input for long-running transcription processes. It also requires that you have created and activated a service account.

Here's an example:

static object AsyncRecognizeGcs(string storageUri)
{
    var speech = SpeechClient.Create();
    var longOperation = speech.LongRunningRecognize(new RecognitionConfig()
    {
        Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
        SampleRateHertz = 16000,
        LanguageCode = "en",
    }, RecognitionAudio.FromStorageUri(storageUri));
    longOperation = longOperation.PollUntilCompleted();
    var response = longOperation.Result;
    foreach (var result in response.Results)
    {
        foreach (var alternative in result.Alternatives)
        {
            Console.WriteLine($"Transcript: { alternative.Transcript}");
        }
    }
    return 0;
}

(1) I found that I did indeed need to upload the file separately to cloud storate. (2) could use the credential file I had already been using in the speech recognition process: So, all I needed was:

var credential = GoogleCredential.FromFile(GoogleCredentials);
var storage = StorageClient.Create(credential);
using (var f = File.OpenRead(fullFileName))
{
  fileName = Path.GetFileName(fullFileName);
  storage.UploadObject(bucketName, fileName, null);
}

Once in Cloud storage, I could transcribe it as I originally thought. Then delete the file after the process was complete with:

var credential = GoogleCredentials;
var storage = StorageClient.Create(credential);
using (var f = File.OpenRead(fullFileName))
{
  fileName = Path.GetFileName(fullFileName);
  storage.DeleteObject(bucketName, fileName);
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM