Can I use Google Chrome's Web Speech API implementation to circumvent the rate limit of the standard Speech API?

Using Google's Speech API directly now requires an API key. To get that key you must subscribe to the chromium-dev@chromium.org newsgroup and then follow a few steps, after which Google will give you a developer's key that is "not for distribution." The key is rate limited to 50 requests/day.

For example, node-google-speech-api outlines the need for having this key for a node application to access Google's Speech API directly (without the use of a browser): https://github.com/psirenny/node-google-speech-api

There are also PHP and Java libraries for accessing Google's Speech API, and they require this key as well.

I would like to write a desktop application that uses Google's speech recognition technology, but the 50 requests/day limit is unacceptable for wide distribution, and even for a single desktop deployment of the software I have in mind. I anticipate up to 500 requests/day from an individual desktop user if the voice recognition is broken up into short requests. Most of these would probably be long-polling/continuous, so it might be only 2 or 3 requests/day, each lasting hours at a time. Multiply that by a few hundred users and I would easily exceed 50 requests/day.

I was trying to think of a way to access Google's superior speech recognition technology on the desktop from my own app without this limit. The language doesn't matter, but Node.js would likely be part of the mix, so a Node.js solution would be preferred. That brought me to consider the Web Speech API standard, which Google Chrome happens to implement.

As far as I know, there is no hard requests/day limit imposed on Google Chrome's implementation of the Web Speech API, and I could happily write websites that use the Web Speech API all day long with few or no restrictions compared to using the Google Speech API directly. This got me thinking: what if I distributed a Chrome (not Chromium) browser, that is, the bona fide Google Chrome browser, and added an "extension" to it that allowed JavaScript within a custom HTML5 web page to interface with other applications on the client's system (e.g. a Node.js app running alongside this special installation of Chrome)? I would write the speech recognition portion in JavaScript, Web Speech API style, and pipe the output into the other application that I design and install on clients' systems.
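To make the idea concrete, here is a minimal sketch of the page-side half only. It assumes Chrome exposes the recognizer as `webkitSpeechRecognition` (or `SpeechRecognition`), and it assumes, purely for illustration, that the companion Node.js app listens on a WebSocket at `ws://localhost:8080`. That URL and the helper names `finalTranscripts` and `startRecognition` are hypothetical, not part of any library.

```javascript
// Sketch: collect finalized transcripts from the Web Speech API and forward
// them to a local process. Helper names and the socket URL are hypothetical.

// Pure helper: pull the finalized transcripts out of a recognition result event.
function finalTranscripts(event) {
  const out = [];
  // event.resultIndex is the first result that changed in this event.
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    if (result.isFinal) {
      // result[0] is the top-ranked alternative for this result.
      out.push(result[0].transcript.trim());
    }
  }
  return out;
}

// Browser-only wiring: start continuous recognition and hand each final
// transcript to the supplied `send` callback.
function startRecognition(send) {
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognition = new Recognition();
  recognition.continuous = true;      // keep listening across pauses
  recognition.interimResults = true;  // interim results are filtered out above
  recognition.lang = 'en-US';
  recognition.onresult = (event) => finalTranscripts(event).forEach(send);
  recognition.start();
  return recognition;
}

// Only run the wiring inside a browser that provides the recognizer.
if (typeof window !== 'undefined' &&
    (window.SpeechRecognition || window.webkitSpeechRecognition)) {
  // Assumed local endpoint exposed by the companion Node.js app.
  const socket = new WebSocket('ws://localhost:8080');
  socket.onopen = () => startRecognition((text) => socket.send(text));
}
```

The result parsing is kept in a separate function so it can be exercised without a microphone or a browser. The sketch does not handle microphone permission prompts, recognition sessions that end on their own, or socket reconnects.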

Would/could that work?

What are the pitfalls of this approach?

Do you have suggestions of another approach or would you perhaps recommend a commercially-licensed solution that is comparable to the ease of use and extreme natural language accuracy of Google's speech technology?

One possible approach to try is a Chrome App. It runs in a sandboxed instance of Chrome and is implemented with HTML and JavaScript.

To the user it will look just like a desktop application.
