
Google Translate API Text to speech - setting encoding for non-roman characters

I'm using Google Translate's unofficial Text-to-speech API (I've posted more info on it here).

The API endpoint looks like: https://translate.google.com/translate_tts?ie=utf-8&tl=en&q=Hello%20World

When I make ordinary API requests for words, I get blocked with No-access-control-origin and 404 errors. To get around this, I've followed the PHP script in this blog, which strips out the referrer before making the request (more info on my attempts here).

I'm able to get English to work, but I need this to work for Chinese. Unfortunately, when I pass in something like 你好, the voice narrates gibberish. However, if you paste the following URL directly into your browser, it narrates perfectly:

https://translate.google.com/translate_tts?ie=utf-8&tl=zh-CN&q=你好

HTML:

<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta http-equiv="content-type" content="text/html; charset=utf-8" />

<audio controls="controls" autoplay="autoplay" style="display:none;">
    <source src="testPHP.php?translate_tts?ie=utf-8&tl=zh-CN&q=你好" type="audio/mpeg" />
</audio>

testPHP.php:

<?php
//https://translate.google.com/translate_tts?ie=UTF-8&q=' + text + '&tl=en
header('Content-type: text/plain; charset=utf-8');
$params = http_build_query(array("ie" => $_GET['ie'],"tl" => $_GET["tl"], "q" => $_GET["q"]));
$ctx = stream_context_create(array("http"=>array("method"=>"GET","header"=>"Referer: \r\n"))); //create and return stream context
$soundfile = file_get_contents("https://translate.google.com/translate_tts?".$params, false, $ctx); //reads file into string (string with params[utf-8, tl, q], use include path bool, custom context resource headers)

header("Content-type: audio/mpeg");
header("Content-Transfer-Encoding: binary");
header('Pragma: no-cache');
header('Expires: 0');

echo($soundfile);

tail -f on the Apache access_logs shows:

"GET /testPHP.php?translate_tts?ie=utf-8&tl=zh-CN&q=%E4%BD%A0%E5%A5%BD HTTP/1.1" 200 13536

This seems okay. As you can see, the q query param value, 你好, has been percent-encoded. This is fine, because the encoded form still works if you put it in the browser:

https://translate.google.com/translate_tts?ie=utf-8&tl=zh-CN&q=%E4%BD%A0%E5%A5%BD

tail -f on the Apache error_logs shows:

PHP Notice: Undefined index: ie in /Users/danturcotte/Sites/personal_practice/melonJS-dev/testPHP.php on line 4, referer: http://melon.localhost/

I'm not sure how this is happening, or whether it contributes to the garbled pronunciation. Could the voice be reading out parts of the ie index?

The query params from the browser side seem to be registering:

[image: enter image description here]

And as you can see from the Apache access_logs, the ie=utf-8 param is being set fine.

So my questions are:

  • I've added header('Content-type: text/plain; charset=utf-8'); to my testPHP.php file to ensure that the encoding goes through correctly. Could this be contributing to the problem?

  • I'm building the URI query string like this: $params = http_build_query(array("ie" => $_GET['ie'],"tl" => $_GET["tl"], "q" => $_GET["q"])); so how can there be an undefined index ie?

The problem is in your URL:

GET /testPHP.php?translate_tts?ie=utf-8&tl=zh-CN&q=%E4%BD%A0%E5%A5%BD

You have two question marks. Only the first one starts the query string, so the second becomes part of the first parameter's name, which means that PHP will get:

Array 
( 
[translate_tts?ie] => utf-8 
[tl] => zh-CN 
[q] => 你好 
)
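The array above can be reproduced without a web server. This is a minimal sketch, assuming PHP's built-in parse_str() applies the same parsing rules to this input as the ones used to fill $_GET; the variable names are only illustrative:

```php
<?php
// Everything after the first "?" of the request URL is the query string.
$queryString = 'translate_tts?ie=utf-8&tl=zh-CN&q=%E4%BD%A0%E5%A5%BD';

// parse_str() parses a string as if it were a URL query string.
parse_str($queryString, $params);

print_r(array_keys($params));    // the first key is "translate_tts?ie", not "ie"
var_dump(isset($params['ie']));  // bool(false), hence "Undefined index: ie"
```

Because no key named ie exists, $_GET['ie'] raises the notice seen in the error log, while tl and q still arrive intact.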

Instead you need to do something like:

GET /testPHP.php?translate_tts=value&ie=utf-8&tl=zh-CN&q=%E4%BD%A0%E5%A5%BD
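With a single question mark, ie arrives under its own name. The sketch below parses the corrected query string with parse_str() and rebuilds the upstream query with http_build_query(). The helper name buildTtsQuery and its fallback defaults are hypothetical and not part of the original script:

```php
<?php
// Hypothetical helper: build the query for the upstream TTS request,
// falling back to assumed defaults when a parameter is missing.
function buildTtsQuery(array $get): string
{
    return http_build_query([
        'ie' => $get['ie'] ?? 'utf-8',
        'tl' => $get['tl'] ?? 'en',
        'q'  => $get['q']  ?? '',
    ]);
}

// Corrected query string: one "?" in the URL, so "ie" is a key of its own.
parse_str('translate_tts=value&ie=utf-8&tl=zh-CN&q=%E4%BD%A0%E5%A5%BD', $fixed);

echo buildTtsQuery($fixed), "\n";
// ie=utf-8&tl=zh-CN&q=%E4%BD%A0%E5%A5%BD
```

The null coalescing operator (??) needs PHP 7 or later; on older versions an isset() check per parameter does the same job.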
