Temporary removal of HTML from string for Google Translate API to reduce cost

I have to translate some details using a Google API that we're paying for. The details contain HTML, and Google charges per character. I don't want to send the complete content, only the English text with the HTML removed. I can remove HTML tags and entities using PHP functions, but after translation I have to put the translated text back inside the HTML tags so it displays properly. The content will also include CSS.

Example:

<strong>This is a test</strong><br /> &nbsp; <custom tag>This is a test</custom tag><br />

After translation to Spanish I need:

<strong>Translated content </strong><br /> &nbsp; <p>Translated content </p><br />

How can I preserve the HTML formatting without sending the HTML to the API?

Haha, I also had that problem, but it was a while ago...

I think there was a problem where, due to the nature of translation, some parts of a sentence were swapped, so at first I was not able to fit the tags back in at the same positions. But I think there was a way to get some metadata from the translation process that shows which parts of the sentence moved to a new position and what their content was... I know I solved it in the end, but I can't recall how :(

If every word ends up in the same place after translation, you could first split the content on whitespace or HTML tags into an array, remember where each HTML tag was, and reapply the tags after translation...
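
A minimal sketch of that split-and-reapply idea, written in Python for brevity (the same approach should carry over to PHP with `preg_split` and the `PREG_SPLIT_DELIM_CAPTURE` flag). The names `split_markup` and `translate_html` are made up for illustration, and `translate` stands in for whatever function actually calls the paid API; it is assumed to take a list of strings and return the translations in the same order. The regex is a rough approximation of tags and entities, not a real HTML parser.

```python
import re

# Rough pattern for an HTML tag (<...>) or a character entity (&nbsp;, &#160;).
MARKUP = re.compile(r"(<[^>]+>|&[#\w]+;)")

def split_markup(html):
    """Split html into (is_text, chunk) pairs, kept in document order."""
    parts = []
    for chunk in MARKUP.split(html):
        if not chunk:
            continue
        # Tags, entities and whitespace-only gaps are kept verbatim, not translated.
        is_text = MARKUP.fullmatch(chunk) is None and chunk.strip() != ""
        parts.append((is_text, chunk))
    return parts

def translate_html(html, translate):
    """Send only the text chunks to `translate`, then put the markup back."""
    parts = split_markup(html)
    texts = [chunk for is_text, chunk in parts if is_text]
    translated = iter(translate(texts))
    return "".join(next(translated) if is_text else chunk
                   for is_text, chunk in parts)

if __name__ == "__main__":
    sample = ("<strong>This is a test</strong><br /> &nbsp; "
              "<custom tag>This is a test</custom tag><br />")
    # Stand-in translator for the demo: upper-cases instead of calling an API.
    print(translate_html(sample, lambda texts: [t.upper() for t in texts]))
```

Only the text chunks are billed, and the tags go back exactly where they were. The catch is the one described above: a sentence with inline tags in the middle is translated piece by piece, so word order and context across those tags can be lost.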

Add the Google Translate service to your website and mark which words should not be translated.

https://translate.google.com/manager/
