[英]How to detect the language of a document - in PHP?
There's a PEAR package Text_LanguageDetect
that I've used before. 我以前使用过一个PEAR包
Text_LanguageDetect
。 Get's the job done well enough. 做好工作吧。 I'm not sure of any other libs that are more mature.
我不确定其他任何更成熟的库。
1- You could do it yourself (the hard way) - detecting both language and codepage by looking at character and n-gram frequencies. 1-您可以自己做(困难的方式)-通过查看字符和n-gram频率来检测语言和代码页。 You would need lots of "training" data, but it's doable.
您将需要大量的“培训”数据,但这是可行的。
2- You could run a perl script to do the detection for you(much easier). 2-您可以运行perl脚本来为您进行检测(容易得多)。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.