
How to detect the language of a document in PHP?

The basics have already been answered here. But is there a pre-built PHP library that does the same as Lingua::Identify from CPAN?

There's a PEAR package, Text_LanguageDetect, that I've used before. It gets the job done well enough. I'm not aware of any other libraries that are more mature.

1- You could do it yourself (the hard way), detecting both language and codepage by looking at character and n-gram frequencies. You would need lots of "training" data, but it's doable.

2- You could run a Perl script to do the detection for you (much easier).
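To give an idea of what option 1 involves, here is a minimal sketch of trigram-based detection. It builds a ranked trigram profile per language and picks the language whose profile is closest to the document's. Everything here is an assumption for illustration: the function names (ngramProfile, profileDistance, detectLanguage) are made up, the one-sentence samples stand in for real training data, and the input is assumed to be UTF-8 already, so codepage detection is not covered. It needs PHP 7+ with the mbstring extension.

```php
<?php
// Illustrative sketch only: the function names and the tiny samples below are
// hypothetical. Real use needs far larger training texts per language.

const MISSING_PENALTY = 300; // cost of a trigram the language profile lacks

/** Build a ranked trigram profile: trigram => rank (0 = most frequent). */
function ngramProfile(string $text, int $n = 3, int $limit = 300): array
{
    // Lowercase, then collapse every run of non-letters into a single space.
    $lower = mb_strtolower($text, 'UTF-8');
    $clean = ' ' . trim((string) preg_replace('/[^\p{L}]+/u', ' ', $lower)) . ' ';

    $counts = [];
    $length = mb_strlen($clean, 'UTF-8');
    for ($i = 0; $i + $n <= $length; $i++) {
        $gram = mb_substr($clean, $i, $n, 'UTF-8');
        $counts[$gram] = ($counts[$gram] ?? 0) + 1;
    }

    arsort($counts); // most frequent first
    $top = array_slice(array_keys($counts), 0, $limit);

    return array_flip($top); // trigram => rank
}

/** Sum of rank differences; unknown trigrams cost MISSING_PENALTY each. */
function profileDistance(array $doc, array $lang): int
{
    $distance = 0;
    foreach ($doc as $gram => $rank) {
        $distance += isset($lang[$gram])
            ? abs($rank - $lang[$gram])
            : MISSING_PENALTY;
    }

    return $distance;
}

/** Return the key of the profile closest to the text. */
function detectLanguage(string $text, array $profiles): string
{
    $doc = ngramProfile($text);
    $best = '';
    $bestDistance = PHP_INT_MAX;
    foreach ($profiles as $language => $profile) {
        $distance = profileDistance($doc, $profile);
        if ($distance < $bestDistance) {
            $best = $language;
            $bestDistance = $distance;
        }
    }

    return $best;
}

// Stand-in "training" data: one sentence per language.
$samples = [
    'en' => 'the quick brown fox jumps over the lazy dog and this is the language of the document that we want to detect with the help of letter frequencies',
    'de' => 'der schnelle braune fuchs springt über den faulen hund und dies ist die sprache des dokuments die wir mit hilfe von buchstabenhäufigkeiten erkennen wollen',
    'fr' => 'le renard brun rapide saute par dessus le chien paresseux et ceci est la langue du document que nous voulons détecter à l\'aide des fréquences de lettres',
];
$profiles = array_map('ngramProfile', $samples);

echo detectLanguage('this is the document', $profiles), "\n"; // en
echo detectLanguage('dies ist die sprache', $profiles), "\n"; // de
```

With samples this small, the result is mostly decided by how many of the document's trigrams are missing from each profile. That is also why you need lots of training data: short or unusual inputs will be misclassified easily.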
