
How to count the number of Japanese and Latin characters in a string?

I need to write a function that follows the Japanese convention for counting characters, returning the length of a string under these rules:

  • 1 for each full-width character (Japanese kanji, katakana, and hiragana)
  • 0.5 for each half-width character (0-9, A-Z)

Here is the unit test I wrote:

    describe('#getCaptionLength', () => {
      it('should return correct caption length for japanese', () => {
        const text = 'を取り外すコネクタと考えてください';
        const result = getCaptionLength(text);
        expect(result).toBe(17);
      });

      it('should return correct caption length for japanese mixed with latin', () => {
        const text = 'を取り外すコネクタと考えてください hello world';
        const result = getCaptionLength(text);
        expect(result).toBe(17 + 6);
      });
    });

Can you please help me write a function that passes my unit test? Thanks!

It's not very complicated. If you just want to pass these two tests, you could write something like this:

    function getCaptionLength(text) {
      // find all runs of Latin letters, digits, and spaces with a RegExp
      const re = new RegExp('[A-Za-z0-9 ]+', 'g');
      const found = text.match(re);
      // get the length of the Latin part (match() returns null when nothing matches)
      const latinLength = found ? found.join('').length : 0;
      // the Japanese part is just the full string minus the Latin part
      const japaneseCharactersLength = text.length - latinLength;
      // calculate and return the final result
      return japaneseCharactersLength + latinLength * 0.5;
    }

But of course this also counts as 1 everything that is neither a Japanese character nor a Latin character, such as emoji, special characters, and so on.
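If the catch-all behaviour matters, one possible direction is to classify each character explicitly instead of subtracting the Latin part from `text.length`. The following is only a sketch under stated assumptions: the name `getCaptionLengthByCodePoint`, the `otherWeight` parameter, and the choice of Unicode ranges (the Hiragana and Katakana blocks, the main CJK Unified Ideographs block, and printable ASCII as "half-width") are all hypothetical and not part of the question. Iterating with `for...of` walks code points rather than UTF-16 code units, so an emoji outside the Basic Multilingual Plane is seen once, whereas `text.length` would count it as 2.

```javascript
// Hypothetical variant of getCaptionLength; the ranges below are assumptions.
const JAPANESE_RANGES = [
  [0x3040, 0x309f], // Hiragana block
  [0x30a0, 0x30ff], // Katakana block (includes the long-vowel mark ー)
  [0x4e00, 0x9fff], // CJK Unified Ideographs (main kanji block)
];

// True when the code point falls inside one of the assumed Japanese ranges.
function isJapanese(codePoint) {
  return JAPANESE_RANGES.some(([lo, hi]) => codePoint >= lo && codePoint <= hi);
}

// True for printable ASCII (space through tilde), treated here as half-width.
function isHalfWidthAscii(codePoint) {
  return codePoint >= 0x20 && codePoint <= 0x7e;
}

// otherWeight is what every unclassified character (emoji, other scripts,
// full-width punctuation, ...) contributes; 1 mimics the catch-all above.
function getCaptionLengthByCodePoint(text, otherWeight = 1) {
  let length = 0;
  for (const ch of text) { // for...of iterates by code point
    const codePoint = ch.codePointAt(0);
    if (isJapanese(codePoint)) {
      length += 1;
    } else if (isHalfWidthAscii(codePoint)) {
      length += 0.5;
    } else {
      length += otherWeight;
    }
  }
  return length;
}
```

Called as `getCaptionLengthByCodePoint(text)` it should still satisfy the two tests from the question, while `getCaptionLengthByCodePoint(text, 0)` would ignore everything that is neither Japanese nor ASCII. Whether ASCII punctuation, half-width katakana, or full-width digits should count as 1 or 0.5 depends on the caption rules, which the question does not specify.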


 