Regex: How to match all uppercase characters of non-ascii string

I have strings like Japan Company and Chinese Company, and I use the regex /([A-Z])/g to get all uppercase characters, then join the result to make an abbreviation of them. But when the input string is not in English letters, the regex does not work.

let string = "Japan Company";
console.log(string.match(/[A-Z]/g).join('')); // JC

But when I have a string like 日本の会社:

let string = "日本の会社";
console.log(string.match(/[A-Z]/g).join(''));

This throws an exception because the result of string.match(/[A-Z]/g) is null.

Since I am trying to abbreviate these strings, and such characters (hieroglyphs) have no uppercase forms, the regex should match only the first character of each word, where words are separated by spaces.

What generic regex should I use for this?

Something like POSIX's [:upper:], but that does not work in the JavaScript regex engine.

You can use

(string.match(/(?<!\S)\S/g) || [string]).join('')

See the JavaScript demo:

const strings = ["Japan Company", "japan company", "日本の会社"];
for (const string of strings) {
  console.log(string, '=>', (string.match(/(?<!\S)\S/g) || [string]).join('').toUpperCase());
}

The (?<!\S)\S regex matches a non-whitespace character at the start of the string or immediately after a whitespace character.
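One detail that may matter for non-ASCII input: without the u flag, \S matches a single UTF-16 code unit, so a word that starts with a character outside the Basic Multilingual Plane would yield only half of a surrogate pair. The following is a minimal sketch, assuming an engine that supports both lookbehind and the u flag; the helper name initials is hypothetical and not part of the answer.

```javascript
// Hypothetical helper: same pattern as in the answer, plus the `u` flag
// so that \S consumes a whole code point instead of one UTF-16 code unit.
function initials(text) {
  const matches = text.match(/(?<!\S)\S/gu);
  // match() returns null when nothing matches (empty or whitespace-only input),
  // so fall back to the original string as the answer does.
  return (matches || [text]).join('');
}

// The first character of the first word lies outside the BMP.
console.log(initials('𠮷野家 会社'));
```

With the u flag the first word contributes its full first character; without it, the result would begin with a lone surrogate.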

A Safari-compatible pattern that avoids lookbehind:

var strings = ["Japan Company", "japan company", "日本の会社"];
for (var i = 0; i < strings.length; i++) {
  var m = strings[i].match(/(?:^|\s)(\S)/g);
  if (m === null) {
    console.log(strings[i], '=> ', strings[i]);
  } else {
    console.log(strings[i], '=>', m.join('').replace(/\s+/g, '').toUpperCase());
  }
}
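Because (?:^|\s) consumes the preceding whitespace, each match from match() carries that whitespace along, which is why the demo strips it afterwards. A possible variation, sketched here under the assumption that only the captured group is wanted, is to loop with exec() and read group 1 directly. The function name firstChars is hypothetical.

```javascript
// Hypothetical helper: collects capture group 1 of the non-lookbehind pattern,
// so no whitespace cleanup is needed afterwards.
function firstChars(text) {
  var re = /(?:^|\s)(\S)/g;
  var result = '';
  var m;
  // exec() with the g flag resumes from re.lastIndex on every call.
  while ((m = re.exec(text)) !== null) {
    result += m[1];
  }
  // Mirror the demo's fallback: return the input itself when nothing matched.
  return result === '' ? text : result;
}

var samples = ["Japan Company", "japan company", "日本の会社"];
for (var i = 0; i < samples.length; i++) {
  console.log(samples[i], '=>', firstChars(samples[i]).toUpperCase());
}
```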

Based on your comments, I think this should do what you want.

function getUppercaseLetters(string) {
    // Find uppercase ASCII letters in the string
    let matches = string.match(/[A-Z]/g);
    // If there are none, fall back to the first character of each whitespace-separated word
    if (!matches) matches = string.split(/\s+/).filter(Boolean).map(word => word[0]);
    // If there are still no matches (empty or whitespace-only input), return an empty string
    if (matches.length === 0) return "";
    // Join the array elements and make the result uppercase
    return matches.join("").toUpperCase();
}
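Regarding the [:upper:] part of the question: JavaScript has no POSIX classes, but engines that implement ES2018 Unicode property escapes accept \p{Lu} (uppercase letter) when the u flag is set. The sketch below combines that with the first-character fallback discussed above. The function name abbreviate and the exact fallback policy are assumptions for illustration, not something the answers prescribe.

```javascript
// Hypothetical helper: Unicode-aware uppercase matching with a fallback.
function abbreviate(text) {
  // \p{Lu} matches any Unicode uppercase letter; it is only valid with the `u` flag.
  const upper = text.match(/\p{Lu}/gu);
  if (upper) return upper.join('');
  // No uppercase letters at all (for example CJK text):
  // take the first code point of each whitespace-separated word.
  return text
    .split(/\s+/)
    .filter(Boolean)
    .map(word => Array.from(word)[0])
    .join('');
}

console.log(abbreviate('Japan Company'));
console.log(abbreviate('Японская Компания'));
console.log(abbreviate('日本の会社'));
```

Unlike [A-Z], this also picks up uppercase letters in scripts such as Cyrillic or Greek, while scripts without case fall through to the first-character rule.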
