
JavaScript Unicode's length (astral symbols)

I have an <input type="text"> (in HTML), and every time I add a character I run a check like if (text.length < x) {...} (in JavaScript).

The problem is that Unicode astral symbols (\u{.....}, the non-BMP characters whose code points need more than 4 hex digits) "are stored as two code units and so the length property will return 2 instead of 1."

( https://mixmax.com/blog/unicode-woes-in-javascript )
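To make the quoted behaviour concrete, here is a minimal sketch using two of the symbols from my buttons (the variable names are only for illustration):

```javascript
// Each string below is a single visible symbol.
const bmpSymbol = '\u2764';       // U+2764, inside the BMP
const astralSymbol = '\u{1F4BB}'; // U+1F4BB, outside the BMP

// .length counts UTF-16 code units, not symbols.
console.log(bmpSymbol.length);    // 1
console.log(astralSymbol.length); // 2

// The astral symbol is stored as a high surrogate followed by a low surrogate.
console.log(astralSymbol.charCodeAt(0).toString(16)); // d83d
console.log(astralSymbol.charCodeAt(1).toString(16)); // dcbb

// codePointAt reads the pair back as one code point.
console.log(astralSymbol.codePointAt(0).toString(16)); // 1f4bb
```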

I want every symbol to count the same, whether that is 1 or 2, as long as some symbols don't count as 1 and others as 2, because I need a working limit on the length of the visible text.

I think the solution is here: https://mathiasbynens.be/notes/javascript-unicode#accounting-for-astral-symbols but I'm not sure how to use it.

My if is something like this:

if(document.getElementById("1").value.length<16){

Edit (it's working!):

<html>
    <head>
        <style>
            input{background:white;border:1px solid;height:30;outline-color:black;position:absolute;top:389;width:30}
        </style>
        <script>
            <!--
                function Add(symbol){
                    var input = document.getElementById("1");
                    // The 9-symbol target sequence; it is 16 UTF-16 code units long.
                    var target = "\u{1F4BB}\u{1F3AE}\u{1F3C3}\u{1F525}\u2764\u{1D7CF}\u{1D7D1}\u{1F4B0}\u2757";
                    // Limit the field by visible symbols, not by code units.
                    if (countSymbols(input.value) < 16) {
                        input.value += symbol;
                    }
                    // Once the field holds 16 code units: green on a match, red otherwise.
                    if (input.value.length == 16) {
                        input.style.background = (input.value == target) ? "#00BB00" : "#BB0000";
                    }
                }
                function countSymbols(string) {
                    var regexAstralSymbols = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;
                    return string
                    // Replace every surrogate pair with a BMP symbol.
                    .replace(regexAstralSymbols, '_')
                    // …and *then* get the length.
                    .length;
                }
            //-->
        </script>
    </head>
    <body>
        <input readOnly="true" id="1" style="left:573;outline:0;padding:5 8;top:356;width:294">
        <input onclick="Add('\u{1F4BB}')" style="left:573" type="button" value="&#128187">
        <input onclick="Add('\u{1F3AE}')" style="left:606" type="button" value="&#127918">
        <input onclick="Add('\u{1F3C3}')" style="left:639" type="button" value="&#127939">
        <input onclick="Add('\u{1F525}')" style="left:672" type="button" value="&#128293">
        <input onclick="Add('\u2764')" style="left:705" type="button" value="&#10084">
        <input onclick="Add('\u{1D7CF}')" style="left:738" type="button" value="&#120783">
        <input onclick="Add('\u{1D7D1}')" style="left:771" type="button" value="&#120785">
        <input onclick="Add('\u{1F4B0}')" style="left:804" type="button" value="&#128176">
        <input onclick="Add('\u2757')" style="left:837" type="button" value="&#10071">
    </body>
</html>

I think you have most of the research done; you only need to put it all together.

Taking the function that your link provides:

function countSymbols(string) {
    var regexAstralSymbols = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;
    return string
        // Replace every surrogate pair with a BMP symbol.
        .replace(regexAstralSymbols, '_')
        // …and *then* get the length.
        .length;
}

your if should be

if (countSymbols(document.getElementById("1").value)<16) { ...}

For example: countSymbols('🏃2🔥7') returns 4

Here you have a small example: https://jsfiddle.net/q7g9qtk7/
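As a rough sketch of how the check behaves when symbols are added one at a time (appendWithLimit is a hypothetical helper name, not something from your page or from the linked article):

```javascript
// countSymbols is the function from the linked article, as shown above.
function countSymbols(string) {
    var regexAstralSymbols = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;
    return string.replace(regexAstralSymbols, '_').length;
}

// Hypothetical helper: append `symbol` only while the symbol count is below `limit`.
function appendWithLimit(text, symbol, limit) {
    return countSymbols(text) < limit ? text + symbol : text;
}

// Try to add the fire symbol 20 times with a limit of 16.
var text = '';
for (var i = 0; i < 20; i++) {
    text = appendWithLimit(text, '\u{1F525}', 16);
}

console.log(countSymbols(text)); // 16
console.log(text.length);        // 32
```

The text stops growing at 16 symbols even though its length property reports 32 code units.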

Update: You can also use Array.from (IE needs a polyfill; Chrome and Firefox already support it), which takes a string and splits it into its individual symbols, no matter how many code units each one takes:

Array.from('🏃2🔥7') //returns ["🏃", "2", "🔥", "7"]

So your function could be

function countSymbols(string) {
    return Array.from(string).length;
}
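A quick way to check that both versions agree is to run them side by side. This is only a sketch; the names countWithRegex and countWithArrayFrom are made up here so that the two versions can coexist:

```javascript
// The regex-based version, renamed for the comparison.
function countWithRegex(string) {
    return string.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, '_').length;
}

// The Array.from-based version, renamed for the comparison.
function countWithArrayFrom(string) {
    return Array.from(string).length;
}

var samples = [
    '\u{1F3C3}2\u{1F525}7', // the example string used above
    '\u{1F4BB}\u{1F3AE}\u{1F3C3}\u{1F525}\u2764\u{1D7CF}\u{1D7D1}\u{1F4B0}\u2757', // the string from the question
    'e\u0301' // "e" followed by a combining acute accent
];

// Print: code units, regex count, Array.from count.
samples.forEach(function (s) {
    console.log(s.length, countWithRegex(s), countWithArrayFrom(s));
});
// 6 4 4
// 16 9 9
// 2 2 2
```

Both functions count code points, so a base letter plus a combining mark (the last sample) still counts as 2 even though it is drawn as one character. For the symbols on your buttons that does not matter.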
