
Why does String.fromCharCode(0xd800) to String.fromCharCode(0xdfff) return the replacement character?

Why does this happen:

> String.fromCharCode(0xd7FF)
'퟿'
> String.fromCharCode(0xd800)
'�'
> String.fromCharCode(0xdffe) // (and everything in between)
'�'
> String.fromCharCode(0xdfff)
'�'
> String.fromCharCode(0xe000)
''

D800₁₆ is 55296₁₀ and DFFF₁₆ is 57343₁₀. I get the same results with String.fromCodePoint().

Code points U+D800 to U+DFFF are reserved for the UTF-16 encoding of surrogates . Effectively, these are characters which are never valid individually - they always come in surrogate pairs - a high surrogate followed by a low surrogate. (Confusingly, the "high surrogate" range is the range U+D800 to U+DBFF, and the "low surrogate" range is the range U+DC00 to U+DFFF.)

This pair of characters is combined in UTF-16 to represent a single character outside the Basic Multilingual Plane.
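As a sketch of how that combination works (the helper name `combineSurrogates` is my own, not a built-in): UTF-16 maps a high/low pair back to a code point with the formula 0x10000 + (high − 0xD800) × 0x400 + (low − 0xDC00).

```javascript
// Combine a UTF-16 surrogate pair into a single code point.
// high must be in U+D800..U+DBFF, low in U+DC00..U+DFFF.
function combineSurrogates(high, low) {
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}

// U+1F600 (😀) is outside the BMP; UTF-16 encodes it as the pair 0xD83D 0xDE00.
const emoji = String.fromCharCode(0xd83d, 0xde00);
console.log(emoji);                                          // 😀
console.log(combineSurrogates(0xd83d, 0xde00).toString(16)); // "1f600"
console.log(emoji.codePointAt(0).toString(16));              // "1f600"
```

Note that `String.fromCharCode` with the two surrogate code units and `String.fromCodePoint(0x1f600)` produce the same two-code-unit string.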

Outside this special meaning in UTF-16, these aren't valid characters on their own. So when you print such a string, there is no character to render, and the console shows the Unicode replacement character (U+FFFD) instead.
