
UTF8 version of WIDESTRING

I have text that I need to store in a WideString variable. But my text is UTF-8, and WideString doesn't seem to support UTF-8: it converts the text to Chinese characters.

So is there a UTF-8 version of WideString?

I always use UTF8String, but in this case I have to use WideString.

When you assign a UTF8String variable to a WideString variable, the compiler automatically inserts instructions to decode the string (in Delphi 2009 and later). It converts UTF-8 to UTF-16, which is what WideString holds. If your WideString variable holds Chinese characters, then that's because your UTF-8-encoded string holds UTF-8-encoded Chinese characters.
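
As a rough illustration of that automatic decoding, here is a minimal console sketch, assuming Delphi 2009 or later. The program name and the sample character (U+00E9, whose UTF-8 form is the two bytes $C3 $A9) are chosen only for the example and are not from the question.

```pascal
program Utf8ToWideDemo;
{$APPTYPE CONSOLE}

var
  s: UTF8String;
  ws: WideString;
begin
  // Build the two UTF-8 bytes of U+00E9 by hand, so the content of s
  // does not depend on how the compiler interprets a string literal.
  SetLength(s, 2);
  s[1] := AnsiChar($C3);
  s[2] := AnsiChar($A9);

  // Assumption: Delphi 2009+, where this cast decodes UTF-8 into UTF-16.
  ws := WideString(s);

  Writeln(Length(s));   // number of UTF-8 bytes in s
  Writeln(Length(ws));  // number of UTF-16 code units after decoding
  Writeln(Ord(ws[1]));  // numeric value of the first decoded code unit
end.
```

If the decoding works as described, ws should hold fewer elements than s has bytes, because the two UTF-8 bytes collapse into a single UTF-16 code unit.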

If you want your string ws to hold 16-bit versions of the bytes in your UTF8String s, then you can bypass the automatic conversion with some type-casting:

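var
  s: UTF8String;   // the UTF-8 source text
  ws: WideString;
  i: Integer;
  c: AnsiChar;
begin
  // Copy each UTF-8 byte into its own 16-bit character, without decoding.
  SetLength(ws, Length(s));
  for i := 1 to Length(s) do begin
    c := s[i];
    ws[i] := WideChar(Ord(c));
  end;
end;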

If you're using Delphi 2009 or later (which includes the XE series), then you should consider using UnicodeString instead of WideString. The former is a native Delphi type, whereas the latter is more of a wrapper for the Windows BSTR type. Both types exhibit the automatic conversion behavior when assigning to and from AnsiString derivatives like UTF8String, though, so the type you use doesn't affect this answer.


In earlier Delphi versions, the compiler would attempt to decode the string using the system code page (which is never UTF-8). To make it decode the string properly, call Utf8Decode:

ws := Utf8Decode(s);
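
For context, a small self-contained sketch of that call might look like the following. It assumes a Delphi version in which Utf8Decode and Utf8Encode are available in the System unit. The program name and the sample bytes are illustrative only.

```pascal
program Utf8DecodeDemo;
{$APPTYPE CONSOLE}

var
  s: UTF8String;
  ws: WideString;
begin
  // Two raw UTF-8 bytes ($C3 $A9 encode U+00E9), set byte by byte so the
  // example does not rely on string-literal handling in any compiler version.
  SetLength(s, 2);
  s[1] := AnsiChar($C3);
  s[2] := AnsiChar($A9);

  // Explicit decode from UTF-8 to UTF-16, independent of the system code page.
  ws := Utf8Decode(s);
  Writeln(Length(ws));  // number of UTF-16 code units after decoding

  // Encoding the result back to UTF-8 should reproduce the original bytes.
  if Utf8Encode(ws) = s then
    Writeln('round trip matches');
end.
```

The round trip through Utf8Encode is only a sanity check that the bytes really were interpreted as UTF-8 and not as text in the system code page.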
