Delphi: TStringList does not understand BOM?

Does TStringList not understand BOM?

Tf1 := TFileStream.Create(LIGALOG + 'liga.log', fmOpenRead or fmShareDenyNone);

Str := TStringList.Create;
Str.LoadFromStream(Tf1);

String1 := 'FStream ' + IntToStr(Tf1.Size) + '/ String: ' + Str.Text;

If the text file is saved as UTF-8 with a BOM, then Str.Count=0 and Str.Text='' after loading. Without the BOM everything is OK.
Is this normal?

If you're using a version of Delphi prior to 2009, it doesn't support Unicode and the BOM is meaningless to TStringList.

If you're using D2009 or higher (which support Unicode), you can use the overloaded TStringList.LoadFromStream(Stream: TStream; Encoding: TEncoding) if you know the encoding ahead of time; if you don't, the RTL will try to figure it out using TEncoding.GetBufferEncoding. The Delphi XE documentation covers both overloads.
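As a rough sketch of the explicit-encoding overload (D2009 or later), something like the following should work. The file name 'liga.log' in the current directory is an assumption for illustration, and on XE2 and later the units may need to be spelled System.SysUtils and System.Classes. As far as I know, the overload also skips a matching BOM at the start of the stream, but check that against your Delphi version.

```delphi
program LoadWithEncodingDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Stream: TFileStream;
  Lines: TStringList;
begin
  // 'liga.log' is an assumed path, used only for illustration
  Stream := TFileStream.Create('liga.log', fmOpenRead or fmShareDenyNone);
  try
    Lines := TStringList.Create;
    try
      // Pass the encoding explicitly so the RTL does not have to guess
      Lines.LoadFromStream(Stream, TEncoding.UTF8);
      Writeln('Lines read: ', Lines.Count);
    finally
      Lines.Free;
    end;
  finally
    Stream.Free;
  end;
end.
```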

If you don't know the encoding ahead of time, and the RTL isn't able to figure it out from the content, you can always read the BOM yourself from the stream, set Stream.Position to just after the BOM, and load the TStringList from that position using the encoding you determined from the BOM.
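A minimal sketch of that manual approach, handling only the UTF-8 BOM to keep it short. LoadSkippingUtf8Bom is a hypothetical helper name, 'liga.log' is an assumed path, and treating a file without a BOM as UTF-8 is also an assumption you may want to change (for example to TEncoding.Default).

```delphi
program ManualBomDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: loads AFileName into ALines, stepping over a
// UTF-8 BOM ($EF $BB $BF) if the file starts with one.
procedure LoadSkippingUtf8Bom(const AFileName: string; ALines: TStrings);
var
  Stream: TFileStream;
  Bom: array[0..2] of Byte;
  BytesRead: Integer;
begin
  Stream := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyNone);
  try
    BytesRead := Stream.Read(Bom, SizeOf(Bom));
    if (BytesRead = 3) and (Bom[0] = $EF) and (Bom[1] = $BB) and (Bom[2] = $BF) then
      Stream.Position := 3   // continue reading just after the BOM
    else
      Stream.Position := 0;  // no BOM found: rewind to the start
    // Assumption: content is UTF-8 whether or not a BOM was present
    ALines.LoadFromStream(Stream, TEncoding.UTF8);
  finally
    Stream.Free;
  end;
end;

var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    LoadSkippingUtf8Bom('liga.log', Lines);  // assumed path
    Writeln('Lines read: ', Lines.Count);
  finally
    Lines.Free;
  end;
end.
```

This relies on LoadFromStream reading from the current stream position rather than from the start, which is how TStrings behaves in the versions I have looked at.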

Also, creating a TFileStream just to load it into a TStringList is wasteful; TStringList.LoadFromFile handles the file itself, and is a lot less code if that's all you're going to do with the TStream.
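For comparison, the LoadFromFile version might look like this. Again 'liga.log' is an assumed path. One thing to verify for your case: as far as I recall, LoadFromFile opens the file with fmShareDenyWrite rather than the fmShareDenyNone used in the question, so it may fail on a log file that another process currently has open for writing.

```delphi
program LoadFromFileDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // No explicit stream needed; on D2009+ an overload also accepts
    // an encoding, e.g. LoadFromFile('liga.log', TEncoding.UTF8)
    Lines.LoadFromFile('liga.log');  // assumed path
    Writeln('Lines read: ', Lines.Count);
  finally
    Lines.Free;
  end;
end.
```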

EDIT: After your comment, I thought I'd include a list of the BOMs I'm familiar with - there may be more I'm not aware of:

$00 $00 $FE $FF  UTF-32, big-endian (bytes must be swapped for Windows)
$FF $FE $00 $00  UTF-32, little-endian
$FF $FE          UTF-16, little-endian
$FE $FF          UTF-16, big-endian
$EF $BB $BF      UTF-8 (must be decoded before using as Unicode data)
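A BOM list like this can be turned into a small lookup, sketched below. DetectBom is a hypothetical helper, not an RTL function, and the sketch uses $FF $FE $00 $00 as the UTF-32 little-endian BOM. The 4-byte marks are tested first because the UTF-32 little-endian BOM begins with the same two bytes as the UTF-16 little-endian one; note that a UTF-16 little-endian file whose first character is #0 would be misread as UTF-32 by this check.

```delphi
program BomDetectDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: returns the BOM length in bytes (0 if none)
// and a descriptive label in Name.
function DetectBom(const B: array of Byte; out Name: string): Integer;
var
  N: Integer;
begin
  N := Length(B);
  Result := 0;
  Name := 'none';
  // 4-byte BOMs first: UTF-32 LE starts with the UTF-16 LE bytes
  if (N >= 4) and (B[0] = $00) and (B[1] = $00) and (B[2] = $FE) and (B[3] = $FF) then
  begin
    Name := 'UTF-32 big-endian';
    Result := 4;
  end
  else if (N >= 4) and (B[0] = $FF) and (B[1] = $FE) and (B[2] = $00) and (B[3] = $00) then
  begin
    Name := 'UTF-32 little-endian';
    Result := 4;
  end
  else if (N >= 3) and (B[0] = $EF) and (B[1] = $BB) and (B[2] = $BF) then
  begin
    Name := 'UTF-8';
    Result := 3;
  end
  else if (N >= 2) and (B[0] = $FF) and (B[1] = $FE) then
  begin
    Name := 'UTF-16 little-endian';
    Result := 2;
  end
  else if (N >= 2) and (B[0] = $FE) and (B[1] = $FF) then
  begin
    Name := 'UTF-16 big-endian';
    Result := 2;
  end;
end;

var
  BomName: string;
  BomLen: Integer;
begin
  // Sample input: a UTF-8 BOM followed by the letter 'A'
  BomLen := DetectBom([$EF, $BB, $BF, $41], BomName);
  Writeln(BomName, ' (', BomLen, ' bytes)');
end.
```

The returned length is the value you would assign to Stream.Position before loading the rest of the data.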

(For future reference: You should indicate in either the tags or the text of your question which version of Delphi you're using, as there are differences in the VCL and RTL between them. When it comes to things like Unicode/BOM type questions, these differences are extremely important.)
