
Delphi - SysUtils.Trim not deleting last space(?) char

Delphi Rio. I have built an Excel plug-in with Delphi (also using AddIn Express). I iterate through a column to read cell values. After reading each cell value, I call Trim, but Trim does not remove the last space. Code snippet:

acctName := Trim(UpperCase(Acctname));

Before this line runs, AcctName is 'ABC Holdings '. It is the same after the Trim call. It appears that Excel has added some other kind of character there (a new line? a carriage return?). What is the best way to get rid of it? Is there a way I can ask the debugger to show me the hex value for this variable? I have tried the Inspect and Evaluate windows, but both show only text. Note that I have to be careful about simply deleting non-text characters, as some company names contain dashes, commas, apostrophes, etc.

Additional info: based on Andreas's suggestion, I added the following:

ShowMessage(IntToHex(Ord(Acctname[Acctname.Length])));

This comes back with '00A0'. So I thought I could just do a simple StringReplace, and I added this before Andreas's code:

 acctName := StringReplace(acctName, #13, '', [rfReplaceAll]);
 acctName := StringReplace(acctName, #10, '', [rfReplaceAll]);

Yet, it appears that nothing has changed. The ShowMessage still shows '00A0' as the last character. Why isn't the StringReplace removing this?

If you want to know the true identity of the last character of your string, you can display its Unicode codepoint:

ShowMessage(IntToHex(Ord(Acctname[Acctname.Length])));

Or, you can use a utility to investigate the Unicode character on the clipboard, like my own .


Yes, the character in question is U+00A0: NO-BREAK SPACE.

This is like a usual space, but it tells the rendering application not to put a line break at this space. For instance, in Swedish, at least, you want non-breaking spaces in "5 000 kWh".

By default, Trim and TStringHelper.Trim do not remove this kind of whitespace. (They also leave U+2007: FIGURE SPACE and a few other kinds of whitespace.)

The string helper method has an overload which lets you specify the characters to trim. You can use this to include U+00A0:

S.Trim([#$20, #$A0, #$9, #$D, #$A]) // space, nbsp, tab, CR, LF
                                    // (many whitespace characters missing!)
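To make the difference concrete, here is a minimal console sketch, assuming a desktop compiler such as Delphi Rio. The program name and test string are made up for illustration. It also hints at why the StringReplace calls in the question had no effect: they target #13 (CR) and #10 (LF), whereas the offending character is U+00A0. The literal is written with four hex digits (#$00A0) so that it is read as a UTF-16 code unit.

```pascal
program TrimNbspDemo; // hypothetical demo name

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  S: string;
begin
  // 'ABC Holdings' (12 characters) followed by U+00A0, as in the question
  S := 'ABC Holdings'#$00A0;

  // Plain Trim only strips characters up to and including the ASCII space
  Writeln(Length(Trim(S)));                                       // 13

  // The overload strips exactly the characters listed
  Writeln(Length(S.Trim([' ', #$00A0, #9, #13, #10])));           // 12

  // StringReplace works too once it targets the right character,
  // but it would also remove no-break spaces inside the name
  Writeln(Length(StringReplace(S, #$00A0, '', [rfReplaceAll])));  // 12
end.
```

Trimming is probably the safer choice for company names, since it only touches the ends of the string.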

But perhaps an even better solution is to rely on the Unicode characterisation and do

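// Needs System.Character in the uses clause (for Char.IsWhiteSpace)
function RealTrimRight(const S: string): string;
var
  i: Integer;
begin
  // Walk back from the end past every Unicode whitespace character
  i := S.Length;
  while (i > 0) and S[i].IsWhiteSpace do
    Dec(i);
  Result := Copy(S, 1, i);
end;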

Of course, you can implement similar RealTrimLeft and RealTrim functions.
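As a rough sketch of what such a two-sided variant might look like (one possible implementation, assuming 1-based string indexing as in RealTrimRight above; the program name and test string are made up for illustration):

```pascal
program RealTrimDemo; // hypothetical demo name

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Character; // System.Character provides Char.IsWhiteSpace

// Strips Unicode whitespace (including U+00A0) from both ends of S.
function RealTrim(const S: string): string;
var
  First, Last: Integer;
begin
  First := 1;
  Last := Length(S);
  // Skip leading whitespace
  while (First <= Last) and S[First].IsWhiteSpace do
    Inc(First);
  // Skip trailing whitespace
  while (Last >= First) and S[Last].IsWhiteSpace do
    Dec(Last);
  Result := Copy(S, First, Last - First + 1);
end;

begin
  // Leading tab, trailing no-break space
  Writeln('[' + RealTrim(#9'ABC Holdings'#$00A0) + ']'); // prints [ABC Holdings]
end.
```

Interior characters are left untouched, so dashes, commas and apostrophes in company names survive.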


And of course there are many ways to see the actual string bytes in the debugger. In addition to writing things like Ord(S[S.Length]) in the Evaluate/Modify window ( Ctrl+F7 ), my personal favourite method is to use the Memory window ( Ctrl+Alt+E ). When this has focus, you can press Ctrl+G and type S[1] to see the actual bytes:

Screenshot of the Memory window in the RAD Studio IDE, showing a string with its header and data.

Here you see the string 'test string'. Strings have been Unicode (UTF-16) since Delphi 2009, so each character occupies two bytes. For simple ASCII characters, this means that every second byte is null. The ASCII values for our string are 74 65 73 74 20 73 74 72 69 6E 67. You can also see, on the line above (02A0855C), that our string object has reference count 1 and length B (= 11).
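If you would rather inspect the characters from code than from the debugger, a small helper can list every UTF-16 code unit of a string in hex. This is only a sketch; the function name CodeUnitsToHex is invented for this example and is not part of the RTL.

```pascal
program HexDumpDemo; // hypothetical demo name

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: returns the UTF-16 code units of S as
// four-digit hex values separated by spaces.
function CodeUnitsToHex(const S: string): string;
var
  C: Char;
begin
  Result := '';
  for C in S do
    Result := Result + IntToHex(Ord(C), 4) + ' ';
  Result := TrimRight(Result); // drop the trailing separator
end;

begin
  Writeln(CodeUnitsToHex('AB'#$00A0)); // prints 0041 0042 00A0
end.
```

Calling it on the value read from Excel would show the trailing 00A0 at a glance, along with any other unexpected characters.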

As a demo, here is a program that exposes the bytes of a Unicode string to the debugger:

program q63847533;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils;
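type
  array100              = array[0..99] of Byte;
  parray100             = ^array100;
var
  searchResult          : TSearchRec;
  Name                  : string;
  // Overlays a byte-array pointer on the string variable, so that
  // watching display^ shows the raw bytes of the string data
  display               : parray100 absolute Name;
  dummy                 : string;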

begin
  if findfirst('z*.mp3', faAnyFile, searchResult) = 0 then
  begin
    repeat
      writeln('File name = '+searchResult.Name);
      name := searchResult.Name;
      writeln('File size = '+IntToStr(searchResult.Size));
    until FindNext(searchResult) <> 0;

    // Must free up resources used by these successful finds
    FindClose(searchResult);
  end;
  readln(dummy);
end.

My directory contains two z*.mp3 files, one with an ANSI name and the other with a Unicode name.

Adding a watch on display^ and viewing it as Hex or Memory Dump shows what you seem to be asking for (the "Is there a way I can ask the debugger to show me the HEX value for this variable" part of your question).
