
What is the best character for String.Split?

Disclaimer: I KNOW that in 99% of cases you shouldn't "serialize" data in a concatenated string.

What character do you use in this well-known situation:

string str = userId + "-" + userName;

In the majority of cases I have fallen back to | (pipe), but in some cases users type even that. What about "non-typable" characters like ☼ (ALT+9999)?

That depends on too many factors to give a concrete answer.

Firstly, why are you doing this? If you feel the need to store the userId and userName by combining them in this fashion, consider alternative approaches, e.g. CSV-style quoting or similar.

Secondly, under normal circumstances only delimiters that aren't part of the strings should be used. If userId is just a number then "-" is fine... but what if the number could be negative?
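
To make the negative-number problem concrete, here is a small sketch. The class and method names (NaiveJoinDemo, Join, Split) are made up for illustration and are not from the question; it only assumes userId is an int.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class NaiveJoinDemo
{
    // Joins the two values with "-", as in the question.
    public static string Join(int userId, string userName)
    {
        // InvariantCulture guarantees the minus sign is rendered as "-".
        return userId.ToString(CultureInfo.InvariantCulture) + "-" + userName;
    }

    // Splits on every "-", the way a plain String.Split call would.
    public static string[] Split(string joined)
    {
        return joined.Split('-');
    }
}
```

With these helpers, NaiveJoinDemo.Join(-5, "bob") gives "-5-bob", and splitting that yields three parts ("", "5", "bob") instead of two: the sign of the number is indistinguishable from the delimiter.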

Third, it depends on what you plan to do with the string. If it is simply for logging, debugging or some other form of human consumption, then you can relax a bit about it and just choose a delimiter that looks appropriate. If you plan to store data like this, use a delimiter that ensures you can extract the data properly later on, regardless of the values of userId or userName. If you can get away with it, use \0 for example. If either value comes from an untrusted source (i.e. the Internet), then make sure the delimiter can't be used as a character in either string. Generally you would limit the characters that each contains: say, digits for userId, and letters, digits and SOME punctuation characters for userName.
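
As a rough sketch of that last point, the following restricts both values to a whitelist before joining them with \0. Everything here is an assumption made for illustration: the UserRecord class, the Pack and Unpack names, and especially the exact character sets, which you would adapt to your own rules.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class UserRecord
{
    private const char Delimiter = '\0';

    // Assumed rules: digits only for userId; letters, digits and a few
    // punctuation characters for userName. Neither pattern allows '\0'.
    private static readonly Regex IdPattern = new Regex(@"\A[0-9]+\z");
    private static readonly Regex NamePattern = new Regex(@"\A[A-Za-z0-9_.' ]+\z");

    public static string Pack(string userId, string userName)
    {
        if (userId == null || !IdPattern.IsMatch(userId))
            throw new ArgumentException("userId must consist of digits only", "userId");
        if (userName == null || !NamePattern.IsMatch(userName))
            throw new ArgumentException("userName contains a disallowed character", "userName");
        return userId + Delimiter + userName;
    }

    public static string[] Unpack(string packed)
    {
        string[] parts = packed.Split(Delimiter);
        if (parts.Length != 2)
            throw new FormatException("expected exactly two fields");
        return parts;
    }
}
```

Because validation rejects anything outside the whitelist, the delimiter can never appear inside a field, so Unpack(Pack(id, name)) always returns the original pair.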

If it's for data storage and retrieval, there is no way to guarantee that a user won't find a way to inject your delimiter into the string. The safe thing to do is pre-process the input somehow:

  • Let - be the special character
  • If a - is encountered in the input, replace it with something like -0.
  • Use -- as your delimiter

So userId = "alpha-dog" and userName = "papa--0bear" will be translated to

alpha-0dog--papa-0-00bear

The important thing is that your scheme needs to be perfectly undoable, and that the user shouldn't be able to break it, no matter what they enter.

Essentially this is a very primitive version of sanitization.
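
The escaping scheme above could be sketched roughly like this. DashCodec, Encode and Decode are made-up names, and this is only one way to write it, not a definitive implementation.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class DashCodec
{
    // Escapes every "-" as "-0", then joins the fields with "--".
    public static string Encode(params string[] fields)
    {
        var escaped = new string[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            escaped[i] = fields[i].Replace("-", "-0");
        }
        return string.Join("--", escaped);
    }

    // Splits on "--", then turns every "-0" back into "-".
    public static string[] Decode(string encoded)
    {
        string[] parts = encoded.Split(new[] { "--" }, StringSplitOptions.None);
        for (int i = 0; i < parts.Length; i++)
        {
            parts[i] = parts[i].Replace("-0", "-");
        }
        return parts;
    }
}
```

DashCodec.Encode("alpha-dog", "papa--0bear") produces "alpha-0dog--papa-0-00bear", and Decode returns the two original strings. This works because, after escaping, every - inside a field is followed by 0, so -- can only ever be the delimiter.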


 