
C# string.IsNullOrWhiteSpace("\t") == true

I have a line of code:

var delimiter = string.IsNullOrWhiteSpace(foundDelimiter) ? "," : foundDelimiter;

When foundDelimiter is "\t", string.IsNullOrWhiteSpace returns true.

Why? And what is the appropriate way to work around this?

\t is the tab character, which is whitespace. In C# you can do either of these to get a tab:

var tab1 = "\t";
var tab2 = "	"; // a literal tab character typed between the quotes

var areEqual = tab1 == tab2; // true

Edit: As noted by Magus, SO converts my tab character into spaces when the answer is rendered, so the second literal may show up as spaces here. In your IDE you would just type quote, tab, quote.
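To check this on your own machine, here is a minimal sketch. The class name TabCheck is only for illustration; the calls are the standard char.IsWhiteSpace, string.IsNullOrWhiteSpace and string.IsNullOrEmpty.

```csharp
using System;

public static class TabCheck
{
    public static void Main()
    {
        string foundDelimiter = "\t";

        // A tab is a white-space character, so both of these print True.
        Console.WriteLine(char.IsWhiteSpace('\t'));
        Console.WriteLine(string.IsNullOrWhiteSpace(foundDelimiter));

        // The string is not empty, though: it has length 1. Prints False.
        Console.WriteLine(string.IsNullOrEmpty(foundDelimiter));

        // The escape sequence and the code point U+0009 are the same character. Prints True.
        Console.WriteLine("\t" == "\u0009");
    }
}
```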

As far as a workaround goes, I'd suggest you just add a check for tabs in your conditional:

var delimiter = string.IsNullOrWhiteSpace(foundDelimiter) && foundDelimiter != "\t" ? "," : foundDelimiter;
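If the goal is only to fall back to a comma when no delimiter was found at all, another option is string.IsNullOrEmpty, which is true only for null and "". This is a sketch under that assumption about the intent; ResolveDelimiter is a hypothetical helper name, not something from the question.

```csharp
using System;

public static class DelimiterDemo
{
    // Hypothetical helper: falls back to "," only when nothing at all was found.
    public static string ResolveDelimiter(string foundDelimiter)
    {
        return string.IsNullOrEmpty(foundDelimiter) ? "," : foundDelimiter;
    }

    public static void Main()
    {
        Console.WriteLine(ResolveDelimiter(null) == ",");   // True: null falls back
        Console.WriteLine(ResolveDelimiter("") == ",");     // True: empty falls back
        Console.WriteLine(ResolveDelimiter("\t") == "\t");  // True: tab is kept
        Console.WriteLine(ResolveDelimiter(" ") == " ");    // True: a single space is kept too
    }
}
```

Note the last case: unlike the tab-only check, this also lets a space (or any other white-space string) through as a delimiter, which may or may not be what you want.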

Welcome to Unicode.

What did you expect would happen? HT (horizontal tab) has been a white-space character for decades. The "classic" C-language definition of white-space characters consists of these US-ASCII characters:

  • SP : space (0x20, ' ')
  • HT : horizontal tab (0x09, '\t')
  • LF : line feed (0x0A, '\n')
  • VT : vertical tab (0x0B, '\v')
  • FF : form feed (0x0C, '\f')
  • CR : carriage return (0x0D, '\r')

Unicode is a little more... ecumenical in its approach. Its definition of white-space characters is this set:

  • Members of the Unicode category SpaceSeparator:

    • SPACE (U+0020)
    • OGHAM SPACE MARK (U+1680)
    • MONGOLIAN VOWEL SEPARATOR (U+180E) (reclassified as a format character in Unicode 6.3, so newer runtimes may no longer count it as white space)
    • EN QUAD (U+2000)
    • EM QUAD (U+2001)
    • EN SPACE (U+2002)
    • EM SPACE (U+2003)
    • THREE-PER-EM SPACE (U+2004)
    • FOUR-PER-EM SPACE (U+2005)
    • SIX-PER-EM SPACE (U+2006)
    • FIGURE SPACE (U+2007)
    • PUNCTUATION SPACE (U+2008)
    • THIN SPACE (U+2009)
    • HAIR SPACE (U+200A)
    • NARROW NO-BREAK SPACE (U+202F)
    • MEDIUM MATHEMATICAL SPACE (U+205F)
    • IDEOGRAPHIC SPACE (U+3000)
  • Members of the Unicode category LineSeparator, which consists solely of:

    • LINE SEPARATOR (U+2028)
  • Members of the Unicode category ParagraphSeparator, which consists solely of:

    • PARAGRAPH SEPARATOR (U+2029)
  • These Basic Latin/C0 Controls/US-ASCII characters:

    • CHARACTER TABULATION (U+0009)
    • LINE FEED (U+000A)
    • LINE TABULATION (U+000B)
    • FORM FEED (U+000C)
    • CARRIAGE RETURN (U+000D)
  • These C1 Controls and Latin-1 Supplement characters:

    • NEXT LINE (U+0085)
    • NO-BREAK SPACE (U+00A0)
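A quick way to spot-check a few entries from this list is to pass them to char.IsWhiteSpace. The sketch below uses a hand-picked sample (my choice, not an exhaustive enumeration); the class name is only for illustration.

```csharp
using System;

public static class UnicodeWhitespaceDemo
{
    public static void Main()
    {
        // A handful of code points taken from the list above.
        char[] samples =
        {
            '\u0009', // CHARACTER TABULATION
            '\u0020', // SPACE
            '\u0085', // NEXT LINE
            '\u00A0', // NO-BREAK SPACE
            '\u2003', // EM SPACE
            '\u2028', // LINE SEPARATOR
            '\u2029', // PARAGRAPH SEPARATOR
            '\u3000'  // IDEOGRAPHIC SPACE
        };

        foreach (char c in samples)
        {
            // Prints the code point in hex followed by True for each sample.
            Console.WriteLine("U+{0:X4}: {1}", (int)c, char.IsWhiteSpace(c));
        }

        // A string built only from such characters counts as white space too. Prints True.
        Console.WriteLine(string.IsNullOrWhiteSpace("\u00A0\u3000\u2028"));
    }
}
```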

If you don't like the definition, roll your own along these lines (plug in your own character set):

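using System.Text.RegularExpressions;

public static class StringExtensions
{
  // Matches strings made up solely of the six classic C white-space characters (or nothing at all).
  private static readonly Regex rxWS = new Regex( @"^[ \t\n\v\f\r]*$" ) ;

  public static bool IsNullOrCLanguageWhitespace( this string s )
  {
    return s == null || rxWS.IsMatch(s) ;
  }
}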

You might want to add a char analog as well:

public static class CharExtensions
{
  // True only for the six classic C white-space characters.
  public static bool IsCLanguageWhitespace( this char c )
  {
    switch ( c )
    {
    case ' '  :
    case '\t' :
    case '\n' :
    case '\v' :
    case '\f' :
    case '\r' : return true ;
    default   : return false ;
    }
  }
}
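To see where a C-style definition and the built-in one diverge, here is a self-contained sketch. IsNullOrCWhitespace is a hypothetical stand-in for the extension method above, written with LINQ instead of a regex so the block runs on its own.

```csharp
using System;
using System.Linq;

public static class WhitespaceComparison
{
    // The six classic C white-space characters.
    private const string CWhitespace = " \t\n\v\f\r";

    // Hypothetical helper: true for null, empty, or strings made only of C white space.
    public static bool IsNullOrCWhitespace(string s)
    {
        return s == null || s.All(c => CWhitespace.IndexOf(c) >= 0);
    }

    public static void Main()
    {
        string nbsp = "\u00A0"; // NO-BREAK SPACE

        // The built-in check uses the Unicode-aware definition. Prints True.
        Console.WriteLine(string.IsNullOrWhiteSpace(nbsp));

        // The C-style check does not know about U+00A0. Prints False.
        Console.WriteLine(IsNullOrCWhitespace(nbsp));

        // Tabs, spaces and line breaks satisfy both definitions. Prints True.
        Console.WriteLine(IsNullOrCWhitespace("\t \r\n"));
    }
}
```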

