
Regex can't differentiate between float and int types

I have written regexes for recognizing float and int, but they don't seem to work (code below).

{
    string sumstring = "12.098";

    Regex flt = new Regex(@" ^[0-9]*(\.[0-9]*)");
    Regex ent = new Regex("^[0-9]+");

    if (d_type.IsMatch(sumstring))
    {
        Console.WriteLine(sumstring + " " + "dtype");
    }

    Match m = ent.Match(sumstring);

    if (m.Success)
    {
        Console.WriteLine("int");
    }
    else if (flt.IsMatch(sumstring))
    {
        Console.WriteLine("float");
    }
}

Where is the mistake?

First, I don't think regular expressions are really the best tool for this job. I would simply use the Double.TryParse() and Int32.TryParse() functions.

Second, you're missing a whole lot of test cases with your regular expressions:

  • Integer
    • 5 (covered)
    • +5 (not covered)
    • -5 (not covered)
  • Double
    • 5.0 (covered)
    • +5.0 (not covered)
    • -5.0 (not covered)
    • 5.0E5 (not covered)
    • 5.0E+5 (not covered)
    • 5.0E-5 (not covered)
    • +5.0E5 (not covered)
    • +5.0E+5 (not covered)
    • +5.0E-5 (not covered)
    • -5.0E5 (not covered)
    • -5.0E+5 (not covered)
    • -5.0E-5 (not covered)
  • Edge cases
    • 2^32 + 1 (should be recognized as Double even though it looks like Integer)

All of these (except maybe the edge case) would be immediately covered by using the library instead of hand-rolling a regex.
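As a rough illustration of the library approach, here is a minimal sketch. The class and method names (`NumberKind`, `Classify`) are hypothetical, and pinning the culture to `CultureInfo.InvariantCulture` is an assumption made so that the period is always the decimal separator. The integer check runs first because anything that parses as an Int32 also parses as a Double.

```csharp
using System;
using System.Globalization;

public static class NumberKind
{
    // Hypothetical helper: returns "int", "float", or "neither".
    public static string Classify(string text)
    {
        // Try the narrower type first. NumberStyles.Integer allows a
        // leading sign and surrounding whitespace, but no decimal point.
        if (int.TryParse(text, NumberStyles.Integer,
                         CultureInfo.InvariantCulture, out _))
        {
            return "int";
        }

        // NumberStyles.Float additionally allows a decimal point and an
        // exponent such as E+5 or E-5.
        if (double.TryParse(text, NumberStyles.Float,
                            CultureInfo.InvariantCulture, out _))
        {
            return "float";
        }

        return "neither";
    }
}
```

With this sketch, `Classify("12.098")` and `Classify("-5.0E-5")` give "float", and `Classify("+5")` gives "int". A digit string too large for Int32, such as "4294967297" (2^32 + 1), fails the first check and falls through to "float". Note that both number styles also accept leading and trailing whitespace, which a strict validator may not want.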

You're trying your tests in the wrong order: switch them, or (*) put a $ at the end of your RE patterns to ensure they match all the way to the end.

(*) Which one depends on exactly what you're trying to do: match strings that start with the representation of an integer or float, or only strings that are entirely composed of such a representation?

The "ent" regex should be anchored: Regex ent = new Regex("^[0-9]+$");

You were matching just the leading digits...
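The effect of the missing end anchor can be seen in a small sketch. The names `AnchorDemo`, `PrefixInt`, `WholeInt`, and `WholeFloat` are hypothetical, and the float pattern is an illustrative assumption (digits, a period, digits), not the asker's exact pattern.

```csharp
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // No end anchor: succeeds as soon as the string starts with digits.
    public static readonly Regex PrefixInt = new Regex("^[0-9]+");

    // Anchored at both ends: the whole string must consist of digits.
    public static readonly Regex WholeInt = new Regex("^[0-9]+$");

    // Illustrative anchored float pattern: digits, a period, digits.
    public static readonly Regex WholeFloat = new Regex(@"^[0-9]+\.[0-9]+$");

    public static string Classify(string text)
    {
        if (WholeInt.IsMatch(text)) return "int";
        if (WholeFloat.IsMatch(text)) return "float";
        return "neither";
    }
}
```

Here `PrefixInt.Match("12.098")` succeeds and its `Value` is "12", which is why the original code printed "int". `WholeInt` rejects the same input, so `Classify("12.098")` gives "float" while `Classify("12")` gives "int". In .NET, `$` also matches just before a final newline, so `\z` may be a better end anchor when the match has to be strict.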

I don't know how compatible C#'s regular expressions are with Perl's, but I try not to reinvent the wheel unless it needs reinventing:

% perl -e 'use Regexp::Common; print $RE{num}{real}, "\n"'
(?:(?i)(?:[+-]?)(?:(?=[0123456789]|[.])(?:[0123456789]*)(?:(?:[.])(?:[0123456789]{0,}))?)(?:(?:[E])(?:(?:[+-]?)(?:[0123456789]+))|))

Now, I don't get why they didn't use [0-9], but this works well.
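The constructs in that pattern (non-capturing groups, a lookahead, character classes) are also available in .NET regular expressions. The following is a hand-simplified port, not something taken from Regexp::Common itself: the `^` and `$` anchors are added, `[0123456789]` is written as `[0-9]`, and the case-insensitive `[E]` becomes `[Ee]`. The names `RealNumber` and `IsReal` are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class RealNumber
{
    // Optional sign, then digits with an optional fractional part, then an
    // optional exponent. The lookahead requires that a digit or a period
    // follows the sign, so the empty string is rejected.
    private static readonly Regex Real = new Regex(
        @"^[+-]?(?=[0-9]|\.)[0-9]*(?:\.[0-9]*)?(?:[Ee][+-]?[0-9]+)?$");

    public static bool IsReal(string text)
    {
        return Real.IsMatch(text);
    }
}
```

This accepts inputs such as "5", "+5.0", ".5", "5." and "-5.0E-5", and rejects "abc", "5.0E" and the empty string. Like the Perl original, it also accepts a lone ".", so an extra check may be needed if that should not count as a number.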

The regex has to match the entire string. "^\d*\.\d*$" would match. Alternatively, you can just search for a period in the string.
