Determine the number of characters used by double.Parse

This is a straightforward question, but I couldn't find any function that solves it. I need a way to determine how many characters were used to parse a double from a string.

I want to take the remainder of the string and determine which measurement unit it denotes with a simple lookup in a table of symbol strings.


Update

I've awarded the answer to Olivier Jacot-Descombes, as he had the fullest Regex and beat me to the punch with my own answer of how I'd use Regex. The only flaw I see in this answer is that it does not account for the comma and dot swapping places in different cultures (which I did account for in my answer, although it looks kind of messy).

However, the actual solution I'll be implementing will not use Regex. I've still awarded the answer because, essentially, I was asking the wrong question. I think the Regex answer is the best solution for the question that I asked.

The solution I've come up with is to iterate over the available units and compare each one to the input using inputStr.EndsWith(unitStr). When I get a positive match, I immediately know how long the number is by subtracting the length of the unit string from the length of the input string, and I can then call double.Parse() on what's left (after a trim).
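The EndsWith approach described above could look roughly like the following. This is only a sketch: the class name UnitSuffixParser and the contents of the Units table are made up for illustration, and the table is ordered longest symbol first so that "mm" is tried before "m".

```csharp
using System;
using System.Globalization;

public static class UnitSuffixParser
{
    // Hypothetical unit table; longer symbols come first so "mm" wins over "m".
    private static readonly string[] Units = { "mm", "cm", "km", "kg", "m", "g" };

    public static (double Value, string Unit) Parse(string input, IFormatProvider provider)
    {
        string trimmed = input.Trim();
        foreach (string unit in Units)
        {
            if (trimmed.EndsWith(unit, StringComparison.Ordinal))
            {
                // Everything in front of the unit symbol is the number.
                string numberPart = trimmed.Substring(0, trimmed.Length - unit.Length).Trim();
                double value = double.Parse(
                    numberPart,
                    NumberStyles.Float | NumberStyles.AllowThousands,
                    provider);
                return (value, unit);
            }
        }
        throw new FormatException("No known unit symbol found at the end of the input.");
    }
}
```

For example, UnitSuffixParser.Parse("12.5 mm", CultureInfo.InvariantCulture) yields the value 12.5 and the unit "mm". The string is only parsed once, at the cost of one EndsWith call per unit in the table.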

You can have Regex return the matches, so that you don't need two passes.

var parseNumUnit = new Regex(
 @"(?<num>(\+|-)?([0-9,]+(\.)?[0-9]*|[0-9,]*(\.)?[0-9]+)((e|E)(\+|-)?[0-9]+)?)\s*(?<unit>[a-zA-Z]*)"
);

Match match = parseNumUnit.Match("+13.234e-3m");
string number = match.Groups["num"].Value; // "+13.234e-3" 
string unit = match.Groups["unit"].Value; // "m"

Here, (?<name>expression) captures the expression in a group named "name".

My regex for numbers is quite complex and allows numbers like "+13.234e-3", "12.34", ".25", "10." or "23,503.14". If your numbers have a simpler format, you can simplify the regex.
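The captured "num" text still has to be converted to a double. A minimal sketch of that step follows; the class name NumUnitRegexParser and the leading ^\s* anchor are additions for illustration, and CultureInfo.InvariantCulture is assumed because the pattern hard-codes "." as the decimal point and "," as the group separator.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class NumUnitRegexParser
{
    // Same pattern as above, created once and anchored at the start of the input.
    private static readonly Regex ParseNumUnit = new Regex(
        @"^\s*(?<num>(\+|-)?([0-9,]+(\.)?[0-9]*|[0-9,]*(\.)?[0-9]+)((e|E)(\+|-)?[0-9]+)?)\s*(?<unit>[a-zA-Z]*)");

    public static bool TryParse(string input, out double value, out string unit)
    {
        value = 0;
        unit = string.Empty;

        Match match = ParseNumUnit.Match(input);
        if (!match.Success)
            return false;

        unit = match.Groups["unit"].Value;

        // The regex only isolates the number; double.TryParse does the conversion.
        return double.TryParse(
            match.Groups["num"].Value,
            NumberStyles.Float | NumberStyles.AllowThousands,
            CultureInfo.InvariantCulture,
            out value);
    }
}
```

The length of the number is also available as match.Groups["num"].Length if only the character count is needed.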

I suggest you use a Regex, like this:

(?<double>[\d.]+)(?<unit>.*)

When matched, it creates two groups, 'double' and 'unit', containing the double value and the unit.

Example:

1.25632 meter

Here the group 'double' will contain '1.25632' and the group 'unit' will contain ' meter' (including the leading space, so trim it before looking it up).

My current solution is to use Regex to recognize the floating point value and then take the length of the match to know where the unit starts.

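    // Requires: using System; using System.Text.RegularExpressions; using System.Threading;

    public static (double Value, string unit) Parse(string value)
    {
        var result = RegexParseDouble.Match(value);
        if(result.Success)
        {
            // The match covers the number; everything after it is the unit.
            return (double.Parse(result.Value), value.Substring(result.Length));
        }
        throw new FormatException("Value cannot be parsed as a floating point number.");
    }

    private static Regex RegexParseDouble
    {
        // Rebuilt on each access so that it follows the current culture.
        // The separators are escaped because "." would otherwise match any character.
        get => new Regex(
            @"^[-+]?(\d+" +
            Regex.Escape(Thread.CurrentThread.CurrentCulture.NumberFormat.NumberGroupSeparator) +
            @"\d+)*\d*(" +
            Regex.Escape(Thread.CurrentThread.CurrentCulture.NumberFormat.NumberDecimalSeparator) +
            @")?\d+([eE][-+]?\d+)?");
    }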

Ideally, I'd rather not have to parse the string myself and then have .NET parse it again to provide the value.

A simple option that does not involve regular expressions:

// Requires: using System; using System.Linq;
var input = "42,666 towels";

// Take the leading digits and separators (',' or '.', depending on culture):
var numericChars = input
                    .TakeWhile(c => c == ',' || c == '.' || Char.IsDigit(c))
                    .ToArray();

// Use the chars to init a new string, which can be parsed to a number: 
var nr = Double.Parse(new String(numericChars));

// ...and the remaining part of the original string is the unit:
// (Note that we use Trim() to remove any whitespace between the number and the unit)
var unit = input.Substring(numericChars.Length).Trim();

// Outputs (in a culture where ',' is the decimal separator): Nr is 42,666, unit is towels.
Console.WriteLine($"Nr is {nr}, unit is {unit}.");

Update

In response to the comment below, here's an expansion. I'll admit this ruins some of the elegant simplicity above, but at least it's readable, configurable (expandable), and it works:

var nrFormat = System.Globalization.CultureInfo.CurrentCulture.NumberFormat;

// Remove or add strings to this list as needed:
var validStrings = 
    new List<string>{ 
                    nrFormat.NaNSymbol, 
                    nrFormat.NegativeSign, 
                    nrFormat.NumberDecimalSeparator, 
                    nrFormat.NumberGroupSeparator, 
                    nrFormat.PercentSymbol, 
                    nrFormat.PerMilleSymbol, 
                    nrFormat.PositiveInfinitySymbol, 
                    nrFormat.PositiveSign
                };

validStrings.AddRange(nrFormat.NativeDigits);
validStrings.Add("^");
validStrings.Add("e");
validStrings.Add("E");
validStrings.Add(" ");


// You can use more complex numbers, like: 
var input = "-42,666e-3 Towels";

// Get all numbers or separators (',' or '.', depending on language):
var numericChars = input.TakeWhile(c => validStrings.Contains("" + c)).ToArray();

// Use the chars to init a new string, which can be parsed to a number: 
var nr = Double.Parse(new String(numericChars));

// ...and the remaining part of the original string is the unit:
// (Note that we use Trim() to remove any whitespace between the number and the unit)
var unit = input.Substring(numericChars.Count()).Trim();

// Output is now (in a culture where ',' is the decimal separator): "Nr is -0,042666, unit is Towels."
Console.WriteLine($"Nr is {nr}, unit is {unit}.");

As you can see, the input can be a lot more complex now; you can even use something like var input = "∞ Garden Gnomes";, which will produce the wonderful output:

Nr is ∞, unit is Garden Gnomes.
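One caveat with a list of allowed characters is that "e", "E" and " " count as numeric, so a unit such as "eggs" would be partly swallowed before parsing. A possible alternative, sketched here under the assumption that inputs are short, is to shrink the candidate prefix until double.TryParse accepts it. The class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

public static class LongestPrefixParser
{
    // Returns the length of the longest leading substring that double.TryParse
    // accepts for the given provider, or 0 if no prefix parses.
    public static int NumericPrefixLength(string input, IFormatProvider provider, out double value)
    {
        for (int length = input.Length; length > 0; length--)
        {
            bool parsed = double.TryParse(
                input.Substring(0, length),
                NumberStyles.Float | NumberStyles.AllowThousands,
                provider,
                out value);
            if (parsed)
                return length;
        }

        value = 0;
        return 0;
    }
}
```

The unit is then input.Substring(length).Trim(). The loop calls TryParse once per candidate length, so it is quadratic in the worst case, which should be acceptable for strings like "42.5 kg" but not for long inputs.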

Here's a non-Regex solution that occurred to me. If you can guarantee that your input will always be in the format number-space-unit, then you can simply do the following:

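public static (double Value, string unit) Parse(string value)
{
    // Split(' ') takes a char, which also compiles on .NET Framework.
    var values = value.Split(' ');

    // Guard against input without a unit part.
    if (values.Length < 2)
        throw new FormatException("Value must be in the format number-space-unit.");

    if (!double.TryParse(values[0], out double number))
        throw new FormatException("Value cannot be parsed as a floating point number.");

    string unit = values[1];

    return (number, unit);
}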

If your input string has some other, but consistent, format, you can do something similar to match that format.
