
Determine the number of characters used by double.Parse

This is a straightforward question, but I couldn't find any function that solves it. I need a way to determine how many characters were used to parse a double from a string.

I want to take the remainder of the string and use it to determine what measurement unit it is by doing a simple lookup in a table of symbol strings.


Update

I've awarded the answer to Olivier Jacot-Descombes, as he had the fullest Regex and beat me to the punch with my own answer of how I'd use Regex. The only flaw I see in that answer is that it does not account for the comma and dot swapping places in different cultures (which I did take into account in my answer, although it looks kind of messy).

However, the actual solution I'll be implementing will not use Regex. I've still awarded the answer because I was essentially asking the wrong question, and I think the Regex answer is the best solution for the question that I asked.

The solution I've come up with is to iterate over the available units and compare each one to the string using inputStr.EndsWith(unitStr). When I get a match, I immediately know how long the number is: the length of the test string minus the length of the unit string. I can then call double.Parse() on what's left (after a trim).
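The suffix-matching idea described above could look roughly like the following. This is only a sketch: the UnitSuffixParser class, the Units table and the choice to try longer symbols first are assumptions made for illustration, not part of the original post.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class UnitSuffixParser
{
    // Hypothetical unit table; the real symbols would come from the
    // lookup table mentioned in the question.
    private static readonly string[] Units = { "mm", "cm", "km", "m", "kg", "g" };

    public static (double Value, string Unit) Parse(string input, IFormatProvider provider)
    {
        string trimmed = input.Trim();

        // Try longer symbols first so that "km" is not mistaken for "m".
        foreach (string unit in Units.OrderByDescending(u => u.Length))
        {
            if (!trimmed.EndsWith(unit, StringComparison.Ordinal))
                continue;

            // Everything in front of the unit is the candidate number.
            string numberPart = trimmed.Substring(0, trimmed.Length - unit.Length).Trim();
            if (double.TryParse(numberPart,
                                NumberStyles.Float | NumberStyles.AllowThousands,
                                provider,
                                out double value))
            {
                return (value, unit);
            }
        }

        throw new FormatException("No known unit suffix with a parsable number in front of it.");
    }
}
```

Calling UnitSuffixParser.Parse("12.5 km", CultureInfo.InvariantCulture) would be expected to yield the value 12.5 and the unit "km". Passing the culture explicitly keeps the decimal separator handling out of the matching logic.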

You can have Regex return the matches, so that you don't need two passes.

var parseNumUnit = new Regex(
 @"(?<num>(\+|-)?([0-9,]+(\.)?[0-9]*|[0-9,]*(\.)?[0-9]+)((e|E)(\+|-)?[0-9]+)?)\s*(?<unit>[a-zA-Z]*)"
);

Match match = parseNumUnit.Match("+13.234e-3m");
string number = match.Groups["num"].Value; // "+13.234e-3" 
string unit = match.Groups["unit"].Value; // "m"

Here, (?<name>expression) captures the expression in a group named "name".

My regex for numbers is quite complex and allows numbers like "+13.234e-3", "12.34", ".25", "10." or "23,503.14". If your numbers have a simpler format, you can simplify the regex.
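To get back to the original question (how many characters the number occupies), the "num" group also exposes its position and length. The sketch below reuses the regex from this answer; the NumUnitDemo class, the CharsUsed name and the use of the invariant culture are assumptions for illustration. Note that the pattern is not anchored, so it matches the first number it finds anywhere in the input.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class NumUnitDemo
{
    // Same pattern as in the answer above.
    private static readonly Regex ParseNumUnit = new Regex(
        @"(?<num>(\+|-)?([0-9,]+(\.)?[0-9]*|[0-9,]*(\.)?[0-9]+)((e|E)(\+|-)?[0-9]+)?)\s*(?<unit>[a-zA-Z]*)");

    public static (double Value, int CharsUsed, string Unit) Parse(string input)
    {
        Match match = ParseNumUnit.Match(input);
        if (!match.Success)
            throw new FormatException("No number found.");

        Group num = match.Groups["num"];

        // The regex hard-codes '.' as decimal point and ',' as group separator,
        // so the invariant culture is assumed when converting the text.
        double value = double.Parse(
            num.Value,
            NumberStyles.Float | NumberStyles.AllowThousands,
            CultureInfo.InvariantCulture);

        // num.Index + num.Length is the position in the input where the number ends.
        return (value, num.Index + num.Length, match.Groups["unit"].Value);
    }
}
```

For the input "23,503.14 kg" this should report 9 characters used for the number and "kg" as the unit.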

I suggest you use a RegEx, like this:

(?<double>[\d.]+)(?<unit>.*)

It will create two groups when matched, 'double' and 'unit', containing the double value and the unit.

Example:

1.25632 meter

Here the group double will contain '1.25632' and the group unit will contain ' meter' (including the leading space, which you can remove with Trim()).
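As a rough illustration of how this pattern might be used from C#, here is a small sketch. The SimpleSplit class and method names are made up for the example, and the Trim() call is an addition to strip the whitespace that the unit group captures.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimpleSplit
{
    public static (string Number, string Unit) Split(string input)
    {
        // Pattern taken from the answer above: digits and dots, then everything else.
        Match m = Regex.Match(input, @"(?<double>[\d.]+)(?<unit>.*)");
        if (!m.Success)
            throw new FormatException("No number found.");

        // The unit group captures the separating whitespace too, so trim it.
        return (m.Groups["double"].Value, m.Groups["unit"].Value.Trim());
    }
}
```

The number is returned as text here; it would still have to go through double.Parse, and the pattern does not handle signs, exponents or a comma as decimal separator.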

My current solution is to use Regex to interpret the floating point value and then retrieve the length to know where the unit starts.

    public static (double Value, string unit) Parse(string value)
    {
        var result = RegexParseDouble.Match(value);
        if(result.Success)
        {
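            // The match is anchored at the start, so it covers the number;
            // everything after it is the unit.
            return (double.Parse(value.Substring(0, result.Length)), value.Substring(result.Length));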
        }
        throw new FormatException("Value cannot be parsed as a floating point number.");
    }

    private static Regex RegexParseDouble
    {
        // Separators are escaped because "." is a regex metacharacter.
        get => new Regex(
            @"^[-+]?(\d+" +
            Regex.Escape(Thread.CurrentThread.CurrentCulture.NumberFormat.NumberGroupSeparator) +
            @"\d+)*\d*(" +
            Regex.Escape(Thread.CurrentThread.CurrentCulture.NumberFormat.NumberDecimalSeparator) +
            @")?\d+([eE][-+]?\d+)?");
    }

Ideally I'd rather not have to parse the string myself, and then also have .NET parse the string again to provide the value.

A simple option that does not involve regular expressions:

var input = "42,666 towels";

// Get a char[] of all numbers or separators (',' or '.', depending on language):
var numericChars = input
                    .TakeWhile(c => c == ',' || c == '.' || Char.IsNumber(c))
                    .ToArray();

// Use the chars to init a new string, which can be parsed to a number: 
var nr = Double.Parse(new String(numericChars));

// ...and the remaining part of the original string is the unit:
// (Note that we use Trim() to remove any whitespace between the number and the unit)
var unit = input.Substring(numericChars.Count()).Trim();

// Outputs (in a culture where ',' is the decimal separator): Nr is 42,666, unit is towels.
Console.WriteLine($"Nr is {nr}, unit is {unit}.");

Update

In response to the comment below, here's an expansion. I'll admit this ruins some of the elegant simplicity above, but at least it's readable, configurable (expandable), and it works:

var nrFormat = System.Globalization.CultureInfo.CurrentCulture.NumberFormat;

// Remove or add strings to this list as needed:
var validStrings = 
    new List<string>{ 
                    nrFormat.NaNSymbol, 
                    nrFormat.NegativeSign, 
                    nrFormat.NumberDecimalSeparator, 
                    nrFormat.PercentGroupSeparator, 
                    nrFormat.PercentSymbol, 
                    nrFormat.PerMilleSymbol, 
                    nrFormat.PositiveInfinitySymbol, 
                    nrFormat.PositiveSign
                };

validStrings.AddRange(nrFormat.NativeDigits);
validStrings.Add("^");
validStrings.Add("e");
validStrings.Add("E");
validStrings.Add(" ");


// You can use more complex numbers, like: 
var input = "-42,666e-3 Towels";

// Get all numbers or separators (',' or '.', depending on language):
var numericChars = input.TakeWhile(c => validStrings.Contains("" + c)).ToArray();

// Use the chars to init a new string, which can be parsed to a number: 
var nr = Double.Parse(new String(numericChars));

// ...and the remaining part of the original string is the unit:
// (Note that we use Trim() to remove any whitespace between the number and the unit)
var unit = input.Substring(numericChars.Count()).Trim();

// Output is now: "Nr is -0,042666, unit is Towels."
Console.WriteLine($"Nr is {nr}, unit is {unit}.");

As you can see, the input can be a lot more complex now; you can even use something like var input = "∞ Garden Gnomes";, which will produce the wonderful output:

Nr is ∞, unit is Garden Gnomes.

Here's a non-Regex solution that occurred to me. If you can guarantee that your input will always be in the format number-space-unit, then you can simply do the following:

public static (double Value, string unit) Parse(string value)
{
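    // The char overload of Split compiles on both .NET Framework and .NET Core.
    var values = value.Split(' ');
    if (values.Length != 2)
        throw new FormatException("Expected input in the form \"<number> <unit>\".");

    if (!double.TryParse(values[0], out double number))
        throw new FormatException("Value cannot be parsed as a floating point number.");

    string unit = values[1];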

    return (number, unit);
}

If your input string format is something else but consistent, you can do something similar to this to match that format.


 