Extract each occurrence of a textual amount separately using C# regex

I have some paragraphs that contain textual amounts multiple times (e.g. "Insurance with limits of not less than Five Hundred Thousand Dollars per person and One Million Dollars ) per occurrence insuring against all liability").

I have the following code, which converts a textual amount to a number (e.g. "one thousand five hundred" = 1500). It works fine when the text contains only one amount, but when there is more than one textual amount in the text, the numbers returned are not correct.

 private static Dictionary<string, long> numberTable = new Dictionary<string, long>
        { {"zero",0},{"one",1},{"two",2},{"three",3},{"four",4},
        {"five",5},{"six",6},{"seven",7},{"eight",8},{"nine",9},
        {"ten",10},{"eleven",11},{"twelve",12},{"thirteen",13},
        {"fourteen",14},{"fifteen",15},{"sixteen",16},
        {"seventeen",17},{"eighteen",18},{"nineteen",19},{"twenty",20},
        {"thirty",30},{"forty",40},{"fifty",50},{"sixty",60},
        {"seventy",70},{"eighty",80},{"ninety",90},{"hundred",100},
        {"thousand",1000},{"million",1000000},{"billion",1000000000},
        {"trillion",1000000000000},{"quadrillion",1000000000000000},
        {"quintillion",1000000000000000000}};


// Body of a method that takes a string numberString and returns a string.
var numbers = Regex.Matches(numberString, @"\w+").Cast<Match>()
    .Select(m => m.Value.ToLowerInvariant())
    .Where(v => numberTable.ContainsKey(v))
    .Select(v => numberTable[v]);

// acc and total are never reset between amounts,
// so separate amounts in the same text end up summed together.
long acc = 0, total = 0L;
foreach (var n in numbers)
{
    if (n >= 1000)
    {
        total += (acc * n);
        acc = 0;
    }
    else if (n >= 100)
    {
        acc *= n;
    }
    else acc += n;
}
string a = Convert.ToString((total + acc) * (numberString.StartsWith("minus", StringComparison.InvariantCultureIgnoreCase) ? -1 : 1));
return a;

Can anyone help or suggest how to solve this issue?

If I input only the text "Insurance with limits of not less than Five Hundred Thousand Dollars", the output is correct, that is, 500,000.

But if I input the text "Insurance with limits of not less than Five Hundred Thousand Dollars per person and One Million Dollars ) per occurrence insuring against all liability",

then the answer I get is 1500000, but I need the amounts separately, like 500,000 and 1,000,000.

Note: I have also considered finding every amount that ends with "Dollars" and converting them one by one, but I don't think that would be a good choice. I am open to any kind of discussion. Thanks.
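The idea in the note above could look roughly like the sketch below. It only finds the phrases; each phrase would still have to go through the existing conversion code. The class name DollarPhraseFinder, the method FindDollarPhrases and the shortened word list are assumptions made for illustration, and the pattern does not handle a connecting "and" inside an amount.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DollarPhraseFinder
{
    // Shortened word list (assumption): extend it to match numberTable.
    private static readonly string[] NumberWords =
    {
        "zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen", "twenty",
        "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety",
        "hundred", "thousand", "million", "billion"
    };

    // Hypothetical helper: returns every run of number words that is
    // immediately followed by the word "dollars".
    public static List<string> FindDollarPhrases(string text)
    {
        // One number word, e.g. (?:zero|one|two|...).
        string word = "(?:" + string.Join("|", NumberWords) + ")";

        // One or more number words separated by spaces or hyphens,
        // captured in group 1, then the literal word "dollars".
        string pattern = @"\b((?:" + word + @"\b[\s-]*)+)dollars\b";

        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value.Trim())
            .ToList();
    }
}
```

Calling DollarPhraseFinder.FindDollarPhrases on the second sample text should give two separate phrases, "Five Hundred Thousand" and "One Million", which can then be converted one at a time.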

Change your code to the code below:

Below, we check whether each number word is a continuation of the current number (e.g. 5, 100, 1000 in "Five Hundred Thousand Dollars"). If it is, we combine it (by adding or multiplying, depending on the word) into the value for the current element of the list (sample); otherwise we start a new element of the list and repeat the process.

Example: if we remove all spaces from "Five Hundred Thousand Dollars" we get "FiveHundredThousandDollars". Here the distance between the "e" of "Five" and the "H" of "Hundred" is exactly 1, so we can conclude that both words belong to the same number.

    public static List<long> ToLong(string numberString)
    {
        // Number words in order of appearance, e.g. "five", "hundred", "thousand".
        var words = Regex.Matches(numberString, @"\w+").Cast<Match>()
             .Select(m => m.Value.ToLowerInvariant())
             .Where(v => numberTable.ContainsKey(v))
             .ToList();

        // Strip the spaces once so that consecutive words become adjacent.
        string compact = numberString.Replace(" ", "").ToLowerInvariant();

        long acc = 0, total = 0L;
        List<long> sample = new List<long>();
        int prevIndex = 0;
        string prevKey = "";

        for (int i = 0; i < words.Count; i++)
        {
            string currKey = words[i];
            long n = numberTable[currKey];

            // Search from the end of the previous word so that a repeated word
            // is not matched at its first occurrence again.
            int currIndex = compact.IndexOf(currKey, prevIndex + prevKey.Length, StringComparison.Ordinal);

            // The word belongs to a different number unless it directly follows the previous word.
            bool isDiffNumber = i > 0 && currIndex != prevIndex + prevKey.Length;

            if (isDiffNumber)
            {
                // Store the finished number and start a new one with the current word.
                sample.Add(total + acc);
                total = 0;
                acc = 0;
            }

            if (n >= 1000)
            {
                total += (acc * n);
                acc = 0;
            }
            else if (n >= 100)
            {
                acc *= n;
            }
            else
                acc += n;

            prevIndex = currIndex;
            prevKey = currKey;
        }

        // Store the last number, if any number word was found.
        if (words.Count > 0)
            sample.Add(total + acc);

        return sample;
    }

Note: this solution only works for input like the sample above. If the text puts an "and" inside an amount (e.g. Five Hundred "and" ...), the amount is split at that word and the result is wrong.
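As an alternative that avoids searching the string by character position, here is a minimal sketch that walks the words in order and closes the current amount as soon as a non-number word (such as "Dollars") appears. The names AmountExtractor and Extract are made up for this example, the table stops at "billion" for brevity, and the arithmetic is the same as in the question. Like the code above, it does not treat "and" as part of an amount.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AmountExtractor
{
    // Same idea as numberTable, shortened (assumption) and case-insensitive.
    private static readonly Dictionary<string, long> Words =
        new Dictionary<string, long>(StringComparer.OrdinalIgnoreCase)
    {
        {"zero",0},{"one",1},{"two",2},{"three",3},{"four",4},
        {"five",5},{"six",6},{"seven",7},{"eight",8},{"nine",9},
        {"ten",10},{"eleven",11},{"twelve",12},{"thirteen",13},
        {"fourteen",14},{"fifteen",15},{"sixteen",16},
        {"seventeen",17},{"eighteen",18},{"nineteen",19},{"twenty",20},
        {"thirty",30},{"forty",40},{"fifty",50},{"sixty",60},
        {"seventy",70},{"eighty",80},{"ninety",90},{"hundred",100},
        {"thousand",1000},{"million",1000000},{"billion",1000000000}
    };

    // Hypothetical helper: returns one value per run of consecutive number words.
    public static List<long> Extract(string text)
    {
        var results = new List<long>();
        long acc = 0, total = 0;
        bool inNumber = false;

        // Visit every alphabetic word in order of appearance.
        foreach (Match m in Regex.Matches(text, "[A-Za-z]+"))
        {
            if (Words.TryGetValue(m.Value, out long n))
            {
                // Number word: same accumulation rules as in the question.
                inNumber = true;
                if (n >= 1000) { total += acc * n; acc = 0; }
                else if (n >= 100) { acc *= n; }
                else { acc += n; }
            }
            else if (inNumber)
            {
                // Any other word ends the current amount.
                results.Add(total + acc);
                acc = 0;
                total = 0;
                inNumber = false;
            }
        }

        // The text may end in the middle of an amount.
        if (inNumber) results.Add(total + acc);
        return results;
    }
}
```

For the second sample text in the question, AmountExtractor.Extract should return 500000 and 1000000 as two separate entries. Because only the word order matters here, a repeated word or a number word hidden inside another word does not affect the result.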
