C#: Extract number out of string, then change comma(,) to dot(.)

I'm using Visual Web Ripper to extract names and prices of products on a website.

When I extract the price from a table, it comes in a form like this:

Kr. 129,30

I need to extract the 129,30 and then turn the comma into a dot (129.30).

Visual Web Ripper can use scripts to modify the extracted content. It can use standard Regex, C# and VB.NET.

In the Regex tab I have found that

(\d+.)?(\d+)(.\d+)?

gives me 129,30, but then I can't change the comma into a dot.

Therefore I have to use C#. It comes with this standard script:

using System;
using VisualWebRipper.Internal.SimpleHtmlParser;
using VisualWebRipper;
public class Script
{
    //See help for a definition of WrContentTransformationArguments.
    public static string TransformContent(WrContentTransformationArguments args)
    {
        try
        {
            //Place your transformation code here.
            //This example just returns the input data
            return args.Content;
        }
        catch(Exception exp)
        {
            //Place error handling here
            args.WriteDebug("Custom script error: " + exp.Message);
            return "Custom script error";
        }
    }
}

How do I modify it to extract the number and then replace the comma with a dot?

String.Replace is an option: text.Replace(",", ".").
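As a rough illustration of the String.Replace route, the sketch below first isolates the number with a regular expression and then swaps the comma for a dot. The class and method names (PriceText, ExtractWithDot) are made up for this example and are not part of Visual Web Ripper. It assumes the price has no thousands separator.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PriceText
{
    // Hypothetical helper: pulls the first number such as "129,30" out of the
    // input and replaces the decimal comma with a dot.
    // Assumption: the price contains no thousands separator.
    public static string ExtractWithDot(string content)
    {
        // One or more digits, optionally followed by a comma and more digits.
        Match match = Regex.Match(content, @"\d+(,\d+)?");
        if (!match.Success)
        {
            // Nothing numeric found; hand the input back unchanged.
            return content;
        }
        return match.Value.Replace(",", ".");
    }
}
```

Inside the standard script, the line `return args.Content;` could then become `return PriceText.ExtractWithDot(args.Content);`, provided the helper class is pasted into the same script file.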

It would be better to parse the number properly with the correct CultureInfo and then format it back with InvariantCulture.

This is presumably krona, so we can use the Swedish culture info to parse it. First we start with the input:

var original = "Kr. 129,30";

Get the culture:

using System.Globalization;
var culture = CultureInfo.GetCultureInfo("sv-SE");

This culture expects the currency string to be kr (case insensitive), but we have Kr. instead. So let's update it:

var format = (NumberFormatInfo)culture.NumberFormat.Clone();
format.CurrencySymbol = "Kr.";

And now the culture aware parse:

var number = Decimal.Parse(original, NumberStyles.Currency, format);

Now number contains a decimal that has been parsed correctly.
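To finish the job, the parsed decimal still has to be written back out with a dot. The following is a minimal sketch that combines the steps above and formats the result with InvariantCulture. The class and method names (PriceParser, ToInvariantPrice) and the "0.00" format string are assumptions chosen for this example. It also assumes the runtime has culture data for sv-SE available.

```csharp
using System;
using System.Globalization;

public static class PriceParser
{
    // Hypothetical helper: parses a price such as "Kr. 129,30" using Swedish
    // number conventions and returns it with a dot as the decimal separator.
    public static string ToInvariantPrice(string original)
    {
        var culture = CultureInfo.GetCultureInfo("sv-SE");

        // Clone the culture's number format so the currency symbol can be
        // changed (the shared instance is read-only).
        var format = (NumberFormatInfo)culture.NumberFormat.Clone();
        format.CurrencySymbol = "Kr.";

        // Culture-aware parse: accepts the currency symbol and the decimal comma.
        decimal number = decimal.Parse(original, NumberStyles.Currency, format);

        // Two fixed decimals, written with the invariant culture's dot separator.
        return number.ToString("0.00", CultureInfo.InvariantCulture);
    }
}
```

In the Visual Web Ripper script, `return args.Content;` could be replaced with `return PriceParser.ToInvariantPrice(args.Content);`. A string that does not match the expected format makes decimal.Parse throw a FormatException, which the existing catch block in the standard script would report.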
